The three layers
| Layer | Question it answers | Scope | Backend | How it’s updated |
|---|---|---|---|---|
| Working memory | “Does my agent remember what we said 5 messages ago in this same conversation?” | One conversation | Conversation database | Automatic: past turns of the current conversation are reloaded into the prompt at each turn |
| Session memory | “Does my agent remember what we discussed in a different conversation last week?” | One user × one agent, across all their conversations | Fast in-memory session store | Automatic: at the end of each completed conversation, the platform asks an LLM to merge the new exchange into a rolling summary, which is reloaded into the next conversation’s prompt |
| Long-term memory | “Can my agent save a specific preference I tell it to remember and recall it later by semantic search?” | One user × one agent (or shared user-wide) | Dedicated memory service (vector search) | Tool-driven: the agent calls memory_remember when something is worth keeping; memory_recall (and an automatic pre-load) brings it back later |
- Without working memory an agent is effectively stateless between turns inside a single conversation.
- Session memory is the only layer that bridges separate conversations automatically. It produces a compact LLM-written summary, not a verbatim transcript.
- Long-term memory never saves silently. The agent must call memory_remember, which is why “Tell the agent what’s worth remembering” in your Instructions matters.
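A minimal sketch of how the two automatic layers could feed a prompt. The function names, the prompt layout, and the `summarize` callable (a stand-in for the platform's LLM merge step) are assumptions made for illustration, not the platform's API.

```python
from typing import Callable, List


def merge_summary(previous: str, exchange: List[str],
                  summarize: Callable[[str], str]) -> str:
    """Session memory: fold a completed conversation into the rolling summary."""
    text = "\n".join(exchange)
    return summarize(f"Previous summary:\n{previous}\n\nNew conversation:\n{text}")


def build_prompt(summary: str, turns: List[str], new_message: str) -> str:
    """Working memory: reload the current conversation's past turns on each turn."""
    parts: List[str] = []
    if summary:
        parts.append(f"[Summary of earlier conversations]\n{summary}")
    parts.extend(turns)
    parts.append(f"user: {new_message}")
    return "\n".join(parts)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a fixed summary."""
    return "User ships on Fridays."


summary = merge_summary("", ["user: I ship on Fridays.", "agent: Noted."], fake_llm)
prompt = build_prompt(summary, ["user: Hi", "agent: Hello!"], "When do I ship?")
```

The summary is compact and LLM-written, so the second conversation sees the gist of the first rather than its transcript.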
If you inspect an agent’s raw memory config, session_memory is a top-level toggle, separate from types[], which holds working memory and long-term memory side by side. The three are fully independent: you can enable any combination.
Choosing which layers to enable
A quick decision guide for new agents:
| Agent style | Working memory | Session memory | Long-term memory |
|---|---|---|---|
| One-shot Q&A bot (each question is independent) | ✗ | ✗ | ✗ |
| Multi-turn assistant inside a single chat | ✓ | ✗ | ✗ |
| Recurring assistant the same user comes back to over days/weeks | ✓ | ✓ | optional |
| Personal assistant that should remember explicit user preferences | ✓ | ✓ | ✓ |
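The decision guide above can be sketched as a set of presets. Only `session_memory` and `types` come from the config description; the type labels (`"working"`, `"long_term"`), the preset names, and the `MemoryConfig` class are hypothetical and may not match the real config schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemoryConfig:
    # session_memory is a top-level toggle; types holds the other two layers.
    session_memory: bool = False
    types: List[str] = field(default_factory=list)  # labels are assumptions


PRESETS: Dict[str, MemoryConfig] = {
    "one_shot_qa": MemoryConfig(),
    "multi_turn_chat": MemoryConfig(types=["working"]),
    "recurring_assistant": MemoryConfig(session_memory=True, types=["working"]),
    "personal_assistant": MemoryConfig(
        session_memory=True, types=["working", "long_term"]
    ),
}


def enabled_layers(cfg: MemoryConfig) -> List[str]:
    """List every layer that is switched on, in a stable order."""
    layers = list(cfg.types)
    if cfg.session_memory:
        layers.append("session")
    return layers
```

For example, `enabled_layers(PRESETS["multi_turn_chat"])` yields only the working layer, matching the second row of the table.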
How long-term memory works
Long-term memory is provided by a dedicated memory service. The agent does not store memories itself: it calls three tools, and the service handles persistence and retrieval.
Three tools the agent uses
When long-term memory is enabled on an agent, three tools are injected into its toolkit. The agent decides on its own when to call each one.
memory_remember
Store something worth keeping for next time. Called when the user shares a preference, a fact, an ongoing instruction, or a relationship the agent should not forget.
memory_recall
Search past memories by meaning, not just keywords. Called when the agent needs context that may have come from an earlier conversation.
memory_forget
Delete a specific memory. Called when the user asks to forget something or the information is outdated.
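To make the division of labour concrete, here is a toy in-process stand-in for the memory service. The three method names mirror the tools above, but the signatures, the ID format, and the class itself are assumptions. Recall here is a plain keyword overlap, a deliberate simplification of the real semantic search.

```python
import itertools
from typing import Dict, List


class ToyMemoryService:
    """Illustrative only: keeps memories in a dict instead of a real service."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}
        self._ids = itertools.count(1)

    def memory_remember(self, content: str) -> str:
        """Store a memory and return its (hypothetical) identifier."""
        memory_id = f"mem-{next(self._ids)}"
        self._items[memory_id] = content
        return memory_id

    def memory_recall(self, query: str) -> List[str]:
        """Return memories sharing at least one word with the query."""
        words = set(query.lower().split())
        return [c for c in self._items.values() if words & set(c.lower().split())]

    def memory_forget(self, memory_id: str) -> bool:
        """Delete one memory; report whether it existed."""
        return self._items.pop(memory_id, None) is not None


svc = ToyMemoryService()
mid = svc.memory_remember("Prefers concise answers")
hits = svc.memory_recall("concise or detailed?")
```

The agent only ever talks to this interface; how the memories are persisted is the service's concern.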
Automatic pre-load on every user message
On every user turn, before the LLM is called, the agent runs a similarity search against the user’s stored memories using the new message as the query and injects the top relevant ones into the prompt. The agent sees them as background context and can answer without explicitly calling memory_recall.
This means the user often does not need to remind the agent of past preferences — the agent already has them in front of it, fresh for each message.
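The pre-load step could look roughly like this. The `search` callable, the `top_k` parameter, and the prompt layout are assumptions; the real platform decides how memories are retrieved and formatted.

```python
from typing import Callable, List


def preload_prompt(user_message: str,
                   search: Callable[[str, int], List[str]],
                   top_k: int = 3) -> str:
    """Search with the new message as the query and prepend the results."""
    memories = search(user_message, top_k)
    if not memories:
        return f"User: {user_message}"
    context = "\n".join(f"- {m}" for m in memories)
    return f"Background memories:\n{context}\n\nUser: {user_message}"


def fake_search(query: str, k: int) -> List[str]:
    """Hypothetical stub; a real search would query the memory service."""
    stored = ["User is vegetarian.", "Prefers concise answers."]
    return stored[:k]


prompt = preload_prompt("Suggest a dinner recipe", fake_search, top_k=1)
```

Because the search runs before the LLM is called, the model can use the memory without issuing any tool call of its own.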
Semantic recall via vector embeddings
Memories are not stored as plain text alone. Each memory is also embedded as a vector. When the agent calls memory_recall("what are my dietary restrictions?"), the service:
- Embeds the query into the same vector space.
- Returns the top-K most similar stored memories, ranked by semantic distance.
The agent then reads them and answers.
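The steps above can be sketched with a toy embedding. A bag-of-words count vector stands in for a learned embedding model here, so it only captures word overlap, not real meaning; the function names and sample memories are illustrative.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real service would use a learned model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recall(query: str, memories: List[str], top_k: int = 2) -> List[str]:
    """Embed the query, rank stored memories by similarity, keep the top K."""
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:top_k]


memories = [
    "User's company is Acme Corp.",
    "User has dietary restrictions: no gluten.",
    "Prefers concise answers over long ones.",
]
top = recall("what are my dietary restrictions?", memories, top_k=1)
```

With a genuine embedding model, a query such as "what can't I eat?" would also land near the dietary memory even though it shares no words with it.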
Memory types
Every memory has a type that helps the agent reason about its purpose.
| Type | What it represents | Example |
|---|---|---|
| fact | A stable piece of information about the user | “User’s company is Acme Corp.” |
| preference | A like, dislike, or stylistic choice | “Prefers concise answers over long ones.” |
| relationship | People, projects, or entities the user is connected to | “Works with Marie on the onboarding project.” |
| instruction | Standing rules the agent should follow | “Always reply in French unless asked otherwise.” |
The agent chooses the type when it calls memory_remember. You don’t need to manage types yourself.
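One way to model the four types in client code, as a sketch. The type values come from the table above; the `MemoryType` and `Memory` class names and fields are assumptions, not the service's schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class MemoryType(str, Enum):
    FACT = "fact"
    PREFERENCE = "preference"
    RELATIONSHIP = "relationship"
    INSTRUCTION = "instruction"


@dataclass(frozen=True)
class Memory:
    content: str
    type: MemoryType


examples: List[Memory] = [
    Memory("User's company is Acme Corp.", MemoryType.FACT),
    Memory("Prefers concise answers over long ones.", MemoryType.PREFERENCE),
    Memory("Works with Marie on the onboarding project.", MemoryType.RELATIONSHIP),
    Memory("Always reply in French unless asked otherwise.", MemoryType.INSTRUCTION),
]

# Standing rules can be singled out, for instance to apply them on every reply.
standing_rules = [m.content for m in examples if m.type is MemoryType.INSTRUCTION]
```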
Scoping
Memories are stored per user. They are also optionally scoped per agent:
- User-only memory: visible to that user across all agents they interact with on the platform. Useful for general user preferences (“I prefer concise answers”).
- User + agent memory: visible only to one agent for that user. Useful for agent-specific context (“For this support agent, always start by checking ticket history”).
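The two scopes amount to a visibility filter. In this sketch, a missing `agent_id` marks a user-only memory; the field names, the `ScopedMemory` class, and the agent identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScopedMemory:
    user_id: str
    content: str
    agent_id: Optional[str] = None  # None means user-only, shared across agents


def visible_to(memories: List[ScopedMemory], user_id: str, agent_id: str) -> List[str]:
    """Return what one agent may see while talking to one user."""
    return [
        m.content
        for m in memories
        if m.user_id == user_id and m.agent_id in (None, agent_id)
    ]


store = [
    ScopedMemory("alice", "I prefer concise answers"),
    ScopedMemory("alice", "Always start by checking ticket history", agent_id="support"),
    ScopedMemory("bob", "Reply in French"),
]
```

Here the support agent sees both of Alice's memories, another agent sees only the user-wide one, and neither ever sees Bob's.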
Tags
When the agent stores a memory, it can attach a small set of short lowercase keywords describing the topic, for example ["python", "coding"] or ["family", "children"]. Tags help the agent group, filter, and recall related memories more precisely.
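A small sketch of tag handling, assuming tags are normalized to lowercase before storage and that filtering matches on any shared tag. Both helper functions and the sample memories are illustrative, not part of the memory service.

```python
from typing import Iterable, List, Tuple


def normalize_tags(tags: Iterable[str]) -> List[str]:
    """Trim, lowercase, and de-duplicate tags; drop empty ones."""
    return sorted({t.strip().lower() for t in tags if t.strip()})


def filter_by_tags(memories: List[Tuple[str, List[str]]],
                   wanted: Iterable[str]) -> List[str]:
    """Keep memories that carry at least one of the wanted tags."""
    wanted_set = set(normalize_tags(wanted))
    return [content for content, tags in memories if wanted_set & set(tags)]


memories = [
    ("Writes most scripts in Python", normalize_tags(["Python", "coding"])),
    ("Has two children", normalize_tags(["family", "children"])),
]
coding_memories = filter_by_tags(memories, ["PYTHON"])
```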
Configure long-term memory on an agent
Choose a profile that supports it
Long-term memory is available on the Full Agent and Orchestrator profiles. Simpler profiles do not include the memory tools.
Set the recall budget
Choose how many memories the agent can pull in per turn (default: 50). Higher values give the agent more context but consume more of the prompt window.
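The trade-off can be illustrated with a cap on an already ranked list. The default of 50 comes from the setting above; the four-characters-per-token estimate is a rough rule of thumb rather than a tokenizer, and the function name is hypothetical.

```python
from typing import List, Tuple


def apply_recall_budget(ranked: List[str], budget: int = 50) -> Tuple[List[str], int]:
    """Keep the best `budget` memories and estimate their prompt cost."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    kept = ranked[:budget]
    approx_tokens = sum(len(m) for m in kept) // 4  # crude heuristic only
    return kept, approx_tokens


ranked = [f"memory {i}" for i in range(60)]
kept, cost = apply_recall_budget(ranked)
small, small_cost = apply_recall_budget(ranked, budget=3)
```

A lower budget shrinks the estimated cost in proportion, at the price of dropping lower-ranked memories.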
Tell the agent what's worth remembering
Add explicit guidance in the agent’s Instructions about what to store and when. Without explicit guidance, an agent’s use of memory tends to be inconsistent.
Forgetting
Three mechanisms remove memories:
- Explicit forget: the agent calls memory_forget(memory_id) when the user asks (“forget what I told you about X”).
- User-driven cleanup: administrators can wipe a user’s memories on request, supporting GDPR-style “right to be forgotten” workflows.
- Retention policy: your workspace’s general data retention settings also apply to memories.
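The three removal paths can be modelled on a toy store as follows. The class, its method names, and the age-based retention rule are assumptions for illustration; actual retention behaviour depends on your workspace settings.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple


class ToyMemoryStore:
    def __init__(self) -> None:
        # memory_id -> (user_id, created_at, content)
        self._rows: Dict[str, Tuple[str, datetime, str]] = {}

    def add(self, memory_id: str, user_id: str, content: str,
            created_at: datetime) -> None:
        self._rows[memory_id] = (user_id, created_at, content)

    def forget(self, memory_id: str) -> bool:
        """Explicit forget: remove one memory by ID."""
        return self._rows.pop(memory_id, None) is not None

    def wipe_user(self, user_id: str) -> int:
        """User-driven cleanup: remove everything stored for one user."""
        doomed = [mid for mid, row in self._rows.items() if row[0] == user_id]
        for mid in doomed:
            del self._rows[mid]
        return len(doomed)

    def apply_retention(self, max_age: timedelta, now: datetime) -> int:
        """Retention policy: remove memories older than max_age."""
        doomed = [mid for mid, row in self._rows.items() if now - row[1] > max_age]
        for mid in doomed:
            del self._rows[mid]
        return len(doomed)

    def __len__(self) -> int:
        return len(self._rows)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
store = ToyMemoryStore()
store.add("m1", "alice", "Likes tea", now - timedelta(days=400))
store.add("m2", "alice", "Works at Acme", now - timedelta(days=5))
store.add("m3", "bob", "Prefers French", now - timedelta(days=5))
```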
Privacy and security
Long-term memory is stored in a dedicated, secured service. Access is scoped to the user the memory belongs to: an agent only sees memories of the user it is currently talking to, and never memories of other users on the platform. You can review and clean a user’s memories through the platform’s admin tools, and the same retention and privacy policies that apply to conversations also apply to long-term memories.
When long-term memory is the wrong tool
Long-term memory is designed for stable, user-specific information the agent should carry across conversations. It is not the right place for:
| Use case | Use instead |
|---|---|
| Knowledge an agent should have about a domain (product manuals, policies) | A knowledge base / RAG, not memory |
| Remembering the previous turns of the current conversation | Working memory — enable it on the agent |
| Carrying a rolling summary from one conversation to the next, automatically | Session memory — enable it on the agent |
| Logs, audit trails, analytics | The platform’s event system |
| Large structured data (tables, full documents) | A collection or external database |
Next steps
- Capabilities: See which capabilities each agent profile includes
- Settings: Configure retention, sharing, and memory limits
- Instructions: Write effective guidance, including memory rules
- Analytics: Inspect tool calls, including memory_remember and memory_recall