Agent Memory

Prisme.ai agents have three independent memory layers. Each one answers a different question, runs on its own backend, and is toggled separately. The table below makes the three layers explicit so you know which one to reach for.

The three layers

Layer	Question it answers	Scope	Backend	How it’s updated
Working memory	“Does my agent remember what we said 5 messages ago in this same conversation?”	One conversation	Conversation database	Automatic: past turns of the current conversation are reloaded into the prompt at each turn
Session memory	“Does my agent remember what we discussed in a different conversation last week?”	One user × one agent, across all their conversations	Fast in-memory session store	Automatic: at the end of each completed conversation, the platform asks an LLM to merge the new exchange into a rolling summary, which is reloaded into the next conversation’s prompt
Long-term memory	“Can my agent save a specific preference I tell it to remember and recall it later by semantic search?”	One user × one agent (or shared user-wide)	Dedicated memory service (vector search)	Tool-driven: the agent calls `memory_remember` when something is worth keeping; `memory_recall` (and an automatic pre-load) brings it back later

Practical consequences:

Without working memory, an agent is effectively stateless between turns inside a single conversation.
Session memory is the only layer that bridges separate conversations automatically. It produces a compact LLM-written summary, not a verbatim transcript.
Long-term memory never saves silently. The agent must call memory_remember, which is why the guidance you write in the agent’s Instructions (see “Tell the agent what’s worth remembering”) matters.

If you inspect an agent’s raw memory config, the structure is:

{
  "session_memory": true,
  "types": [
    { "id": "working_memory", "enabled": true, "config": { "max_tokens": 4000, "compaction_strategy": "summarize" } },
    { "id": "long_term",      "enabled": true, "config": { "max_memories": 50 } }
  ]
}

session_memory is a top-level toggle, separate from types[] which holds working memory and long-term memory side by side. The three are fully independent — you can enable any combination.
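As a rough illustration of that layout, the sketch below reads the example config and reports which layers are on. The helper name enabled_layers is hypothetical, and treating a missing key as "disabled" is an assumption, not documented platform behaviour.

```python
import json

# Raw config copied from the example above.
RAW_CONFIG = """
{
  "session_memory": true,
  "types": [
    { "id": "working_memory", "enabled": true, "config": { "max_tokens": 4000, "compaction_strategy": "summarize" } },
    { "id": "long_term",      "enabled": true, "config": { "max_memories": 50 } }
  ]
}
"""


def enabled_layers(config: dict) -> dict:
    """Return one flag per layer. Missing keys count as disabled (an assumption)."""
    # Index the types[] entries by id so each layer can be looked up directly.
    by_id = {entry.get("id"): entry for entry in config.get("types", [])}
    return {
        "working_memory": bool(by_id.get("working_memory", {}).get("enabled", False)),
        # session_memory lives at the top level, not inside types[].
        "session_memory": bool(config.get("session_memory", False)),
        "long_term": bool(by_id.get("long_term", {}).get("enabled", False)),
    }


layers = enabled_layers(json.loads(RAW_CONFIG))
print(layers)
```

Because the three flags are read independently, any combination of layers can be represented.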

The rest of this page focuses on long-term memory — the layer that lets your agent remember a user’s preferences from one week to the next, on explicit cue.

Choosing which layers to enable

A quick decision guide for new agents:

Agent style	Working memory	Session memory	Long-term memory
One-shot Q&A bot (each question is independent)	✗	✗	✗
Multi-turn assistant inside a single chat	✓	✗	✗
Recurring assistant the same user comes back to over days/weeks	✓	✓	optional
Personal assistant that should remember explicit user preferences	✓	✓	✓

Session memory’s summary is generated asynchronously at the end of a conversation, so enabling it doesn’t add latency to user-facing responses. The summary becomes available the next time the user opens a conversation with the same agent.
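The rolling-summary lifecycle can be pictured roughly as follows. This is a sketch under stated assumptions: the store is a plain dictionary, the function names are hypothetical, and fake_summarizer stands in for the LLM merge step, which the platform performs asynchronously.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical store: (user_id, agent_id) -> rolling summary text.
SummaryStore = Dict[Tuple[str, str], str]


def fake_summarizer(previous: str, transcript: List[str]) -> str:
    """Stand-in for the LLM merge: keep the old summary, append the last turn."""
    latest = transcript[-1] if transcript else ""
    return " | ".join(part for part in (previous, latest) if part)


def on_conversation_end(
    store: SummaryStore,
    user_id: str,
    agent_id: str,
    transcript: List[str],
    summarize: Callable[[str, List[str]], str] = fake_summarizer,
) -> None:
    """Merge the finished conversation into the summary for this user and agent."""
    key = (user_id, agent_id)
    store[key] = summarize(store.get(key, ""), transcript)


def on_conversation_start(store: SummaryStore, user_id: str, agent_id: str) -> str:
    """Return the summary to reload into the next conversation's prompt."""
    return store.get((user_id, agent_id), "")


store: SummaryStore = {}
on_conversation_end(store, "u1", "a1", ["I ship on Fridays"])
on_conversation_end(store, "u1", "a1", ["Use metric units"])
print(on_conversation_start(store, "u1", "a1"))
```

The summary is keyed by both user and agent, so a different agent talking to the same user starts with an empty summary.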

How long-term memory works

Long-term memory is provided by a dedicated memory service. The agent does not store memories itself: it calls three tools, and the service handles persistence and retrieval.

            ┌──────────────────────────────────────────────┐
            │                Conversation                  │
            │                                              │
   user ─►  │  ┌─ pre-load ──────────────────────────┐    │
            │  │ relevant past memories injected     │    │
            │  └─────────────────────────────────────┘    │
            │                  │                          │
            │                  ▼                          │
            │             [ Agent / LLM ]                 │
            │            /     │      \                   │
            │  remember()  recall()  forget()            │
            │       │          │         │                │
            └───────┼──────────┼─────────┼────────────────┘
                    ▼          ▼         ▼
                ┌──────────────────────────────┐
                │      Memory service          │
                │  (vector search + storage)   │
                └──────────────────────────────┘

Three tools the agent uses

When long-term memory is enabled on an agent, three tools are injected into its toolkit. The agent decides on its own when to call each one.

memory_remember

Store something worth keeping for next time. Called when the user shares a preference, a fact, an ongoing instruction, or a relationship the agent should not forget.

memory_recall

Search past memories by meaning, not just keywords. Called when the agent needs context that may have come from an earlier conversation.

memory_forget

Delete a specific memory. Called when the user asks to forget something or the information is outdated.
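To make the division of labour concrete, here is a minimal sketch of how tool calls emitted by the agent might be routed to a store. Only the three tool names come from this page. The InMemoryStore class, the dispatch helper, and the argument names content and query are assumptions made for illustration, and the keyword overlap in recall is a placeholder for the real vector search.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Toy stand-in for the memory service (hypothetical, not the real API)."""

    _items: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def remember(self, content: str) -> str:
        """Persist a memory and return its generated id."""
        memory_id = f"mem-{next(self._ids)}"
        self._items[memory_id] = content
        return memory_id

    def recall(self, query: str) -> list:
        """Placeholder search: any shared word counts as a match."""
        words = set(query.lower().split())
        return [
            {"id": mid, "content": text}
            for mid, text in self._items.items()
            if words & set(text.lower().split())
        ]

    def forget(self, memory_id: str) -> bool:
        """Delete one memory; report whether it existed."""
        return self._items.pop(memory_id, None) is not None


def dispatch(store: InMemoryStore, tool_call: dict):
    """Route a tool call emitted by the agent to the matching store method."""
    handlers = {
        "memory_remember": lambda args: store.remember(args["content"]),
        "memory_recall": lambda args: store.recall(args["query"]),
        "memory_forget": lambda args: store.forget(args["memory_id"]),
    }
    return handlers[tool_call["name"]](tool_call["arguments"])


store = InMemoryStore()
mem_id = dispatch(store, {"name": "memory_remember", "arguments": {"content": "Prefers concise answers"}})
hits = dispatch(store, {"name": "memory_recall", "arguments": {"query": "concise or long?"}})
print(mem_id, hits)
```

The agent only ever produces the tool name and arguments. Persistence and lookup stay behind the store, which mirrors the split described above.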

Automatic pre-load on every user message

On every user turn, before the LLM is called, the agent runs a similarity search against the user’s stored memories using the new message as the query and injects the top relevant ones into the prompt. The agent sees them as background context and can answer without explicitly calling memory_recall. This means the user often does not need to remind the agent of past preferences — the agent already has them in front of it, fresh for each message.
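The pre-load step might look something like the sketch below. The build_prompt function, the search callable, the default top_k of 3, and the prompt wording are all hypothetical. The platform's actual prompt layout is not documented on this page.

```python
from typing import Callable, List


def build_prompt(
    user_message: str,
    search: Callable[[str, int], List[str]],
    instructions: str,
    top_k: int = 3,
) -> str:
    """Assemble a prompt with relevant memories injected before the user turn."""
    # The new user message itself is used as the similarity-search query.
    memories = search(user_message, top_k)
    lines = [instructions]
    if memories:
        lines.append("Background context (from long-term memory):")
        lines.extend(f"- {memory}" for memory in memories)
    lines.append(f"User: {user_message}")
    return "\n".join(lines)


def stub_search(query: str, k: int) -> List[str]:
    """Stand-in for the memory service; always returns the same memory."""
    return ["Prefers concise answers."][:k]


prompt = build_prompt("Summarize this report", stub_search, "You are a helpful assistant.")
print(prompt)
```

Since the search runs on every turn, the injected memories can change from one message to the next within the same conversation.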

Semantic recall via vector embeddings

Memories are not stored as plain text alone. Each memory is also embedded as a vector. When the agent calls memory_recall("what are my dietary restrictions?"), the following happens:

The service embeds the query into the same vector space.
The service returns the top-K most similar stored memories, ranked from most to least similar.
The agent reads them and answers.

This is why the user can ask the question in any wording — “what can’t I eat?”, “any food I avoid?” — and still get the right memories back.
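The ranking step can be sketched with cosine similarity over vectors. The embed function here is a deliberately crude stand-in (word counts) so the example runs without a model. It only matches shared words, whereas a real embedding model is what places paraphrases such as “what can’t I eat?” close to the stored memory. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real service would call a model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, memories: list, k: int = 3) -> list:
    """Return the k stored memories most similar to the query, best first."""
    query_vector = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(query_vector, embed(m)), reverse=True)
    return ranked[:k]


memories = [
    "User's company is Acme Corp.",
    "User is allergic to peanuts and avoids dairy food.",
]
best = top_k("any food I avoid?", memories, k=1)
print(best)
```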

Memory types

Every memory has a type that helps the agent reason about its purpose.

Type	What it represents	Example
`fact`	A stable piece of information about the user	“User’s company is Acme Corp.”
`preference`	A like, dislike, or stylistic choice	“Prefers concise answers over long ones.”
`relationship`	People, projects, or entities the user is connected to	“Works with Marie on the onboarding project.”
`instruction`	Standing rules the agent should follow	“Always reply in French unless asked otherwise.”

The agent picks the type when calling memory_remember. You don’t need to manage types yourself.
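If you handle memory records in your own tooling, the four types map naturally onto an enumeration. The sketch below assumes a record with content and type fields. That record shape and the parse_memory helper are illustrative, not the service's documented schema.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(str, Enum):
    """The four memory types listed in the table above."""

    FACT = "fact"
    PREFERENCE = "preference"
    RELATIONSHIP = "relationship"
    INSTRUCTION = "instruction"


@dataclass(frozen=True)
class Memory:
    """Hypothetical record shape: free-text content plus its type."""

    content: str
    type: MemoryType


def parse_memory(payload: dict) -> Memory:
    """Validate a raw payload; an unknown type raises ValueError."""
    return Memory(content=payload["content"], type=MemoryType(payload["type"]))


memory = parse_memory({"content": "Prefers concise answers over long ones.", "type": "preference"})
print(memory.type.value)
```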

Scoping

Memories are stored per user. They are also optionally scoped per agent:

User-only memory — visible to that user across all agents they interact with on the platform. Useful for general user preferences (“I prefer concise answers”).
User + agent memory — visible only to one agent for that user. Useful for agent-specific context (“For this support agent, always start by checking ticket history”).

Two different users never see each other’s memories. Two different agents within the same user account see each other’s memories only if those memories were saved as user-only.
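These visibility rules reduce to a small predicate. The following is a sketch assuming each memory carries a user_id and an optional agent_id, where a missing agent_id marks a user-only memory. The field names and the is_visible helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedMemory:
    """Hypothetical record: agent_id=None marks a user-only memory."""

    user_id: str
    content: str
    agent_id: Optional[str] = None


def is_visible(memory: ScopedMemory, user_id: str, agent_id: str) -> bool:
    """Apply the scoping rules: same user always, same agent unless user-only."""
    if memory.user_id != user_id:
        return False  # memories never cross users
    return memory.agent_id is None or memory.agent_id == agent_id


shared = ScopedMemory(user_id="alice", content="I prefer concise answers")
private = ScopedMemory(user_id="alice", content="Check ticket history first", agent_id="support")

print(is_visible(shared, "alice", "sales"), is_visible(private, "alice", "sales"))
```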

Configure long-term memory on an agent

Choose a profile that supports it

Long-term memory is available on the Full Agent and Orchestrator profiles. Simpler profiles do not include the memory tools.

Enable it in Settings

Go to Settings → Memory and turn on Long-term memory.

Set the recall budget

Choose how many memories the agent can pull in per turn (default: 50). Higher values give the agent more context but consume more of the prompt window.

Tell the agent what's worth remembering

Add explicit guidance in the agent’s Instructions, for example:

Memory guidelines:
- Remember the user's preferred tone, language, and reporting style.
- Remember names of people and projects they work with.
- Do not store sensitive personal data (health, finances) unless asked.
- When the user explicitly says "remember this", always store it.

Without explicit guidance, an agent’s use of memory tends to be inconsistent.

Test it across two conversations

Open a conversation, share a preference, end the conversation. Open a new one and ask something that should trigger recall. Verify the agent already has the context.

Forgetting

Three mechanisms remove memories:

Explicit forget — the agent calls memory_forget(memory_id) when the user asks (“forget what I told you about X”).
User-driven cleanup — administrators can wipe a user’s memories on request, supporting GDPR-style “right to be forgotten” workflows.
Retention policy — your workspace’s general data retention settings also apply to memories.

The agent never forgets silently: forgetting is always triggered by an action, not by passive decay.

Privacy and security

Long-term memory is stored in a dedicated, secured service. Access is scoped to the user the memory belongs to: an agent only sees memories of the user it is currently talking to, and never memories of other users on the platform. You can review and clean a user’s memories through the platform’s admin tools, and the same retention and privacy policies that apply to conversations also apply to long-term memories.

When long-term memory is the wrong tool

Long-term memory is designed for stable, user-specific information the agent should carry across conversations. It is not the right place for:

Use case	Use instead
Knowledge an agent should have about a domain (product manuals, policies)	A knowledge base / RAG, not memory
Remembering the previous turns of the current conversation	Working memory — enable it on the agent
Carrying a rolling summary from one conversation to the next, automatically	Session memory — enable it on the agent
Logs, audit trails, analytics	The platform’s event system
Large structured data (tables, full documents)	A collection or external database

A useful test: if the same content would apply to every user, it belongs in a knowledge base. If it only makes sense for one user, it belongs in long-term memory.

Next steps

Capabilities

See which capabilities each agent profile includes

Settings

Configure retention, sharing, and memory limits

Instructions

Write effective guidance, including memory rules

Analytics

Inspect tool calls, including memory_remember and memory_recall
