When an agent uses retrieval-augmented generation (RAG), three things matter to the agent author: which vector stores the agent can read, what happens when a user uploads a file mid-conversation, and what to do when you want to change the embedding model. This page covers all three.
Mental model: three kinds of vector stores
Under the hood, everything that powers RAG in Prisme.ai is a vector store: a named container with an embedding model, a vector dimensionality, and a physical index on a vector provider (Elasticsearch or OpenSearch). What differs is who owns it and how it’s referenced from the agent.

| Kind | Owner | Auto-created | Visible to other agents |
|---|---|---|---|
| Knowledge base | A user | No — you create it explicitly in Knowledges | If shared via bindings |
| Conversation file search | An agent | Yes — on the first file uploaded into any of that agent’s conversations | No |
| Shared knowledge base | A user, shared with others | No | Yes, via RBAC bindings (reader / editor / admin) |

Each store is scoped by `user_id` or by `agent_id`, or made visible to additional principals via bindings.
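One way to picture the ownership model is as a record with an owner field and an optional list of bindings. The sketch below is illustrative only: the `VectorStore` and `Binding` classes, the `can_read` helper, the principal strings, and every field name other than `user_id` and `agent_id` are assumptions made for this example, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Binding:
    """Hypothetical RBAC binding: grants a role on a store to a principal."""
    principal: str  # e.g. "user:bob" (format assumed for this sketch)
    role: str       # "reader", "editor", or "admin"


@dataclass
class VectorStore:
    """Hypothetical record; only user_id and agent_id are named on this page."""
    name: str
    embedding_model: str
    dimensions: int
    user_id: Optional[str] = None   # set when a user owns the store (knowledge base)
    agent_id: Optional[str] = None  # set when an agent owns it (conversation files)
    bindings: list[Binding] = field(default_factory=list)


def can_read(store: VectorStore, principal: str) -> bool:
    """Return True if the principal owns the store or holds any binding on it."""
    if store.user_id is not None and principal == f"user:{store.user_id}":
        return True
    if store.agent_id is not None and principal == f"agent:{store.agent_id}":
        return True
    return any(b.principal == principal for b in store.bindings)


# A shared knowledge base owned by alice, readable by bob through a binding.
kb = VectorStore("Product manual", "example-embedding-model", 1536,
                 user_id="alice", bindings=[Binding("user:bob", "reader")])

# A conversation file search store owned by an agent, with no bindings.
chat_files = VectorStore("Support Bot Conversations", "example-embedding-model",
                         1536, agent_id="support-bot")
```

In this toy model, `can_read(kb, "user:bob")` succeeds because of the binding, while `can_read(chat_files, "agent:other-bot")` fails because the store has a single agent owner and nothing else.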
Attaching knowledge bases to an agent
You can attach as many knowledge bases as you want to a single agent. Each one becomes a separate tool the LLM can call, and the model reads each tool’s description to decide which store to query. See Capabilities → Knowledge Bases for the click-path. Two things to keep in mind when attaching more than one:

- Disambiguate via the description, not the display name. The LLM picks tools by reading their description text. For two KBs with overlapping topics, write descriptions like “search the public product manual” vs “search internal engineering notes” so the model knows which one applies.
- Agentic RAG kicks in on Full Agent and Orchestrator profiles. The ReAct loop can call the same RAG tool multiple times in one turn — different queries, refinements, follow-ups. Chunk-level deduplication (see Runtime Safeguards) prevents the model from re-reading the same passage twice in a conversation.
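
Chunk-level deduplication can be pictured as a per-conversation set of chunk IDs that have already been handed to the model. The following is a minimal sketch of that general technique, not the platform's implementation; the `ChunkDeduper` class and the `id` and `text` keys on each chunk are hypothetical.

```python
class ChunkDeduper:
    """Tracks chunk IDs already returned to the model within one conversation."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def filter_new(self, chunks: list[dict]) -> list[dict]:
        """Return only chunks not seen before, remembering them for later calls."""
        fresh = []
        for chunk in chunks:
            if chunk["id"] not in self._seen:
                self._seen.add(chunk["id"])
                fresh.append(chunk)
        return fresh


deduper = ChunkDeduper()

# First RAG call in the turn: both chunks are new.
first_call = deduper.filter_new([
    {"id": "c1", "text": "Install steps"},
    {"id": "c2", "text": "Pricing"},
])

# A refined query retrieves c2 again; only c3 is passed on to the model.
second_call = deduper.filter_new([
    {"id": "c2", "text": "Pricing"},
    {"id": "c3", "text": "Refunds"},
])
```

Keeping the set at conversation scope is what lets repeated tool calls in a ReAct loop stay cheap: overlapping results shrink instead of inflating the context.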
Conversation file search — what happens when a user uploads a file
When a user drops a file into the chat:

- The agent looks for an existing `conversation_file_search` tool in its capabilities
- If none exists, the platform automatically creates a vector store dedicated to this agent (named `"<Agent Name> Conversations"`, owned by the agent) and adds the `conversation_file_search` tool to the agent’s capabilities
- The uploaded file is indexed into that store
- Every later conversation with the same agent reuses the same store, so uploads accumulate across conversations

Key properties:

- One conversation vector store per agent, not per conversation
- The conversation boundary is enforced at query time with a filter on `conversation_id`: when the user in conversation A asks the agent to search the file they just uploaded, the search only returns chunks from files uploaded in conversation A, never from conversation B
- Removing the `conversation_file_search` capability from an agent does not delete the underlying vector store; the next upload re-adds the capability and re-uses the existing store
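
The upload flow and the query-time filter can be sketched with an in-memory stand-in. Everything here is an assumption for illustration: the `Agent` and `Chunk` classes, the `stores` dictionary, and the `upload_file` and `search` helpers are hypothetical, and a keyword match stands in for real vector search.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    conversation_id: str
    text: str


@dataclass
class Agent:
    agent_id: str
    capabilities: set[str] = field(default_factory=set)


# Hypothetical in-memory stand-in for vector stores, keyed by owning agent_id.
stores: dict[str, list[Chunk]] = {}


def upload_file(agent: Agent, conversation_id: str, chunks: list[str]) -> None:
    """Ensure the agent has a conversation store and tool, then index the file."""
    store = stores.setdefault(agent.agent_id, [])        # one store per agent, reused
    agent.capabilities.add("conversation_file_search")   # no-op if already present
    store.extend(Chunk(conversation_id, text) for text in chunks)


def search(agent: Agent, conversation_id: str, term: str) -> list[str]:
    """Keyword match standing in for vector search, filtered by conversation_id."""
    return [
        c.text
        for c in stores.get(agent.agent_id, [])
        if c.conversation_id == conversation_id and term.lower() in c.text.lower()
    ]


bot = Agent("support-bot")
upload_file(bot, "conv-A", ["Invoice total: 120 EUR"])
upload_file(bot, "conv-B", ["Invoice total: 75 EUR"])

# Both uploads live in the same store, but each search sees only its own conversation.
results_a = search(bot, "conv-A", "invoice")
results_b = search(bot, "conv-B", "invoice")
```

The point of the sketch is the shape of the design: storage is shared per agent, while isolation comes entirely from the `conversation_id` condition applied on every query.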
This means conversation files persist beyond the conversation they were uploaded in (at the storage level), but they remain invisible to other conversations because of the query-time filter. This is by design: it lets you reactivate a conversation and have its attachments still searchable, while preventing cross-conversation leakage.
Scoping: knowledge bases vs conversation stores
For knowledge bases, sharing is controlled through the standard Private / Organization / Public visibility levels plus the per-KB Sharing tab; see Knowledges → Sharing for the full model. The case worth calling out here is the one that doesn’t exist in Knowledges: an agent’s `conversation_file_search` store is always agent-scoped. It has no Sharing tab, no visibility level, and is never readable from any other agent, even within the same org and by an admin. The only ways to reach its content are (a) the owning agent calling its `conversation_file_search` tool, or (b) deleting it through admin tooling. This is enforced at the storage layer by the `agent_id` field on the vector store record.
Changing the embedding model — the A/B pattern
Every vector store records its embedding model and dimensions at creation time and physically allocates its provider index for those exact dimensions. This is a property of the vector index itself, not a Prisme.ai restriction: a 1536-dimension index physically cannot store 3072-dimension vectors. As a result, you cannot switch a live vector store to a different embedding model or change its dimensions in place. “Switching to a new embedding model” therefore means creating a new vector store and migrating what you want to keep. The platform supports this with a side-by-side pattern that lets you compare quality before committing.

Create a new knowledge base with the new model
In Knowledges, create a new KB. In RAG Settings, choose the new embedding model. Re-upload (or re-crawl) the source documents into this new KB.
Attach both KBs to a test agent
Clone the production agent (or create a test variant). Add both the old KB and the new KB as capabilities, with descriptions that make it explicit which is which — for example “v1 corpus (legacy embedding)” and “v2 corpus (new embedding)”. The agent can now query either store on demand.
Run a comparison harness
Pick a list of representative user queries. Run them against the test agent, capturing which store the LLM picks and how good the answer is. The Playground and Evaluations let you script this for repeatable A/B comparison.
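
A comparison harness can be as small as a loop that records which store the agent picked for each query. The sketch below assumes you already have some way to call the test agent and learn which tool it used; the `ask` callable, the `kb_v1` and `kb_v2` tool names, and the `stub_agent` stand-in are hypothetical and do not represent the Playground or Evaluations API.

```python
from collections import Counter
from typing import Callable

# Assumed shape: ask(query) returns (tool_name_used, answer_text).
AgentCall = Callable[[str], tuple[str, str]]


def run_comparison(queries: list[str], ask: AgentCall) -> tuple[Counter, list[dict]]:
    """Run each query once and record which store the agent chose."""
    picks: Counter = Counter()
    rows = []
    for query in queries:
        tool_name, answer = ask(query)
        picks[tool_name] += 1
        rows.append({"query": query, "store": tool_name, "answer": answer})
    return picks, rows


def stub_agent(query: str) -> tuple[str, str]:
    """Stand-in for a real agent call; routes on a keyword purely for the demo."""
    store = "kb_v2" if "refund" in query else "kb_v1"
    return store, f"answer from {store}"


picks, rows = run_comparison(
    ["refund policy", "install steps", "refund window"],
    stub_agent,
)
```

The `rows` list is what you would review or score for answer quality; `picks` gives a quick view of how often each store was chosen.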
Does this affect conversation files?
No. The `conversation_file_search` store is fully independent of any knowledge base. It has its own embedding model, frozen at the moment the agent first received a file upload. Changing the embedding on a knowledge base does not touch conversation files, and migrating conversation files does not touch knowledge bases.
If you also want to migrate the conversation store to a new embedding model, the path is heavier:
- Detach the `conversation_file_search` capability from the agent; the underlying store is preserved, just hidden
- The next user upload will re-create a fresh `conversation_file_search` store with the current default embedding model
- The previous store can be deleted manually once you no longer need its historical conversations
Costs to consider
Creating a new vector store is not free:

- Every chunk re-ingested incurs an embedding API call; estimate the cost as total tokens across all chunks × your model’s per-token price
- The provider index consumes storage proportional to `chunk_count × dimension_count`
- During an A/B comparison you temporarily hold two copies of the corpus
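
These costs are easy to estimate up front. The function below is a back-of-the-envelope sketch: the `estimate_reindex` name, the 4-bytes-per-value (float32) assumption, and the sample price are illustrative only, and the storage figure covers raw vectors without any index overhead your provider adds.

```python
def estimate_reindex(chunk_count: int, avg_tokens_per_chunk: int,
                     price_per_million_tokens: float, dimensions: int,
                     bytes_per_value: int = 4) -> dict:
    """Rough embedding cost and raw vector storage for one new vector store."""
    total_tokens = chunk_count * avg_tokens_per_chunk
    embedding_cost = total_tokens / 1_000_000 * price_per_million_tokens
    # Raw vector payload only; float32 values assumed, index overhead excluded.
    vector_bytes = chunk_count * dimensions * bytes_per_value
    return {
        "total_tokens": total_tokens,
        "embedding_cost": embedding_cost,
        "vector_bytes": vector_bytes,
    }


# Illustrative inputs: 50,000 chunks of about 400 tokens, a made-up price of
# 0.10 per million tokens, and a 1536-dimension model.
estimate = estimate_reindex(
    chunk_count=50_000,
    avg_tokens_per_chunk=400,
    price_per_million_tokens=0.10,
    dimensions=1536,
)
```

With these sample inputs the corpus comes to 20 million tokens and roughly 307 MB of raw vectors; during an A/B comparison, the storage figure applies to each of the two stores.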
Related
- Capabilities: How to add knowledge bases and other capabilities to an agent
- RAG Settings: Chunking, embedding model choice, retrieval tuning
- Evaluations: Run repeatable comparisons between two RAG configurations
- Runtime Safeguards: Chunk dedup, budgets, loop limits that frame agentic RAG