Retrieval Augmented Generation (RAG) agents combine the capabilities of foundation models — including their reasoning abilities — with access to your organization’s proprietary knowledge. By retrieving relevant information and incorporating it into responses, RAG agents provide accurate, contextual answers grounded in your specific business documents and data.

What is Retrieval Augmented Generation?

RAG is an AI architecture that enhances large language models (LLMs) by connecting them to external knowledge sources. This approach allows models to access, retrieve, and integrate specific information that may be:
  • Beyond their training data
  • Proprietary to your organization
  • Frequently updated
  • Highly specialized or technical
RAG addresses one of the key limitations of foundation models: their inability to access specific, up-to-date, or proprietary information not included in their training data.

Key Components

Knowledge Base

Processed repository of organizational documents and information

Retrieval System

Mechanism for finding relevant information based on user queries

Context Management

Processing of retrieved information for optimal use by the model

Generation System

Creation of responses that incorporate the retrieved information

When to Use RAG Agents

RAG agents are ideal when you need AI systems that can:
  • Answer questions about your specific products, services, or domain
  • Provide accurate information from internal documentation
  • Stay current with frequently updated information
  • Generate responses grounded in factual, verifiable content
  • Reduce hallucinations and fabrications in AI outputs

Benefits of RAG

Knowledge Accuracy

Responses grounded in verified organizational information

Reduced Hallucinations

Minimized fabrication of facts or details

Information Freshness

Access to the latest information beyond the model’s training cutoff

Source Attribution

Traceability to source documents and references

Domain Specificity

Expertise in your organization’s unique knowledge areas

Knowledge Control

Governance over what information the AI can access and use

RAG Architecture

The architecture of a RAG agent consists of several interconnected components:

1. Document Processing

Converting raw documents into a format optimized for retrieval.
Key processes include (a code sketch follows the list):
  • Text extraction from various file formats
  • Chunking of content into manageable pieces
  • Embedding generation for semantic search
  • Metadata extraction and enrichment
  • Index creation and optimization
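For illustration, here is a minimal chunking sketch in Python; the function name and structure are hypothetical, not a Prisme.ai API:
    # Example chunking sketch (illustrative only, not a Prisme.ai API)
    def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[dict]:
        """Split extracted text into overlapping chunks with offset metadata."""
        chunks = []
        step = chunk_size - overlap  # advance less than chunk_size so chunks overlap
        for start in range(0, max(len(text), 1), step):
            piece = text[start:start + chunk_size]
            if piece.strip():
                chunks.append({"text": piece, "offset": start})
        return chunks
    # Each chunk is then embedded and indexed together with its metadata.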

2. Query Processing

Transforming user questions into effective retrieval queries. Important components (a code sketch follows the list):
  • Query understanding and classification
  • Query expansion or reformulation
  • Query embedding generation
  • Metadata filtering and constraints
  • Hybrid search strategy selection
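As a rough sketch, query processing reduces to building a retrieval request that carries an embedding, keywords, and metadata filters; embed() is a placeholder for your embedding model, and all names are illustrative:
    # Example query-processing sketch (illustrative only)
    def embed(text: str) -> list[float]:
        """Placeholder: call your embedding model here."""
        return [0.0]

    def process_query(question: str, department: str | None = None) -> dict:
        """Build a retrieval request: embedding, keywords, metadata filters."""
        keywords = [w for w in question.lower().split() if len(w) > 3]  # naive expansion
        return {
            "embedding": embed(question),  # vector for semantic search
            "keywords": keywords,          # terms for keyword search
            "filters": {"department": department} if department else {},
        }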

3. Retrieval

Finding the most relevant information from the knowledge base. Retrieval approaches include (a code sketch follows the list):
  • Semantic search using vector embeddings
  • Keyword search for precise term matching
  • Hybrid search combining multiple strategies
  • Metadata filtering for context-specific results
  • Relevance ranking and reranking
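One common way to combine semantic and keyword results is reciprocal rank fusion (RRF); the sketch below merges two ranked lists of document IDs and is not tied to any particular search backend:
    # Example hybrid-search merge using reciprocal rank fusion (RRF)
    def rrf_merge(semantic_ids: list[str], keyword_ids: list[str], k: int = 60) -> list[str]:
        """Merge two ranked lists; documents ranked high in either list win."""
        scores: dict[str, float] = {}
        for ranking in (semantic_ids, keyword_ids):
            for rank, doc_id in enumerate(ranking):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)

    merged = rrf_merge(["doc3", "doc1", "doc7"], ["doc1", "doc9", "doc3"])
    # doc1 and doc3 rise to the top because both strategies agree on them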

4. Context Assembly

Organizing retrieved information for optimal use by the model. Assembly considerations (a code sketch follows the list):
  • Context window size limitations
  • Information relevance prioritization
  • Document structure preservation
  • Source attribution maintenance
  • Contextual framing for the model
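A minimal assembly sketch, assuming chunks are already sorted by relevance and using word count as a stand-in for real token counting (it reuses the DOCUMENT/SOURCE/CONTENT format shown under Best Practices):
    # Example context-assembly sketch (word count approximates tokens)
    def assemble_context(chunks: list[dict], max_tokens: int = 3000) -> str:
        """Pack top-ranked chunks into a budget, preserving attribution."""
        parts, used = [], 0
        for chunk in chunks:  # expected keys: title, source, text
            cost = len(chunk["text"].split())
            if used + cost > max_tokens:
                break  # stop before overflowing the model's context window
            parts.append(
                f"DOCUMENT: {chunk['title']}\nSOURCE: {chunk['source']}\n"
                f"CONTENT:\n{chunk['text']}"
            )
            used += cost
        return "\n\n".join(parts)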

5. Response Generation

Creating answers that incorporate the retrieved information.
Generation aspects (a code sketch follows the list):
  • Integration of retrieved information
  • Citation of sources
  • Handling of conflicting information
  • Management of uncertainty
  • Response formatting and structure
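Putting the pieces together, a generation call typically wraps the assembled context with instructions before invoking the model; complete() is a placeholder for your chat-completion call:
    # Example generation sketch (complete() is a placeholder model call)
    INSTRUCTIONS = (
        "Answer using only the provided context. Cite the document name "
        "for each fact. If the context is insufficient, say so explicitly."
    )

    def complete(prompt: str) -> str:
        """Placeholder: call your chat model here."""
        return ""

    def generate_answer(context: str, question: str) -> str:
        prompt = f"{INSTRUCTIONS}\n\n{context}\n\nQuestion: {question}"
        return complete(prompt)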

Example Use Cases

  • Technical Support
  • Policy Guidance
  • Research Assistant
  • Knowledge Management
Taking Technical Support as an example:
Purpose: Assist users with technical questions based on product documentation
Key Features:
  • Access to technical manuals and documentation
  • Ability to retrieve specific procedures and specifications
  • Troubleshooting guidance based on known issues
  • Source attribution to official documentation

Implementation Steps

Creating an effective RAG agent involves several key steps:

1. Define Knowledge Requirements

Identify what information the agent needs access to and how it will be used. Key questions to answer:
  • What specific knowledge domains should the agent cover?
  • What document types and sources will be included?
  • How often does the information change or update?
  • What level of detail is required in responses?
  • Are there security or privacy considerations for certain content?

2. Prepare Knowledge Base

Collect, process, and structure the information for effective retrieval. Essential activities:
  • Document collection and curation
  • Text extraction and preprocessing
  • Chunking strategy determination
  • Embedding model selection
  • Metadata definition and extraction
  • Index creation and optimization

3. Configure Retrieval System

Set up the mechanisms for finding relevant information. Key configurations (an example follows the list):
  • Search strategy selection (semantic, keyword, hybrid)
  • Relevance threshold determination
  • Number of results to retrieve
  • Reranking algorithms
  • Metadata filtering rules
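As a sketch, the resulting configuration might look like the following; the keys and values are illustrative, not a Prisme.ai schema:
    # Example retrieval configuration (illustrative keys, not a fixed schema)
    retrieval_config = {
        "strategy": "hybrid",          # semantic + keyword search
        "top_k": 8,                    # number of candidates to retrieve
        "score_threshold": 0.35,       # drop low-relevance matches
        "reranker": "llm",             # secondary ranking stage
        "filters": {"document_type": ["policy", "manual"]},
    }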

4. Design Context Management

Create processes for organizing and using retrieved information. Important considerations:
  • Context window management
  • Information prioritization
  • Source attribution approach
  • Handling conflicting information
  • Managing information overload

5. Develop Prompting Strategy

Create effective prompts that guide the model in using retrieved information. Strategy elements:
  • Clear instructions for information synthesis
  • Guidance on source citation
  • Handling of information gaps
  • Uncertainty management
  • Response structure and formatting

6. Test and Refine

Validate performance and iteratively improve. Testing approaches:
  • Representative query testing
  • Edge case validation
  • Retrieval quality assessment
  • Response accuracy evaluation
  • End-user usability testing

Best Practices

Knowledge Base Quality

The quality of your knowledge base is foundational to RAG performance. Recommendations:
  • Effective chunking strategy: Balance chunk size to capture sufficient context while remaining focused
    # Example chunking configuration
    {
      "chunk_size": 1000,
      "chunk_overlap": 200
    }
    
  • Quality embedding models: Use high-dimensional embeddings for better semantic representation
  • Metadata enrichment: Add metadata to improve filtering and relevance
    # Example metadata fields
    {
      "document_type": "policy",
      "department": "finance",
      "effective_date": "2023-09-01",
      "policy_number": "FIN-2023-05",
      "tags": ["managers", "finance_team"]
    }
    
  • Regular updates: Establish processes for keeping the knowledge base current

Retrieval Optimization

By using AI Builder with custom automations, you can fine-tune the balance between recall (finding all relevant information) and precision (avoiding irrelevant results), which is critical for performance. Recommendations:
  • Hybrid search approaches: Combine semantic and keyword search for better results
    # Example of hybrid search using Tools 
     
     - Knowledge Client.listDocuments:
          page: 0
          limit: 20
          filters:
            - field: content
              type: textSearch
              value: '{{body.arguments.keywords}}'
          includeContent: true
          provider: crawler
          output: documents
          includeMetadata: true
    
  • Query reformulation: Generate multiple query versions to improve retrieval
    # Example query reformulation
    
      - Knowledge Client.chat-completion:
          messages:
            - role: assistant
              content: 'Given the following user question, generate 3 alternative phrasings that express the same intent. Return the result as a JSON object with a "reformulations" key.'
            - role: user
              content: 'Question: "{{user_question}}"'
          output: reformulations
      - set:
          name: output
          value:
            value: '{{reformulations}}'
            description: 'alternative phrasings of user question'
    
  • Reranking: Apply secondary ranking to improve result relevance
  • Context-aware filtering: Use conversation context to narrow search scope

Context Management

How you organize retrieved information significantly impacts response quality. Recommendations:
  • Context format standardization: Provide retrieved information in a consistent format
    # Example context format
    DOCUMENT: [Document Title]
    SOURCE: [Source ID or URL]
    DATE: [Publication or Last Updated Date]
    SECTION: [Section Name]
    CONTENT:
    [Retrieved text]
    
    DOCUMENT: [Next Document Title]
    ...
    
  • Source prioritization: Place more relevant or authoritative sources earlier in context
  • Context instructions: Include explicit instructions for how the model should use the provided information
    # Example context instructions
    The following information has been retrieved from company documentation. 
    Use this information to answer the user's question. 
    If the information is insufficient, acknowledge the limitations of the provided context.
    Always cite the specific document and section when providing information.
    
  • Context window management: Be strategic about what fits in limited context windows

Prompting Strategy

Clear instructions improve how models use retrieved information. Recommendations:
  • Source attribution requirements: Instruct on when and how to cite sources
    # Example attribution instruction
    Always cite the specific document name and section when providing information from the retrieved context.
    Use the format: "According to [Document Name], section [Section Name], ..."
    
  • Information synthesis guidance: Explain how to combine information from multiple sources
    # Example synthesis instruction
    If multiple sources provide relevant information, synthesize them into a coherent response.
    If sources conflict, acknowledge the discrepancy and clarify which source is more authoritative or recent.
    
  • Handling information gaps: Provide clear guidance for when retrieved information is insufficient
    # Example information gap handling
    If the retrieved information doesn't fully answer the user's question:
    1. Provide what information is available
    2. Clearly state what specific aspects cannot be addressed based on the available information
    3. Suggest how the user might find the missing information (e.g., specific documentation to check)
    
  • Uncertainty management: Establish how to handle varying levels of confidence

Common Challenges and Solutions

Retrieval Quality

Problem: The system fails to find relevant information. Solutions:
  • Improve chunking strategy
  • Implement hybrid search
  • Add query reformulation
  • Use better embedding models
  • Add metadata for filtering

Hallucinations Despite RAG

Problem: The model generates inaccurate information despite retrieval. Solutions:
  • Strengthen instructions about using retrieved information
  • Improve relevance of retrieved content
  • Implement fact-checking mechanisms
  • Add explicit source citation requirements
  • Lower temperature settings for generation

Context Window Limitations

Problem: Retrieved information exceeds the model’s context capacity. Solutions:
  • Implement smarter chunking strategies
  • Improve relevance filtering
  • Compress or summarize retrieved content
  • Consider models with larger context windows

Inconsistent Information

Problem: The knowledge base contains conflicting information. Solutions:
  • Implement content governance processes
  • Add metadata for content versioning
  • Provide conflict resolution guidelines
  • Prioritize more recent or authoritative sources
  • Acknowledge conflicts in responses

Knowledge Freshness

Problem: The knowledge base becomes outdated. Solutions:
  • Implement regular update processes
  • Add timestamp metadata to all content
  • Create content expiration policies
  • Integrate with content management systems
  • Monitor and flag outdated information

Advanced RAG Techniques

Once you’ve mastered basic RAG implementation, consider these advanced techniques:

Multi-Stage Retrieval

Using AI Builder, implement a cascade of retrieval steps to improve both efficiency and accuracy. Example process:
1. Initial broad retrieval: Use embeddings to find potentially relevant documents
2. Reranking: Apply more sophisticated models to rerank initial results
3. Focused retrieval: Perform targeted retrieval on the most promising documents
4. Final assembly: Combine the most relevant information for the model
This approach is particularly effective for large knowledge bases, where a single retrieval step may miss important information or retrieve too much irrelevant content. The reranking step can, for example, be implemented with an LLM-based relevance scorer:
  - Knowledge Client.chat-completion:
      messages:
        - role: assistant
          content: |-
              You are an AI assistant tasked with evaluating the relevance of each document to the user's query.
              For each document, assign a relevance score from 1 (not relevant) to 5 (highly relevant), and provide a brief justification for the score.
              Return the results in JSON format, listing the documents in descending order of relevance.
              
              Documents:
              {{retrieved_documents}}

              Expected Output Format:
              {
                "reranked_documents": [
                  {
                    "document_id": "doc1",
                    "score": 5,
                    "justification": "This document directly addresses the user's query with detailed information."
                  },
                  {
                    "document_id": "doc2",
                    "score": 3,
                    "justification": "This document mentions the topic but lacks specific details."
                  },
                  {
                    "document_id": "doc3",
                    "score": 1,
                    "justification": "This document is unrelated to the user's query."
                  }
                ]
              }
        - role: user
          content: 'User Query: {{userQuery}}'
      output: reranked_documents

Query Transformation

Improve retrieval by transforming user queries into more effective search queries. Techniques:
  • Query expansion: Add related terms to improve recall
    Original: "vacation policy"
    Expanded: "vacation policy time off leave holiday absence"
    
  • Hypothetical document embeddings: Generate embeddings for the ideal documents that would answer the query (see the sketch after this list)
  • Query decomposition: Break complex queries into sub-queries
    Original: "What's our policy on remote work expenses for international employees?"
    Decomposed:
    - "remote work policy"
    - "expense reimbursement policy"
    - "international employee policies"
    
  • Conversational context incorporation: Include relevant previous exchanges
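For example, hypothetical document embeddings (HyDE) can be sketched as follows; generate() and embed() are placeholders for your model calls:
    # Example HyDE sketch (generate() and embed() are placeholders)
    def generate(prompt: str) -> str:
        """Placeholder: call your chat model here."""
        return ""

    def embed(text: str) -> list[float]:
        """Placeholder: call your embedding model here."""
        return [0.0]

    def hyde_embedding(question: str) -> list[float]:
        """Embed a model-written ideal answer instead of the raw query."""
        hypothetical = generate(
            f"Write a short passage that would perfectly answer: {question}"
        )
        return embed(hypothetical)  # search the vector index with this instead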

Self-Verification

Implement mechanisms for the agent to verify its own responses. Approach:
1. Generate initial response based on retrieved information
2. Extract factual claims from the response
3. Verify each claim against the retrieved information
4. Correct or qualify statements that aren't fully supported
5. Regenerate improved response
This process can significantly reduce hallucinations and improve accuracy, particularly for complex or nuanced topics.
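A minimal sketch of this loop, with generate() standing in for your model call:
    # Example self-verification sketch (generate() is a placeholder)
    def generate(prompt: str) -> str:
        """Placeholder: call your chat model here."""
        return "yes"

    def verify_response(draft: str, context: str) -> str:
        """Extract claims, check each against context, rewrite if needed."""
        raw = generate(f"List the factual claims in:\n{draft}")
        claims = [c for c in raw.splitlines() if c.strip()]
        unsupported = [
            c for c in claims
            if not generate(
                f"Context:\n{context}\n\nClaim: {c}\n"
                "Is this claim supported by the context? Answer yes or no."
            ).strip().lower().startswith("yes")
        ]
        if not unsupported:
            return draft  # every claim is grounded in the retrieved context
        return generate(
            "Rewrite the answer, removing or qualifying these unsupported claims:\n"
            + "\n".join(unsupported)
            + f"\n\nAnswer:\n{draft}"
        )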

Knowledge Graph Integration

With AI Builder, enhance your RAG system by integrating structured knowledge representation powered by Neo4j or a similar graph database. Benefits:
  • Capture relationships between entities
  • Provide structured context for ambiguous terms
  • Enable reasoning across related concepts
  • Improve query understanding through entity resolution
Knowledge graphs can complement text retrieval by providing structured information about entities and their relationships, particularly valuable in domains with complex interconnections like product catalogs, organizational structures, or technical systems.
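As a sketch of the graph side, the official neo4j Python driver can supply related entities to enrich the retrieved context; the Product/RELATES_TO schema and connection details here are assumptions for illustration:
    # Example knowledge-graph lookup (schema and credentials are illustrative)
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    def related_entities(name: str) -> list[str]:
        """Fetch entities connected to a product to enrich RAG context."""
        query = (
            "MATCH (p:Product {name: $name})-[:RELATES_TO]->(e) "
            "RETURN e.name AS name LIMIT 10"
        )
        with driver.session() as session:
            return [record["name"] for record in session.run(query, name=name)]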

Implementation in Prisme.ai

Prisme.ai provides comprehensive support for RAG agent implementation through:
  • AI Knowledge
  • AI Builder
The AI Knowledge product provides a no-code interface for creating RAG agents.
Key features:
  • Document processing and management
  • Knowledge base creation and organization
  • RAG configuration and optimization
  • Agent creation and deployment
  • Performance analytics and monitoring
Learn more about AI Knowledge →