RAG Configuration
Fine-tune how AI retrieves and uses your organizational knowledge
Retrieval-Augmented Generation (RAG) is the core technology that allows AI agents to draw on your organization’s knowledge base. Prisme.ai provides two approaches for customizing RAG behavior: YAML-based tools and webhooks. Both give you granular control over each stage of the RAG pipeline.
Configuration Approaches
For basic configurations, you can use the built-in UI settings:
- Instructions
  - Keep instructions clear and precise, organized with titles and sections.
  - Specify the language (reply in the user’s language, reply in English only, …)
  - Tone (friendly, empathic, formal, …)
  - Output instructions (concise, detailed, …)
  - Documents and tool results are inserted at the ${context} keyword.
  - The date is inserted at the ${date} keyword. Specify your users’ timezone.
- Text Splitter Configuration
  - Chunk Size
  - Chunk Overlap
  - Enable override by document
- Embeddings Settings
  - Number of chunks to retrieve
- Self-Query
  - Enable LLM query reformulation
  - Configure from AI Store input
- Query Enhancement
  - Select model
  - Add instructions and definitions
- Post-Processing
  - Show suggested questions
  - Filter displayed sources
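To make the Chunk Size and Chunk Overlap settings concrete, here is a minimal Python sketch of a fixed-size splitter with overlap. It is not Prisme.ai’s implementation: the function name `split_text` and the choice to measure sizes in characters (rather than tokens) are assumptions made for illustration.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters, where each window
    repeats the last chunk_overlap characters of the previous one.

    Illustrative only: sizes are counted in characters (an assumption).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

A larger overlap reduces the chance that a sentence is cut in half at a chunk boundary, at the cost of indexing more redundant text.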
For advanced customization, create YAML-based tools to override specific parts of the RAG pipeline. YAML tools give you complete control with the full power of AI Builder.
For external processing or integration with existing systems, configure webhooks to intercept and modify RAG behavior. Webhooks provide maximum flexibility and integration capabilities.
RAG Pipeline Components
The RAG pipeline in Prisme.ai consists of several stages, each of which can be customized using YAML tools or webhooks:
Query Processing
Transform and enhance the user’s question before retrieval
Retrieval
Find relevant documents in your knowledge base
Context Assembly
Organize retrieved documents into a coherent context
Prompt Generation
Create the prompt that will be sent to the LLM
Response Generation
Generate the final answer using the LLM
Post-Processing
Enhance the response with additional information or formatting
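The six stages above can be pictured as a chain of small functions. The Python sketch below is a conceptual model only, not Prisme.ai’s internals: `run_pipeline`, `Chunk`, and the `retrieve` and `generate` callables are hypothetical names. The one detail taken from this page is that the instructions may contain the ${context} and ${date} keywords, which Python’s `string.Template` happens to support directly.

```python
from dataclasses import dataclass
from datetime import date
from string import Template
from typing import Callable, Optional


@dataclass
class Chunk:
    text: str
    score: float  # relevance score from the retriever (hypothetical field)


def run_pipeline(
    question: str,
    retrieve: Callable[[str], list],   # hypothetical: returns a list of Chunk
    generate: Callable[[str], str],    # hypothetical: calls the LLM
    instructions: str,
    top_k: int = 3,
    today: Optional[date] = None,
) -> dict:
    # 1. Query processing: normalize whitespace in the user's question.
    query = " ".join(question.split())
    # 2. Retrieval: keep the top_k highest-scoring chunks.
    chunks = sorted(retrieve(query), key=lambda c: c.score, reverse=True)[:top_k]
    # 3. Context assembly: join the chunk texts into one context string.
    context = "\n\n".join(c.text for c in chunks)
    # 4. Prompt generation: fill the ${context} and ${date} keywords.
    today_str = (today or date.today()).isoformat()
    prompt = Template(instructions).safe_substitute(context=context, date=today_str)
    # 5. Response generation: send prompt and question to the LLM.
    answer = generate(prompt + "\n\nQuestion: " + query)
    # 6. Post-processing: trim the answer and attach the sources used.
    return {"answer": answer.strip(), "sources": [c.text for c in chunks]}
```

Each numbered step corresponds to one stage that a YAML tool or webhook could replace.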
YAML Tool Examples
Query Reformulation
This tool enhances user queries by generating alternative phrasings:
Usage scenario: Improve retrieval quality for ambiguous or tersely worded queries.
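As a rough illustration of the idea (written in Python rather than as a Prisme.ai YAML tool), the sketch below asks a model for alternative phrasings and cleans up the reply. The `llm` callable, the prompt wording, and the function name `reformulate` are all assumptions; any text-in, text-out model client could be plugged in.

```python
import re
from typing import Callable

# Matches a leading list marker such as "- ", "* ", "1. " or "2) ".
_LIST_MARKER = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")


def reformulate(query: str, llm: Callable[[str], str], n: int = 3) -> list[str]:
    """Return the original query followed by up to n distinct rephrasings."""
    prompt = (
        f"Rewrite the following question in {n} different ways, "
        f"one per line, keeping its meaning:\n{query}"
    )
    original = query.strip()
    seen = {original.lower()}
    variants = []
    for line in llm(prompt).splitlines():
        candidate = _LIST_MARKER.sub("", line).strip()
        # Skip blank lines and phrasings we already have.
        if candidate and candidate.lower() not in seen:
            seen.add(candidate.lower())
            variants.append(candidate)
    return [original] + variants[:n]
```

Retrieval can then be run once per variant and the results merged, which tends to help when the user’s wording differs from the wording in the documents.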
Web Search Tool
This tool augments the knowledge base with real-time web search results:
Usage scenario: Supplement your knowledge base with up-to-date information from the web.
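The merging step can be sketched in Python as follows. This is an illustration, not the actual tool: `web_search` stands in for whatever search provider you use, and the `text` and `origin` keys are hypothetical field names.

```python
from typing import Callable


def augment_with_web(
    query: str,
    kb_chunks: list,
    web_search: Callable[[str], list],  # hypothetical search client
    max_web_results: int = 2,
) -> list:
    """Append a few web results to the knowledge base chunks, tagging each
    item with its origin so the prompt or UI can distinguish them."""
    merged = [dict(chunk, origin="knowledge_base") for chunk in kb_chunks]
    seen = {chunk["text"] for chunk in kb_chunks}
    for result in web_search(query)[:max_web_results]:
        if result["text"] not in seen:  # skip exact duplicates of KB content
            seen.add(result["text"])
            merged.append(dict(result, origin="web"))
    return merged
```

Tagging the origin makes it possible to tell the model, or the end user, which statements come from vetted internal documents and which come from the open web.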
Semantic Chunking
This tool implements intelligent document chunking based on semantic boundaries:
Usage scenario: Improve retrieval quality for documents with complex structure or mixed topics.
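A simplified Python sketch of boundary-aware chunking is shown below. It uses blank-line paragraph breaks as a stand-in for semantic boundaries, which is an assumption; a production tool might instead compare sentence embeddings or follow document headings. The name `semantic_chunks` and the `max_chars` parameter are hypothetical.

```python
def semantic_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Group whole paragraphs into chunks of at most max_chars characters.

    A paragraph is never split, so a single paragraph longer than
    max_chars becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a paragraph boundary
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Compared with a fixed-size splitter, this keeps each chunk topically coherent, which is the property the retrieval step benefits from.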
Dynamic RAG Parameters
This tool dynamically adjusts retrieval parameters based on query complexity:
Usage scenario: Automatically optimize retrieval for different query types (simple vs. complex, factual vs. exploratory).
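One simple way to approximate this behavior is a heuristic on the query text, sketched here in Python. The marker words, thresholds, and parameter names (`top_k`, `min_score`) are illustrative assumptions, not Prisme.ai defaults; a real tool could also ask an LLM to classify the query.

```python
# Words that suggest an exploratory rather than a factual question (assumed list).
EXPLORATORY_MARKERS = {"why", "how", "compare", "explain", "difference"}


def choose_retrieval_params(query: str) -> dict:
    """Pick retrieval settings from a rough estimate of query complexity."""
    words = [w.lower().strip("?,.") for w in query.split()]
    is_exploratory = any(w in EXPLORATORY_MARKERS for w in words)
    is_long = len(words) > 12
    if is_exploratory or is_long:
        # Broad questions: retrieve more chunks and accept weaker matches.
        return {"top_k": 8, "min_score": 0.4}
    # Short factual questions: a few high-confidence chunks are enough.
    return {"top_k": 3, "min_score": 0.7}
```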
Context Fusion
This tool combines information from multiple retrieved chunks into a coherent context:
Usage scenario: Create more coherent context for complex queries that require information from multiple documents.
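The Python sketch below shows one possible fusion strategy: group chunks by source document, restore their original order, and drop exact duplicates. The chunk fields `source`, `position`, and `text` are hypothetical, and other strategies (for example LLM-based summarization of the chunks) are equally valid.

```python
from collections import defaultdict


def fuse_context(chunks: list) -> str:
    """Merge retrieved chunks into one context string, grouped by source."""
    by_source = defaultdict(list)
    for chunk in chunks:
        by_source[chunk["source"]].append(chunk)

    sections = []
    for source, items in by_source.items():
        items.sort(key=lambda c: c["position"])  # restore document order
        seen = set()
        texts = []
        for chunk in items:
            if chunk["text"] not in seen:  # drop exact duplicate chunks
                seen.add(chunk["text"])
                texts.append(chunk["text"])
        sections.append(f"[{source}]\n" + "\n".join(texts))
    return "\n\n".join(sections)
```

Grouping by source keeps related passages adjacent, so the model sees each document’s content in reading order instead of in relevance order.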
Webhook Integration
Webhooks provide an alternative approach to customizing the RAG pipeline by intercepting key events and modifying the behavior via external HTTP endpoints.
Webhook Configuration
To configure a webhook for your AI Knowledge agent:
- Go to your agent’s settings
- Navigate to the “Webhooks” section
- Enter your HTTPS webhook URL
- Select which events to subscribe to
All webhook endpoints must use HTTPS and respond within 30 seconds to avoid timeouts.
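The two constraints above (HTTPS only, reply within 30 seconds) can be guarded on your side with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the 25-second budget is an arbitrary safety margin, and returning an empty object on timeout assumes that an empty reply means "no override", which you should confirm against the webhook contract.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from urllib.parse import urlparse


def is_https_url(url: str) -> bool:
    """Return True only for absolute URLs that use the https scheme."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.netloc)


def respond_within(handler, payload, budget_seconds: float = 25.0, fallback=None):
    """Run handler(payload), but give up after budget_seconds.

    On timeout the fallback is returned (an empty dict by default; treating
    that as "no override" is an assumption). The worker thread is not
    killed, so the handler itself should also use I/O timeouts.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(handler, payload)
    try:
        return future.result(timeout=budget_seconds)
    except FutureTimeout:
        return {} if fallback is None else fallback
    finally:
        pool.shutdown(wait=False)  # do not block the response on the worker
```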
Document Events
Webhooks can intercept document creation, update, and deletion events:
Request Example (Document Creation):
Response Example (Modify Document):
This allows you to:
- Preprocess documents before indexing
- Add or modify metadata
- Override chunking settings per document
- Reject documents that don’t meet criteria
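The four capabilities listed above could be handled by a function like the following Python sketch. Every field name here (`document`, `text`, `metadata`, `action`, `chunk_size`) is hypothetical and chosen only for illustration; the real request and response schema must be taken from the webhook payloads your agent actually sends.

```python
MIN_LENGTH = 20  # assumed rejection threshold, in characters


def handle_document_event(event: dict) -> dict:
    """Preprocess a document, enrich its metadata, pick a chunk size,
    or reject it. All payload keys are hypothetical."""
    doc = event.get("document", {})
    text = doc.get("text", "").strip()

    # Reject documents that don't meet the criteria.
    if len(text) < MIN_LENGTH:
        return {"action": "reject", "reason": "document too short"}

    # Add or modify metadata (copy so the incoming event is not mutated).
    metadata = dict(doc.get("metadata", {}))
    metadata["word_count"] = len(text.split())

    return {
        "action": "update",
        "document": {
            "text": text,  # preprocessed (here: trimmed) content
            "metadata": metadata,
            # Override chunking per document: larger chunks for long texts.
            "chunk_size": 800 if len(text) > 5000 else 400,
        },
    }
```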
Query Interception
Webhooks can intercept user queries and modify various aspects of the RAG process:
Request Example:
Response Options:
- Override Retrieved Context:
- Override Prompt Generation:
- Override Answer Generation:
- Override AI Parameters:
- Override Search Results:
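To show how a single endpoint might choose between these response options, here is a hedged Python sketch. The payload keys (`query`, `answer`, `ai_parameters`), the term list, and the convention that an empty object means "no override" are all assumptions for illustration, not the documented schema.

```python
# Terms that should never be answered from the knowledge base (assumed list).
SENSITIVE_TERMS = ("salary", "password")


def handle_query_event(event: dict) -> dict:
    """Decide which part of the RAG process to override for this query.
    All payload keys are hypothetical."""
    query = event.get("query", "").lower()

    # Override answer generation: short-circuit sensitive questions.
    if any(term in query for term in SENSITIVE_TERMS):
        return {"answer": "I can't help with that request. Please contact your administrator."}

    # Override AI parameters: make time-sensitive answers deterministic.
    if "latest" in query or "today" in query:
        return {"ai_parameters": {"temperature": 0.0}}

    # No override: let the default pipeline handle the query (assumed convention).
    return {}
```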
Test Results Webhook
Webhooks can also intercept test results for analysis and evaluation:
Request Example:
Response Example:
This allows you to:
- Implement custom evaluation metrics
- Track test results in external systems
- Apply domain-specific scoring criteria
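As an example of a custom evaluation metric, the Python sketch below scores an answer by how many expected keywords it contains. The metric, the function name `keyword_coverage`, and the idea that each test case carries a keyword list are assumptions; substitute whatever scoring your domain requires.

```python
def keyword_coverage(answer: str, expected_keywords: list) -> float:
    """Return the fraction of expected keywords found in the answer
    (case-insensitive substring match). Illustrative metric only."""
    if not expected_keywords:
        return 1.0  # nothing was required, so nothing is missing
    lowered = answer.lower()
    found = sum(1 for keyword in expected_keywords if keyword.lower() in lowered)
    return found / len(expected_keywords)
```

A webhook could compute this score for each intercepted test result and forward it to an external dashboard or fail the test below a chosen threshold.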
Combining YAML Tools and Webhooks
For the most sophisticated RAG configurations, you can combine YAML tools and webhooks:
Sequential Pipeline
Chain multiple YAML tools to create a sequential processing pipeline, with webhooks for external integration at key points.
Example: YAML tool for query reformulation → webhook for sensitive query detection → YAML tool for retrieval customization
Fallback Mechanisms
Configure webhooks as fallbacks when YAML tools don’t produce satisfactory results.
Example: Try native retrieval first; if no good matches are found, call a webhook to query external knowledge bases
A/B Testing
Use different YAML tools or webhooks based on query characteristics or for experimentation.
Example: Route technical questions through one pipeline and customer support questions through another
Hybrid Processing
Let webhooks handle some RAG components while YAML tools handle others.
Example: Webhook handles retrieval from proprietary databases, YAML tool handles context optimization
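The fallback pattern described above can be sketched in Python as follows. Both retrievers are passed in as callables, so the sketch stays independent of any platform API; the names, the `score` field, and the 0.5 threshold are hypothetical.

```python
from typing import Callable


def retrieve_with_fallback(
    query: str,
    native_retrieve: Callable[[str], list],    # hypothetical built-in retrieval
    external_retrieve: Callable[[str], list],  # hypothetical webhook-backed retrieval
    min_score: float = 0.5,
) -> tuple:
    """Use native results when at least one is good enough; otherwise
    fall back to the external knowledge base."""
    good = [r for r in native_retrieve(query) if r["score"] >= min_score]
    if good:
        return good, "native"
    return external_retrieve(query), "external"
```

Returning the name of the path taken alongside the results makes it easy to monitor how often the fallback fires.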
Best Practices
YAML Tool Development
- Start with simple tools and incrementally add complexity
- Use a consistent naming convention for tools
- Thoroughly test tools with varied inputs
- Document each tool’s purpose and expected inputs/outputs
- Consider performance implications for complex processing
Webhook Implementation
- Ensure webhooks are hosted on reliable, low-latency infrastructure
- Implement proper error handling and fallbacks
- Cache results when appropriate to improve response times
- Use secure authentication to protect sensitive data
- Monitor webhook performance and error rates
RAG Pipeline Design
- Clearly define which components require customization
- Choose the appropriate approach (UI, YAML, webhook) based on complexity
- Test changes incrementally to isolate effects
- Monitor key metrics before and after changes
- Document your RAG pipeline configuration for maintainability