The Collection app is a powerful infrastructure component in the Prisme.ai ecosystem that provides simplified access to a document database. It enables you to store, retrieve, and manage structured data without the complexity of setting up and maintaining a separate database system.

Overview

Collection serves as a managed database service integrated directly into the Prisme.ai platform:

Data Storage

Store structured data in document collections

Query Capabilities

Retrieve and filter data with powerful query options

Data Management

Create, update, and delete records with simple operations

Integration Ready

Seamlessly connect with automations and workflows
This infrastructure app is particularly valuable for persistently storing information, managing application state, and building data-driven automations without external database dependencies.

Release Note: Collection with PostgreSQL

With this release, Collection instructions become compatible with PostgreSQL databases while keeping the usual MongoDB query and update syntax.
The release also brings performance improvements and new features, along with a few breaking changes and subtle MongoDB/PostgreSQL differences to note.

Schema Configuration

Collections now require explicit schema definitions in the application config.

Cross-Database Compatibility

Collections can now work across MongoDB and PostgreSQL, with some differences detailed below.

Breaking Changes

Some MongoDB options were deprecated or modified. See below for full details.

New Features

Support for a new aggregate instruction across both MongoDB and PostgreSQL. The existing distinct instruction has also been improved with new features.

Breaking Changes

These breaking changes may affect your existing MongoDB-based Collections.
Please update your configuration and code accordingly.
1. Connection Pooling

  • Previously, MongoDB requests were executed by the prismeai-functions microservice, so the total number of open MongoDB clients was tied directly to the number of prismeai-functions replicas and their scaling.
  • Collection clients are now opened only from prismeai-runtime (which supports multithreading), so minPoolSize and maxPoolSize must be scaled relative to RUNNER_MAX_THREADS.
  • Example: if RUNNER_MAX_THREADS=2, divide your pool sizes by 2.
2. Required Permissions

  • Schema enforcement now requires three additional MongoDB user permissions (or their SQL equivalents): listCollections, listIndexes, dropIndexes
3. Update Instructions

  • Collection.updateMany no longer supports options.upsert.
  • Collection.updateOne deprecates options.upsert; use the dedicated Collection.upsert instruction instead.
  • Collection.updateOne and Collection.updateMany no longer support options.replace.
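As a sketch of the migration, an update that previously relied on options.upsert can be rewritten with Collection.upsert. The field names (email, name) are illustrative, and email is assumed to be covered by a unique constraint in the collection schema.

```yaml
# Before (deprecated): updateOne with options.upsert
# - Collection.updateOne:
#     query:
#       email: "john.doe@example.com"
#     data:
#       name: "John Doe"
#     options:
#       upsert: true

# After: dedicated upsert instruction
- Collection.upsert:
    data:
      email: "john.doe@example.com"
      name: "John Doe"
    options:
      onConflictFields:
        - email   # property that identifies an existing document
    output: upsertResult
```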
4. Allowed Update Operators

Only the following MongoDB update operators are supported: $push, $set, $inc, $addToSet, $pull.
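For illustration, here is how two of the supported array operators might be used. This is a sketch that assumes the collection schema declares a tags property of type array; the tag values are made up.

```yaml
# Add a tag only if it is not already present in the array
- Collection.updateOne:
    query:
      email: "john.doe@example.com"
    data:
      $addToSet:
        tags: "newsletter"
    output: addResult

# Remove a tag from the array
- Collection.updateOne:
    query:
      email: "john.doe@example.com"
    data:
      $pull:
        tags: "trial"
    output: pullResult
```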
5. Safer Deletes and Updates

  • updateOne, updateMany, deleteOne and deleteMany now raise an error if the query is empty/undefined.
  • To allow matching all documents, use the overrideSecurity: true parameter.
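A minimal sketch of the opt-in follows. The exact placement of overrideSecurity is an assumption here (shown at the top level of the instruction, next to query); check your platform version before relying on it.

```yaml
# Rejected: an empty query now raises an error
# - Collection.deleteMany:
#     query: {}

# Explicit opt-in to match every document.
# Assumption: overrideSecurity sits at the instruction's top level.
- Collection.deleteMany:
    query: {}
    overrideSecurity: true
    output: deleteResult
```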
6. Schema Enforcement

  • All collections must now define collectionName and properties in their app config (see "Configuring a Collection Schema" for an example).
  • Queries referencing unknown fields raise errors.
  • Insert and update operations with unknown fields also fail.
  • Every property defined in the schema is initialized to null when no value is provided (if it is nullable), and find returns it as null.

🆕 New Features

  • Aggregate
  • Distinct
Use the aggregate instruction to calculate sums, averages, or counts, optionally grouped by a column, on both MongoDB and PostgreSQL collections.

Key Features

  • Document Storage
  • Query Capabilities
  • Data Manipulation
  • Advanced Features
Store flexible document structures:
  • Schema Enforcement: Ensure your data respects a predefined schema
  • Nested Data Support: Store complex, hierarchical data
  • Data Types: Support for strings, numbers, booleans, arrays, objects, dates
  • Automatic Indexing: Optimized for fast retrieval
This flexible storage model accommodates a wide range of data needs.

How Collection Works

Collection provides a MongoDB-compatible interface integrated directly into the Prisme.ai platform:
1. Collections Organization

Data is organized into collections, similar to tables in relational databases:
  • Each collection contains related documents
  • Collections are created automatically when used
  • A schema (collectionName and properties) must be defined in the app config
  • Each workspace has its own collection namespace
2. Document Structure

Documents are stored as JSON-like objects:
  • Each document has a unique _id field
  • Documents must match the properties declared in the collection schema; unknown fields are rejected
  • Fields can contain various data types
  • Nested objects and arrays are supported (through the json and array property types)
3. Data Operations

Operations are performed through simple, intuitive methods:
  • Commands follow MongoDB syntax and patterns
  • Results are returned in standard formats
  • Operations are executed in a secure environment
  • Performance is optimized for common use cases
4. Integration

Collection integrates with the rest of the Prisme.ai ecosystem:
  • Direct usage in automations
  • Connection to AI agents through tools
  • Data exchange with other platform components
  • Role-based access control
This approach provides the power of a document database with the simplicity of a fully managed service.

⚖️ MongoDB vs PostgreSQL Differences

  • $in Operator
  • NULL vs $ne
  • Array Queries
  • Array order on updates
  • Nested JSON Array Queries
  • Upserts
  • MongoDB: $in: [] matches nothing.
  • PostgreSQL: $in: [] matches everything! Always check that your input array is not empty before running $in queries on PostgreSQL.

🛠️ Configuring a Collection Schema

Defining a schema is now mandatory for collections in order to enforce validation and ensure cross-database compatibility.
slug: Messages
config:
  collectionName: Messages
  indexes:
    - properties: children
    - properties:
        - conversationId
        - from.id
  uniques:
    - properties: conversationId
  properties:
    conversationId:
      type: text
      nullable: false
    children:
      type: text
      nullable: false
    content:
      type: text
      nullable: true
    from:
      type: json
      nullable: false
    tags: 
      type: array
      nullable: false
    messagesCount:
      type: number
  • The array type only supports text arrays
  • Use json type for both objects and arrays of objects
  • Non-nullable properties must be set on inserts and upserts; otherwise the query fails.
Available property types include: string, text, date, time, datetime, number, double, float, integer, decimal, boolean, uint8array, array, enum, enumArray, json, blob
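To make the nullability rules concrete, here is a sketch of an insert that would satisfy the Messages schema above. The values are invented for illustration; every non-nullable property is provided, while the nullable content property is left out.

```yaml
- Collection.insert:
    data:
      conversationId: "conv-123"   # text, non-nullable
      children: "msg-456"          # text, non-nullable
      from:                        # json, non-nullable (object)
        id: "user-1"
      tags:                        # array (text values only), non-nullable
        - "support"
      messagesCount: 0             # number
      # content is nullable, so it can be omitted here
    output: result
```

Omitting conversationId, children, from, or tags from this insert would make it fail, since those properties are declared with nullable: false.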

Basic Operations

Let's explore the core operations you can perform with Collection:
Add documents to a collection:
# Insert a single document
- Collection.insert:
    data:
      name: "John Doe"
      email: "john.doe@example.com"
      age: 30
      active: true
    output: result

# Insert multiple documents
- Collection.insertMany:
    data:
      - name: "Jane Smith"
        email: "jane.smith@example.com"
        age: 28
      - name: "Bob Johnson"
        email: "bob.johnson@example.com"
        age: 35
    output: result
The operation returns information about the inserted documents, including their assigned _id values.
Retrieve documents from a collection:
# Find documents matching criteria
- Collection.find:
    query:
      age: { $gt: 25 }
    output: users

# Find a single document
- Collection.findOne:
    query:
      email: "john.doe@example.com"
    output: user

# Find + sort + pagination
- Collection.find:
    query:
      active: true
    sort:
      age: -1  # Descending
    options:
      limit: 10
      skip: 0
    output: activeUsers
These operations allow you to retrieve documents with precise filtering and control over the results.
By default, updates use the MongoDB $set operator, so only the given fields are updated and the other fields of the matched record are left untouched:
# Update a single document
- Collection.updateOne:
    query:
      email: "john.doe@example.com"
    data:
      age: 31
      lastUpdated: "{{run.date}}"
    output: updateResult
You can also use any of the supported MongoDB update operators explicitly:
# Update a single document
- Collection.updateOne:
    query:
      email: "john.doe@example.com"
    data:
      $set:
        age: 31
        lastUpdated: "{% now() %}"
    output: updateResult

# Update multiple documents
- Collection.updateMany:
    query:
      active: false
    data:
      $set:
        status: "inactive"
        lastChecked: "{{run.date}}"
    output: updateResult

# Update with advanced operators
- Collection.updateOne:
    query:
      _id: "{{userId}}"
    data:
      $inc:
        loginCount: 1
      $push:
        loginHistory:
          date: "{{run.date}}"
          ip: "{{userIp}}"
    output: updateResult
These operations enable precise updates to documents, including field modifications, additions, and array operations.
Upserts let you create a document, or update it if it already exists, based on a list of properties (onConflictFields) that must stay unique across the collection:
- Collection.upsert:
    data:
      type: city
      name: Toulouse
    options:
      onConflictFields:
        - name
      onInsertValues:
        createdAt: '...'
    output: upsert
onInsertValues is optional and lets you specify data that is included only when the document is created, not when it is updated.
With PostgreSQL, a compound unique index is required for all onConflictFields.
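As an illustration of the PostgreSQL requirement, the sketch below pairs a schema declaring a compound unique constraint with an upsert that conflicts on both fields. The Cities collection is hypothetical, and the list form under uniques is an assumption borrowed from the indexes syntax shown in the schema example.

```yaml
# App config (hypothetical Cities collection)
slug: Cities
config:
  collectionName: Cities
  uniques:
    - properties:        # assumption: same list form as in indexes
        - type
        - name
  properties:
    type:
      type: text
      nullable: false
    name:
      type: text
      nullable: false
---
# Automation step: upsert keyed on both fields of the unique constraint
- Collection.upsert:
    data:
      type: city
      name: Toulouse
    options:
      onConflictFields:
        - type
        - name
    output: upsert
```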
Remove documents from a collection:
# Delete a single document
- Collection.deleteOne:
    query:
      email: "john.doe@example.com"
    output: deleteResult

# Delete multiple documents
- Collection.deleteMany:
    query:
      active: false
      lastLogin: { $lt: "{% dateAdd('now', -90, 'days') %}" }
    output: deleteResult
These operations allow you to remove documents based on specific criteria.
Retrieve all distinct values of a column, optionally with their counts and sorted:
# Distinct values with counts
- Collection.distinct:
    query:
      projectId: '{{projectId}}'
      userDocument:
        $ne: true
      createdAt:
        $gt: '{{dateStart}}'
        $lt: '{{dateEnd}}'
    field: 'tags'
    opts:
      count: true
      valueField: distinctTags
      sort:
        count: -1
    output: res    
Perform complex data analysis:
# Aggregate data
- Collection.aggregate:
    query:
      projectId: '{{projectId}}'
      isArchived:
        $ne: true
      createdAt:
        $gt: '{{dateStart}}'
        $lt: '{{dateEnd}}'
    opts:
      groupBy: department
      groupField: departmentGroup
      sort:
        count: -1
    steps:
      - inputField: size
        type: sum
        outputField: totalSize
      - inputField: age
        type: avg
        outputField: avgAge     
      - inputField: _id
        type: count
        outputField: count
    output: res    
Pagination is enforced: by default, only the first 50 matching entries are returned. This number is configurable with options.limit.
Use options.page to choose which page to retrieve, starting at 1 for the first page:
# First page
- Collection.find:
    query: {}
    options:
      limit: 50
      page: 1
    output: firstPage
# Second page
- Collection.find:
    query: {}
    options:
      limit: 50
      page: 2
    output: secondPage        
Alternatively, use options.skip to select the starting offset precisely:
# Retrieve the 11th to the 60th record
- Collection.find:
    query: {}
    options:
      limit: 50
      skip: 10
    output: firstPage
You can also restrict which document properties are returned:
- Collection.find:
    query: {}
    options:
      fields:
        - projectId
        - userDocument
        - createdAt
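These options can be combined in a single call. The sketch below reuses only the parameters shown above (sort, limit, page, fields); the query values and output name are illustrative.

```yaml
# Third page of 20 results, newest first, returning only two properties
- Collection.find:
    query:
      projectId: '{{projectId}}'
    sort:
      createdAt: -1    # Descending
    options:
      limit: 20
      page: 3          # with limit 20, this covers the 41st to the 60th record
      fields:
        - projectId
        - createdAt
    output: thirdPage
```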

Advanced Features

Collection includes several advanced features that enable sophisticated data management:

Indexing

Create indexes to optimize query performance:
  • Single-field indexes
  • Compound indexes
  • Text indexes for full-text search
  • Unique indexes for constraint enforcement

Transactions

Ensure data consistency with multi-document transactions:
  • Atomic operations across multiple documents
  • Rollback on error
  • Consistent reads within a transaction
  • Isolation levels

Geospatial

Store and query location data:
  • GeoJSON format support
  • Proximity queries
  • Geospatial indexing
  • Area containment queries

Schema Validation

Schema validation for data consistency:
  • JSON Schema validation
  • Custom validation rules
  • Validation actions (error or warning)
  • Field restriction
These advanced features provide additional capabilities for specific use cases and requirements.

Common Use Cases

Collection enables a wide range of use cases:

User Management

Store and manage user information:
  • User profiles
  • Preferences
  • Activity history
  • Authentication data

Content Management

Manage structured content:
  • Articles and posts
  • Product information
  • Media metadata
  • Categorization and tagging

Workflow State

Track process and workflow state:
  • Status tracking
  • Approval flows
  • Stage information
  • Audit history

Data Collection

Collect and store form submissions:
  • Survey responses
  • Application data
  • Contact requests
  • Registration information

Integration with Prisme.ai Products

Collection works seamlessly with other Prisme.ai products:
  • AI Knowledge
  • AI Builder
  • Custom Code
  • API Integrations
Enhance knowledge bases with Collection:
  • Store metadata about knowledge base documents
  • Track usage patterns and popular queries
  • Maintain user feedback on responses
  • Save and manage test results
This integration improves knowledge management and quality assurance.

Example: Contact Management System

Here's an example of using Collection to build a contact management system:
1. Define Data Structure

Plan your data organization:
  • Contacts collection for individual contacts
  • Companies collection for organization information
  • Interactions collection for communication history
  • Tags collection for categorization
2. Create Storage Operations

Implement data storage automations:
# Add a new contact
slug: add-contact
name: Add Contact
do:
  - Collection.insert:
      data:
        firstName: "{{payload.firstName}}"
        lastName: "{{payload.lastName}}"
        email: "{{payload.email}}"
        phone: "{{payload.phone}}"
        company: "{{payload.company}}"
        title: "{{payload.title}}"
        tags: "{{payload.tags}}"
        createdAt: "{% now() %}"
      output: result
  - emit:
      event: contact-added
      payload:
        contact: "{{result}}"
3. Implement Query Operations

Create data retrieval automations:
# Search for contacts
slug: search-contacts
name: Search Contacts
do:
  - set:
      name: query
      value: {}
  - conditions:
      '{{payload.searchTerm}}':
        - set:
            name: query
            value:
              $or:
                - firstName: { $regex: "{{payload.searchTerm}}", $options: "i" }
                - lastName: { $regex: "{{payload.searchTerm}}", $options: "i" }
                - email: { $regex: "{{payload.searchTerm}}", $options: "i" }
                - company: { $regex: "{{payload.searchTerm}}", $options: "i" }
      '{{payload.tags}}':
        - set:
            name: query.tags
            value: { $in: "{{payload.tags}}" }
      default: []
  - Collection.find:
      query: "{{query}}"
      sort:
        lastName: 1
        firstName: 1
      options:
        limit: "{{payload.limit || 20}}"
        skip: "{{payload.skip || 0}}"
      output: contacts
  - emit:
      event: search-results
      payload:
        contacts: "{{contacts}}"
        query: "{{query}}"
4. Create User Interface

Build a UI to interact with your data:
  • Contact list view
  • Contact detail view
  • Add/edit contact forms
  • Search and filtering
5. Implement Business Logic

Add specialized functionality:
  • Duplicate detection
  • Contact merging
  • Import/export capabilities
  • Notification system
This example demonstrates how Collection can serve as the data layer for a complete application.
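As one possible take on the duplicate detection item, the automation below looks up an existing contact by email before inserting. It is a sketch, not a reference implementation: the slug and the contact-duplicate event are hypothetical, and it assumes Collection.findOne leaves its output empty when nothing matches.

```yaml
# Add a contact only if no contact with the same email exists
slug: add-contact-if-new
name: Add Contact If New
do:
  - Collection.findOne:
      query:
        email: "{{payload.email}}"
      output: existing
  - conditions:
      # A contact with this email was found: report it instead of inserting
      '{{existing}}':
        - emit:
            event: contact-duplicate   # hypothetical event name
            payload:
              contact: "{{existing}}"
      # No match: create the contact
      default:
        - Collection.insert:
            data:
              firstName: "{{payload.firstName}}"
              lastName: "{{payload.lastName}}"
              email: "{{payload.email}}"
              createdAt: "{% now() %}"
            output: result
        - emit:
            event: contact-added
            payload:
              contact: "{{result}}"
```

A unique constraint on email in the collection schema would enforce the same rule at the database level, which also protects against two concurrent inserts.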

Best Practices

Follow these recommendations to get the most from Collection:
Design your data structure effectively:
  • Use descriptive collection names
  • Choose between embedding and referencing based on access patterns
  • Keep document size reasonable (under 1MB when possible)
  • Normalize data when it changes frequently
  • Denormalize data to optimize common queries
  • Use consistent field names across collections
Effective data modeling improves performance and maintainability.
Optimize your queries for better performance:
  • Create indexes for frequently queried fields
  • Write specific queries that use indexes
  • Limit the number of documents returned
  • Avoid complex regex patterns when possible
  • Use aggregation for data processing, not application code
These practices ensure efficient data retrieval.
Ensure data quality and consistency:
  • Validate input data before storage
  • Consider using schema validation for critical collections
  • Implement application-level validation for complex rules
  • Use unique indexes to prevent duplicates
  • Include creation and update timestamps
  • Maintain audit trails for sensitive data
Validation helps maintain data integrity.
Protect your data with proper security practices:
  • Apply proper access controls
  • Validate input to prevent injection attacks
  • Don't store sensitive data without encryption
  • Implement appropriate backup strategies
  • Audit access to sensitive collections
  • Follow the principle of least privilege
Security should be a fundamental consideration in all data operations.

Limitations and Considerations

When using Collection, be aware of these considerations:
  • Document Size: Individual documents are limited to 16MB
  • Nested Depth: Deep nesting of objects can impact performance
  • Query Complexity: Very complex queries may have performance implications
  • Transaction Limits: Transactions have time and size limitations
  • Indexing Overhead: Indexes improve query performance but increase storage requirements and write overhead
  • Consistency Model: Collection uses an eventually consistent model in some scenarios
