Sizing depends on your use case, whether it is full automation or chat mode. We strongly recommend conducting your own load testing tailored to your infrastructure and actual usage patterns.

Target Performance Metrics

  • User Interactions: 4-10 per user (average number of interactions each user makes with the platform)
  • First Token Response: 478 ms at P95 (time to first token from the LLM API, using OpenAI as the reference)
  • Concurrent Users: 100 new users per second (the platform should sustain this arrival rate under peak load)
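The P95 first-token target can be checked against your own load-test measurements. A minimal sketch using the nearest-rank percentile method (the sample latencies and helper names are illustrative, not part of this guide):

```python
import math

TARGET_P95_MS = 478  # first-token latency target from the metrics above

def p95(samples_ms):
    """95th percentile of latency samples via the nearest-rank method."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

def meets_target(samples_ms, target_ms=TARGET_P95_MS):
    """True if the measured P95 is within the target."""
    return p95(samples_ms) <= target_ms

# Example: 20 recorded first-token latencies in milliseconds
samples = [310, 290, 455, 402, 388, 365, 470, 299, 350, 412,
           378, 333, 460, 391, 420, 305, 340, 398, 445, 372]
print(p95(samples), meets_target(samples))  # → 460 True
```

Run this over latencies captured during a load test that mirrors your expected interaction rate, not synthetic single-user traffic.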

Infrastructure Components

Kubernetes Cluster

Node Configuration: 5 nodes with 8GB RAM and 4 vCPU each
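As a quick sanity check, the node configuration above implies the following aggregate cluster capacity (simple arithmetic, before subtracting the resources Kubernetes reserves on each node for system daemons):

```python
NODES = 5
RAM_GB_PER_NODE = 8
VCPU_PER_NODE = 4

total_ram_gb = NODES * RAM_GB_PER_NODE  # aggregate memory across the cluster
total_vcpu = NODES * VCPU_PER_NODE      # aggregate vCPU across the cluster
print(total_ram_gb, total_vcpu)  # → 40 20
```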

Storage Systems

Capacity: 50GB Elastic File System for shared storage

Configuration: Can be shared across environments or isolated per environment

Databases

Data Types: RBAC permissions, users, application data

Configuration:

  • 3 nodes in replica set
  • 2GB RAM and 2 vCPU per node
  • 1,000 IOPS

Disk Space: 10GB total storage requirement

Environment Separation:

  • 1 “permissions” database per environment
  • 1 “users” database per environment
  • 1 “collections” database per environment

Version: MongoDB 6, with an upgrade path to version 7

The cluster can be shared across environments with proper database separation.
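The per-environment database separation described above can be expressed as small connection helpers. A sketch assuming a hypothetical three-host replica set named `rs0` and an `environment_database` naming scheme (the hostnames, set name, and naming convention are illustrative assumptions, not prescribed by this guide):

```python
# Illustrative values: adjust to your replica set topology
REPLICA_SET_HOSTS = "mongo-0:27017,mongo-1:27017,mongo-2:27017"
REPLICA_SET_NAME = "rs0"

# One "permissions", "users", and "collections" database per environment
DATABASES = ("permissions", "users", "collections")

def database_names(environment):
    """Names of the three per-environment databases on the shared cluster."""
    return [f"{environment}_{db}" for db in DATABASES]

def connection_uri():
    """Standard MongoDB connection string targeting the replica set."""
    return f"mongodb://{REPLICA_SET_HOSTS}/?replicaSet={REPLICA_SET_NAME}"

print(database_names("staging"))
# → ['staging_permissions', 'staging_users', 'staging_collections']
```

Keeping environments on one cluster but in separate databases, as sketched here, matches the shared-cluster configuration the guide describes; fully isolated environments would instead use distinct clusters.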

Scaling Considerations

Monitoring Recommendations

We recommend monitoring the following metrics to ensure optimal performance:

System Metrics

  • CPU utilization
  • Memory usage
  • Disk I/O and latency
  • Network throughput

Application Metrics

  • Request latency
  • Error rates
  • Concurrent users
  • Queue lengths

Database Metrics

  • Query performance
  • Connection pool usage
  • Index efficiency
  • Replication lag
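One lightweight way to act on these metrics is a threshold check over collected samples. A minimal sketch (the metric names and threshold values are illustrative assumptions; tune them to your own load-test results):

```python
# Illustrative alert thresholds, not prescribed values
THRESHOLDS = {
    "cpu_utilization_pct": 80,
    "memory_usage_pct": 85,
    "request_latency_p95_ms": 478,
    "replication_lag_s": 10,
}

def breaches(samples):
    """Return the metrics whose current value exceeds its threshold."""
    return {name: value for name, value in samples.items()
            if name in THRESHOLDS and value > THRESHOLDS[name]}

current = {
    "cpu_utilization_pct": 72,
    "memory_usage_pct": 91,
    "request_latency_p95_ms": 430,
    "replication_lag_s": 2,
}
print(breaches(current))  # → {'memory_usage_pct': 91}
```

In practice these checks would run inside your monitoring stack rather than as a standalone script; the point is that each metric above should have an explicit, tested threshold.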

These sizing recommendations provide a starting point, but real-world performance may vary. Always conduct load testing with scenarios that reflect your actual usage patterns.