Prisme.ai implements rate limiting to ensure platform stability and fair usage across all users. This page explains the rate limits in place, how to monitor your usage, and best practices for working within these limits.

Rate Limit Overview

Most API endpoints have rate limits applied based on:

  • User or API Key: Limits are tracked per authenticated user or API key
  • Endpoint category: Different endpoint categories have different limits
  • Workspace: Some limits are applied per workspace

The default limits by category are:

  • Standard APIs: 100 requests per minute for most endpoints
  • Write operations (create/update): 60 requests per minute
  • Search/list operations: 30 requests per minute

These are general guidelines. Specific endpoints may have custom rate limits based on their resource intensity.

Understanding Runtime Rate Limits

Rate Limit Headers

When making API requests, rate limit information is returned in the response headers:

  • X-RateLimit-Limit: the maximum number of requests allowed in the current time window
  • X-RateLimit-Remaining: the number of requests remaining in the current time window
  • X-RateLimit-Reset: the time when the current rate limit window resets, in Unix epoch seconds
  • Retry-After: present only when rate limited; indicates how many seconds to wait before retrying
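
For example, a rate-limited response might carry headers like these (the values are illustrative):

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717430460
Retry-After: 30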

Rate Limit Response

When you exceed a rate limit, the API returns a 429 “Too Many Requests” response with details about the limit:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please retry after 30 seconds.",
    "details": {
      "limit": 100,
      "period": "60s",
      "retryAfter": 30
    },
    "requestId": "req-1234567890abcdef"
  }
}

For Runtime automations, a payload.throttled field in the runtime.automations.executed event indicates the throttling duration.

Configuration Options

Rate limits can be configured globally using environment variables:

Environment Variable            Description                      Default Value
RATE_LIMIT_AUTOMATIONS          Automations per second           100
RATE_LIMIT_EMITS                Event emits per second           30
RATE_LIMIT_FETCHS               HTTP fetches per second          50
RATE_LIMIT_REPEATS              Repeat iterations per second     1000
RATE_LIMIT_AUTOMATIONS_BURST    Automations burst limit          400
RATE_LIMIT_EMITS_BURST          Event emits burst limit          100
RATE_LIMIT_FETCHS_BURST         HTTP fetches burst limit         200
RATE_LIMIT_REPEATS_BURST        Repeat iterations burst limit    4000
RATE_LIMIT_DISABLED             Disable all rate limits          false

Setting any of these environment variables to 0 disables the corresponding rate limit for all workspaces.
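
As a sketch, in a self-hosted deployment these variables would be set on the Runtime service. The docker-compose excerpt below is hypothetical (the service name and values are placeholders) and only illustrates the override mechanism:

# Hypothetical docker-compose excerpt for the Runtime service
services:
  runtime:
    environment:
      RATE_LIMIT_AUTOMATIONS: "200"        # raise automations to 200/s
      RATE_LIMIT_AUTOMATIONS_BURST: "800"  # allow bursts up to 800
      RATE_LIMIT_EMITS: "0"                # 0 disables the emit limit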

Best Practices

1. Monitor Your Usage

Track your API usage and rate limit headers to understand your consumption patterns:

function checkRateLimits(response) {
  // Header values are strings (or null when absent), so parse them first
  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);

  if (Number.isNaN(limit) || Number.isNaN(remaining)) return;

  const resetsAt = Number.isNaN(reset) ? 'unknown' : new Date(reset * 1000).toLocaleTimeString();
  console.log(`Rate limits: ${remaining}/${limit} remaining, reset at ${resetsAt}`);

  // Warn when fewer than 10% of the window's requests remain
  if (remaining < limit * 0.1) {
    console.warn('Approaching rate limit!');
  }
}

2. Implement Backoff and Retry

When rate limited, implement exponential backoff with jitter:

async function apiCallWithRetry(url, options, maxRetries = 5) {
  let retries = 0;
  
  while (retries < maxRetries) {
    try {
      const response = await fetch(url, options);
      checkRateLimits(response);
      
      if (response.status === 429) {
        // Get retry after header or default to exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        let delay;
        
        if (retryAfter) {
          delay = parseInt(retryAfter, 10) * 1000;
        } else {
          // Exponential backoff with jitter
          delay = Math.pow(2, retries) * 1000 + Math.random() * 1000;
        }
        
        console.log(`Rate limited. Retrying after ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
        retries++;
        continue;
      }
      
      return response;
    } catch (error) {
      retries++;
      if (retries >= maxRetries) throw error;
      
      // Exponential backoff for network errors
      const delay = Math.pow(2, retries) * 1000 + Math.random() * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }

  // Retries exhausted (persistent 429s or repeated network errors)
  throw new Error(`Request failed after ${maxRetries} retries`);
}

3. Optimize Automation Distribution

Design automations to distribute work effectively:

  1. Use events to distribute processing across multiple Runtime instances (see the event-based sketch after the batching example below)
  2. Batch operations where possible instead of making multiple single calls
  3. Implement queuing for high-volume operations
  4. Use parallel processing for independent operations
# Example of batched processing
do:
  - repeat:
      on: '{{workQueue}}'
      batch:
        size: 3        # Process 3 items at once
        interval: 500  # Pause 500ms between batches
      do:
        - process:
            item: '{{item}}'

# Rather than:
do:
  - repeat:
      on: '{{workQueue}}'
      do:
        - process:
            item: '{{item}}'
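
To illustrate the first strategy, work items can be fanned out as events and handled by a separate automation that the platform can scale across Runtime instances. This is only a sketch; the event name and the processing automation are placeholders:

# Sketch: emit one event per item instead of processing inline
do:
  - repeat:
      on: '{{workQueue}}'
      do:
        - emit:
            event: work.item.queued
            payload:
              item: '{{item}}'

# A second automation, triggered when work.item.queued is emitted,
# processes each item and can run on any Runtime instance

Keep in mind that emits are themselves rate limited (RATE_LIMIT_EMITS), so fan-out spreads load rather than removing it.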

4. Cache Responses

Implement caching for frequently accessed data:

// Simple in-memory cache
const cache = new Map();

async function fetchWithCache(url, options, ttlMs = 60000) {
  const cacheKey = `${url}:${JSON.stringify(options)}`;
  
  if (cache.has(cacheKey)) {
    const { data, expiry } = cache.get(cacheKey);
    if (expiry > Date.now()) {
      return data;
    }
    cache.delete(cacheKey);
  }
  
  const response = await fetch(url, options);
  const data = await response.json();
  
  cache.set(cacheKey, {
    data,
    expiry: Date.now() + ttlMs
  });
  
  return data;
}
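
With that helper in place, callers opt into caching per request; the endpoint below is a placeholder, and the TTL simply trades freshness for fewer API calls:

// Cached for 5 minutes; repeat calls within the TTL never hit the API
const data = await fetchWithCache('/v2/some-endpoint', { method: 'GET' }, 5 * 60 * 1000);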

Monitoring Throttling

You can monitor throttling in Runtime automations through execution events. Each automation execution generates a runtime.automations.executed event that includes throttling information:

{
  "event": "runtime.automations.executed",
  "payload": {
    "automation": "my-automation",
    "workspace": "my-workspace",
    "duration": 1250, // Total duration in milliseconds
    "throttled": 1000, // Time spent being throttled in milliseconds
    "status": "success"
  }
}

If throttled is greater than zero, the automation was rate limited.
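
However you consume these events (socket, webhook, or log export, depending on your setup), a small handler can surface throttled runs. A minimal sketch, assuming the event shape shown above:

// Flag runs of runtime.automations.executed that spent time throttled
function reportThrottling(event) {
  const { automation, duration, throttled = 0 } = event.payload;

  if (throttled > 0) {
    const pct = ((throttled / duration) * 100).toFixed(1);
    console.warn(`${automation}: ${throttled}ms throttled (${pct}% of ${duration}ms run)`);
  }
}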

Common Rate Limit Scenarios

High-Volume Data Processing

When processing large datasets, use batching and distributed processing:

  • Split large datasets into manageable chunks (sketched below)
  • Process chunks in parallel using events
  • Implement checkpointing to resume interrupted processing
  • Consider scheduled automations for very large datasets
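
A minimal sketch of the chunking step; the chunk size of 100 is arbitrary and should be tuned to your limits:

// Split a large dataset into fixed-size chunks for separate runs
function chunkDataset(items, size = 100) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

Each chunk can then be dispatched as its own event or scheduled run, keeping any single execution well inside the per-second limits.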

User-Generated Events

For systems handling many user-triggered events:

  • Implement client-side throttling for UI interactions
  • Queue events server-side for processing
  • Consider debouncing or deduplicating similar events (a debounce sketch follows this list)
  • Prioritize critical user actions in your processing queue
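
A minimal client-side debounce sketch; searchApi and the 300 ms quiet period are placeholders:

// Collapse bursts of calls into one, fired after a quiet period
function debounce(fn, waitMs = 300) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// e.g. one search request per pause in typing, not one per keystroke
const debouncedSearch = debounce(query => searchApi(query), 300);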

Integration Synchronization

When synchronizing with external systems:

  • Use webhooks where possible instead of polling
  • Implement incremental synchronization, i.e. only changed data (see the sketch after this list)
  • Schedule large synchronization jobs during off-peak hours
  • Prioritize critical data for real-time sync
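
A minimal incremental-sync sketch: persist a watermark and ask the external system only for records modified since then. fetchChangesSince is a hypothetical helper, not a documented API:

// Incremental sync: pull only records changed since the last run
let lastSyncedAt = new Date(0).toISOString(); // persist this durably in practice

async function syncChanges(fetchChangesSince) {
  const changes = await fetchChangesSince(lastSyncedAt); // hypothetical fetcher
  // ... apply changes to the local store ...
  lastSyncedAt = new Date().toISOString();
  return changes.length;
}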

Scheduled Reports

For generating scheduled reports or analytics:

  • Pre-compute and cache common metrics
  • Generate reports during off-peak hours
  • Split large reports into smaller segments
  • Implement progressive loading for user interfaces

Next Steps