OpenAI-compatible chat completions. Routes the request through the gateway’s provider layer (OpenAI, Azure OpenAI, Anthropic, Vertex, Bedrock, OpenAI-compatible) based on the resolved model spec.
Streaming. When stream: true, the response is a text/event-stream
of OpenAI-compatible delta chunks (ChatCompletionChunk) terminated by
a literal data: [DONE] payload. Provider-native stream shapes
(Anthropic, Bedrock, Vertex) are normalised to OpenAI deltas before
being forwarded. Provider errors mid-stream are emitted as a synthetic
chunk with a content message followed by [DONE].
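The stream shape above can be consumed with a small accumulator. A minimal sketch, assuming already-decoded text lines and the OpenAI delta layout described here; the function name and error handling are illustrative, not part of the gateway:

```python
import json

def accumulate_stream(sse_lines):
    """Accumulate assistant text from OpenAI-style SSE lines.

    Each event line looks like 'data: {...ChatCompletionChunk...}';
    the stream ends with the literal 'data: [DONE]' sentinel.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # terminal sentinel, nothing follows
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content") is not None:
            text.append(delta["content"])
    return "".join(text)
```

Because mid-stream provider errors arrive as a synthetic content chunk followed by `[DONE]`, a consumer like this will surface the error message as part of the accumulated text rather than raising.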
When stream: false (default), the response is a single JSON
ChatCompletionResponse. The non-streaming response is enriched with
usage.cost, usage.duration_ms, and usage.carbon (Prisme.ai
extensions over the standard OpenAI shape).
Prisme.ai extensions in the request body:
task_id - opaque correlation identifier for A2A flows.
analytics_context - caller-supplied context (orgSlug, agent_id, user_id, context_id, agent_allowed_models, call_type, message_turn) used to enrich analytics.llm.completion events.
Rate limiting. 100 requests per 60 seconds per consumer (auth.user_id or session.id).
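A request body carrying these extensions can be assembled like any other OpenAI-compatible payload. A sketch only; the helper name is illustrative, and the field names come from the list above:

```python
def build_chat_request(model, messages, task_id=None,
                       analytics_context=None, **params):
    """Assemble an OpenAI-compatible body with optional Prisme.ai extensions.

    Standard OpenAI fields (temperature, stream, tools, ...) pass through
    **params; task_id and analytics_context are only attached when given.
    """
    body = {"model": model, "messages": messages, **params}
    if task_id is not None:
        body["task_id"] = task_id
    if analytics_context is not None:
        body["analytics_context"] = analytics_context
    return body
```

Omitting the extensions yields a body indistinguishable from a plain OpenAI request, so the same helper works for both enriched and vanilla calls.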
Governance. Calls may be rejected with 403 MODEL_NOT_ALLOWED or
429 quota errors based on the caller’s organization governance
(resolved via ai-governance-v2).
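Callers typically want to distinguish the two governance outcomes, since only one is retryable. A hedged sketch, assuming the error body carries an error.code field alongside the HTTP status (the exact error envelope is not specified here):

```python
def classify_gateway_error(status, body):
    """Map gateway error responses to coarse retry guidance.

    403 MODEL_NOT_ALLOWED is a governance rejection: retrying with the
    same model will fail again. 429 is a quota/rate-limit error and is
    safe to retry after backing off.
    """
    if status == 403 and body.get("error", {}).get("code") == "MODEL_NOT_ALLOWED":
        return "switch_model"
    if status == 429:
        return "retry_later"
    return "fail"
```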
User session JWT or instance API key (iak_*). Send as
Authorization: Bearer <token>.
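Both credential types use the same header shape, so a single helper covers them. A trivial sketch (the function name is illustrative):

```python
def auth_headers(token):
    """Bearer auth works for both user-session JWTs and iak_* API keys."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```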
OpenAI-compatible chat completion request. Only the fields actually accepted by the gateway are documented here.
Model id from the catalogue (e.g. gpt-4o,
eu.anthropic.claude-sonnet-4-20250514-v1:0,
vertex-gemini-2.5-flash).
Conversation history (system + user/assistant/tool turns); up to 256 messages.
Sampling temperature (provider-dependent range, typically 0–2).
Max tokens to generate.
Nucleus sampling parameter.
OpenAI-style frequency penalty.
OpenAI-style presence penalty.
One or more stop sequences (string or array of strings).
When true, the response is a text/event-stream of
ChatCompletionChunk deltas terminating with data: [DONE].
Tool/function definitions made available to the model. Forwarded to providers that support tool calling.
Tool selection hint: "auto", "none", "required", or
{ type: "function", function: { name } }.
OpenAI-style structured output hint
(e.g. { "type": "json_object" }).
Provider seed for reproducible sampling (where supported).
Prisme.ai extension. Opaque correlation id propagated to A2A (agent-to-agent) flows.
Prisme.ai extension (limit 128). Caller-supplied analytics context merged into the analytics.llm.completion event.
Successful completion. Content type depends on request.stream:
application/json: non-streaming ChatCompletionResponse.
text/event-stream: SSE stream of ChatCompletionChunk payloads terminated by data: [DONE].
Non-streaming chat completion response. Mirrors OpenAI's shape with
Prisme.ai extensions on usage (cost, duration_ms, carbon).
Generated id (chatcmpl-<correlationId>).
chat.completion
Unix timestamp (seconds).
Resolved model id used to serve the request.
Token, cost, and carbon accounting.
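Since cost, duration_ms, and carbon are extensions over the standard OpenAI usage block, defensive access keeps a consumer compatible with both shapes. A minimal sketch over a decoded response dict (the function name is illustrative):

```python
def summarize_usage(response):
    """Pull token/cost/carbon accounting out of a ChatCompletionResponse dict.

    cost, duration_ms, and carbon are Prisme.ai extensions, so .get() is
    used in case a response omits them.
    """
    usage = response.get("usage", {})
    return {
        "tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
        "duration_ms": usage.get("duration_ms"),
        "carbon": usage.get("carbon"),
    }
```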