Streaming

How to work with our Server-Sent Events (SSE)

SSE provides a lightweight, one-way stream over HTTP: the server pushes incremental tokens and events to the client, which keeps UIs responsive during long-running completions.

Key concepts

Transport: HTTP with Content-Type: text/event-stream, connection kept open.
Events: Lines prefixed by event: and data:; each event ends with a blank line.
Heartbeat: Periodic comments (: ping) keep the connection alive.
Termination: A final event (e.g., event: done) or stream close.
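The framing rules above can be illustrated with a small incremental parser. This is a minimal sketch of generic text/event-stream parsing, not the service's own client code; the names SseEvent and SseParser are hypothetical, and only the event: and data: fields are handled.

```typescript
// One parsed SSE event: the "event:" name and the joined "data:" payload.
interface SseEvent {
  event: string;
  data: string;
}

// Incremental parser: feed it raw text chunks, get back completed events.
// A chunk may end mid-event, so unfinished text is kept in a buffer.
class SseParser {
  private buffer = "";

  feed(chunk: string): SseEvent[] {
    // Normalize CRLF / CR line endings so splitting on "\n\n" is enough.
    // Simplification: a CR/LF pair split across two chunks is not handled.
    this.buffer += chunk.replace(/\r\n?/g, "\n");
    const blocks = this.buffer.split("\n\n");
    // The last element is an incomplete block (possibly empty); keep it.
    this.buffer = blocks.pop() ?? "";

    const events: SseEvent[] = [];
    for (const block of blocks) {
      let event = "message"; // default event name in the SSE format
      const dataLines: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith(":")) continue; // comment, e.g. ": ping" heartbeat
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
        if (field === "event") event = value;
        else if (field === "data") dataLines.push(value);
      }
      // Blocks without data lines (heartbeats only) produce no event.
      if (dataLines.length > 0) {
        events.push({ event, data: dataLines.join("\n") });
      }
    }
    return events;
  }
}
```

Feeding the parser every chunk read from the response body yields complete events only, so heartbeat comments and partially received events never reach the rendering code.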

Typical flow

Client sends a completion request indicating streaming mode.
Server responds with text/event-stream and starts sending tokens/events.
Client renders tokens incrementally and listens for the terminal event.

Full stream of SSE Example

How our SSE works

Our streaming communication flow consists of four main components:

Client
Streaming Service
Completion Service
AI Provider

The sequence can be broken down into several key phases:

Initial Connection Setup

The Client opens a connection to the Streaming Service at the endpoint /message-stream/{convo_id}/{session_id}.

Subscribe Message Stream

GET /message-stream/{convo_id}/{session_id}

Path parameters

convo_id (string, required)
session_id (string, required)

Responses

200: Successful Response (application/json, response type: any)
422: Validation Error (application/json)

GET /message-stream/{convo_id}/{session_id} HTTP/1.1
Accept: */*


Once the connection is established, the Streaming Service responds with a connected event.
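As a rough illustration of this step, the helpers below build the stream URL and recognize the connected handshake. Both function names and the baseUrl parameter are hypothetical; the path and the shape of the connected payload follow this page.

```typescript
// Builds the stream URL for a conversation/session pair.
// `baseUrl` is a placeholder; substitute your deployment's origin.
function buildMessageStreamUrl(
  baseUrl: string,
  convoId: string,
  sessionId: string
): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const convo = encodeURIComponent(convoId);
  const session = encodeURIComponent(sessionId);
  return `${base}/message-stream/${convo}/${session}`;
}

// Returns true when a parsed event is the initial "connected" handshake,
// i.e. its name is "connected" and its data is JSON with a string `message`.
function isConnectedEvent(event: { event: string; data: string }): boolean {
  if (event.event !== "connected") return false;
  try {
    const payload: unknown = JSON.parse(event.data);
    return (
      typeof payload === "object" &&
      payload !== null &&
      typeof (payload as { message?: unknown }).message === "string"
    );
  } catch {
    return false; // data was not valid JSON
  }
}
```

In a browser, the resulting URL could be passed to new EventSource(url), with the handshake observed through addEventListener("connected", ...). Waiting for that event before sending the completion request is one way to avoid missing early events.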

Request Initiation

The Client sends a request to the Completion Service at the endpoint /cortex/completions/user-input.

Add User Messages

POST /cortex/completions/user-input

Authorizations

Authorization (string, required): Bearer authentication header of the form Bearer <token>.

Header parameters

x-zitadel-org-id (string, optional): Custom Organization ID header, used for Organization Switch.

Body

content (string or null, optional)
messageType (string enum or null, optional)
messageId (string or null, optional)
selectedStrategy (string or null, optional)
selectedAction (string or null, optional)
agentTaskId (string or null, optional)
targetText (string or null, optional)
messageIds (string[] or null, optional, default: [])
targetFullContent (string or null, optional)
model (string enum or null, optional)
actionType (string enum, optional, default: user)
convoId (string, required)
sessionId (string, required)
files (string[], optional, default: [])
enableStreaming (boolean or null, optional, default: true)

Responses

200: Successful Response (application/json, response type: any)
422: Validation Error (application/json)

POST /cortex/completions/user-input HTTP/1.1
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 323

{
  "content": "text",
  "messageType": "user-idea",
  "messageId": "text",
  "selectedStrategy": "text",
  "selectedAction": "text",
  "agentTaskId": "text",
  "targetText": "text",
  "messageIds": [
    "text"
  ],
  "targetFullContent": "text",
  "model": "gpt-3.5-turbo",
  "actionType": "user",
  "convoId": "text",
  "sessionId": "text",
  "files": [
    "text"
  ],
  "enableStreaming": true
}


The Completion Service forwards this request to the AI Provider with streaming enabled. The actual endpoint varies by AI Provider (OpenAI, Azure, Anthropic, Google, ...).
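On the client side, the request described in this section could be prepared along these lines. This is a sketch only: UserInputBody covers a subset of the documented body fields, and prepareUserInput and PreparedRequest are hypothetical names. The path is taken from the request example above.

```typescript
// Subset of the documented request body; only convoId and sessionId are required.
interface UserInputBody {
  convoId: string;
  sessionId: string;
  content?: string | null;
  messageType?: string | null;
  model?: string | null;
  files?: string[];
  enableStreaming?: boolean | null;
}

interface PreparedRequest {
  path: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Prepares (but does not send) the completion request.
// `orgId` maps to the optional x-zitadel-org-id header.
function prepareUserInput(
  token: string,
  body: UserInputBody,
  orgId?: string
): PreparedRequest {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
  if (orgId !== undefined) headers["x-zitadel-org-id"] = orgId;
  return {
    path: "/cortex/completions/user-input",
    method: "POST",
    headers,
    // Send the documented default explicitly when the caller omits the field.
    body: JSON.stringify({ enableStreaming: true, ...body }),
  };
}
```

The prepared request can then be handed to any HTTP client, for example fetch(baseUrl + req.path, { method: req.method, headers: req.headers, body: req.body }), where baseUrl is your deployment's origin.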

Streaming Process

Streaming begins with a message_start event. The Completion Service publishes it to the Streaming Service over the streaming bus, and the Streaming Service forwards it to the Client to signal that response generation has started.
The AI Provider streams generated tokens incrementally to the Completion Service.
Each token is forwarded as a new_token event through the chain: AI Provider → Completion Service → Streaming Service → Client.
This token streaming repeats until text generation is complete.
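A client typically turns this token stream into displayable text by appending each token to the reply identified by gid. The sketch below assumes the event shapes listed later on this page; ReplyBuffer is a hypothetical helper, not part of any SDK.

```typescript
interface MessageStartEvent {
  role: string;
  gid: string;
}

interface NewTokenEvent {
  role: string;
  token: string;
  gid: string;
}

// Accumulates streamed tokens per reply message, keyed by gid.
class ReplyBuffer {
  private replies = new Map<string, string>();

  // message_start: register an empty reply so the UI can show a placeholder.
  start(e: MessageStartEvent): void {
    if (!this.replies.has(e.gid)) this.replies.set(e.gid, "");
  }

  // new_token: append the token to its reply and return the text so far.
  append(e: NewTokenEvent): string {
    const text = (this.replies.get(e.gid) ?? "") + e.token;
    this.replies.set(e.gid, text);
    return text;
  }

  // Current text of a reply, or undefined if the gid has not been seen.
  text(gid: string): string | undefined {
    return this.replies.get(gid);
  }
}
```

Keying by gid keeps replies separate if events for more than one message ever arrive on the same stream.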

Completion and Persistence

The Completion Service signals message_end when text generation is complete.
The response message is persisted internally by the Completion Service.
A message_ready event is sent to indicate the full response message is now available for further processing (copy, delete, generate text-to-speech, ...).
The Completion Service returns a 200 OK response to the original client request.
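The event order described above (message_start, new_token repeated, message_end, message_ready) can be tracked with a small state machine. This is an illustrative sketch: the phase names and the choice to throw on out-of-order events are assumptions, and a production client might prefer to log and recover instead.

```typescript
type ReplyPhase = "idle" | "streaming" | "ended" | "ready";

// Allowed transitions per event name, following the order described above.
const transitions = new Map<string, { from: ReplyPhase[]; to: ReplyPhase }>([
  ["message_start", { from: ["idle"], to: "streaming" }],
  ["new_token", { from: ["streaming"], to: "streaming" }],
  ["message_end", { from: ["streaming"], to: "ended" }],
  ["message_ready", { from: ["ended"], to: "ready" }],
]);

// Returns the next phase, or throws if the event arrives out of order.
// Events not in the table (e.g. connected, user_message) leave the phase unchanged.
function nextPhase(phase: ReplyPhase, eventName: string): ReplyPhase {
  const rule = transitions.get(eventName);
  if (rule === undefined) return phase;
  if (!rule.from.includes(phase)) {
    throw new Error(`Unexpected ${eventName} while ${phase}`);
  }
  return rule.to;
}

// Message actions (copy, delete, text-to-speech) are enabled only after
// message_ready, when the reply has been persisted.
function actionsEnabled(phase: ReplyPhase): boolean {
  return phase === "ready";
}
```

Gating message actions on the ready phase rather than on message_end reflects the distinction above: message_end means generation finished, while message_ready means the message has been saved.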

Connection Termination

The sequence ends with the Streaming Service closing the connection to the Client.

List of SSE Events

Besides the main SSE events, there are others that provide extra information about the completion request. All events are detailed below:

Event: connected

Event: connected - The stream is connected

TypeScript Interface

interface SSEData {
  message: string;
}

Example Data:

{
  "message": "Connection established"
}

Event: user_message

Event: user_message - The input message has been saved and can be used for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

interface UserMessage {
  _id: string;
  createdAt: string;
  modifiedAt: string;
  botId: string;
  userId: string;
  content: string;
  role: string;
  model: string;
  messageType: string;
  status: string;
}

Example Data:

{
  "_id": "678097bc45b174911deeeb44",
  "createdAt": "2025-01-10T03:45:00.534383",
  "modifiedAt": "2025-01-10T03:45:00.534384",
  "botId": "6628fddb8c9b707741f7b551",
  "userId": "255317220014445068",
  "content": "hello",
  "role": "user",
  "model": "azure-gpt-4o",
  "messageType": "user-question",
  "status": "activate"
}

Event: message_start

Event: message_start - The reply message is about to be sent

TypeScript Interface

interface MessageStartEvent {
  role: string; // The role of the reply message, e.g. assistant
  gid: string; // The id of the reply message
}

Example Data:

{
  "role": "assistant",
  "gid": "678097bc45b174911deeeb45"
}

Event: new_token

Event: new_token - A new token is sent

TypeScript Interface

interface NewTokenEvent {
  role: string;
  token: string; // The new token to append
  gid: string; // The gid of the reply message
}

Example Data:

{
  "role": "assistant",
  "token": "Hello",
  "gid": "678097bc45b174911deeeb45"
}

Event: message_end

Event: message_end - The reply message is complete

TypeScript Interface

interface MessageEndEvent {
  role: string;
  gid: string;
}

Example Data:

{
  "role": "assistant",
  "gid": "678097bc45b174911deeeb45"
}

Event: message_ready

Event: message_ready - The reply message has been saved and can be used for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

interface MessageReadyEvent {
  messageIds: string[]; // List of new messages that are successfully saved
}

Example Data:

{
  "messageIds": ["678097bc45b174911deeeb45"]
}
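The interfaces above can be combined into a discriminated union so that handlers are type-checked per event name. The following is a sketch under assumptions: StreamEvent and decodeEvent are hypothetical names, user_message is left out only to keep the example short, and the payload is cast without field-level validation.

```typescript
interface SSEData { message: string; }
interface MessageStartEvent { role: string; gid: string; }
interface NewTokenEvent { role: string; token: string; gid: string; }
interface MessageEndEvent { role: string; gid: string; }
interface MessageReadyEvent { messageIds: string[]; }

// Discriminated union over the event names listed above.
type StreamEvent =
  | { type: "connected"; data: SSEData }
  | { type: "message_start"; data: MessageStartEvent }
  | { type: "new_token"; data: NewTokenEvent }
  | { type: "message_end"; data: MessageEndEvent }
  | { type: "message_ready"; data: MessageReadyEvent };

// Decodes an event name plus raw JSON data into a typed event.
// Returns null for unknown names, malformed JSON, or non-object payloads.
function decodeEvent(name: string, rawData: string): StreamEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(rawData);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  switch (name) {
    case "connected":
      return { type: "connected", data: data as SSEData };
    case "message_start":
      return { type: "message_start", data: data as MessageStartEvent };
    case "new_token":
      return { type: "new_token", data: data as NewTokenEvent };
    case "message_end":
      return { type: "message_end", data: data as MessageEndEvent };
    case "message_ready":
      return { type: "message_ready", data: data as MessageReadyEvent };
    default:
      return null;
  }
}
```

Callers can then switch on the type field and let the compiler narrow data to the matching interface.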
