Streaming

How to work with our Server-Sent Events (SSE)

SSE provides a lightweight, one-way stream over HTTP. The server pushes incremental tokens and events to the client, which keeps UIs responsive during long-running completions.

Key concepts

  • Transport: HTTP with Content-Type: text/event-stream, connection kept open.

  • Events: Lines prefixed by event: and data:; each event ends with a blank line.

  • Heartbeat: Periodic comments (: ping) keep the connection alive.

  • Termination: A final event (e.g., event: done) or stream close.
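To make the wire format above concrete, here is a minimal parsing sketch. It assumes the whole stream text is already in memory and handles only the event: and data: fields plus comment lines; the names ParsedSseEvent and parseSseText are illustrative and not part of our API.

```typescript
// One parsed SSE event. `event` falls back to "message" when no event: line is present.
interface ParsedSseEvent {
  event: string;
  data: string;
}

// Parse text/event-stream content into a list of events.
function parseSseText(text: string): ParsedSseEvent[] {
  const events: ParsedSseEvent[] = [];
  let eventName = "message";
  let dataLines: string[] = [];

  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // A blank line ends the event. As in the browser EventSource,
      // an event without any data: line is dropped.
      if (dataLines.length > 0) {
        events.push({ event: eventName, data: dataLines.join("\n") });
      }
      eventName = "message";
      dataLines = [];
      continue;
    }
    if (line[0] === ":") continue; // comment line, e.g. the ": ping" heartbeat

    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value[0] === " ") value = value.slice(1); // strip one leading space

    if (field === "event") eventName = value;
    else if (field === "data") dataLines.push(value);
    // Other fields (id, retry) are ignored in this sketch.
  }
  return events;
}
```

Heartbeat comments never reach the caller, so application code only sees real events.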

Typical flow

  1. Client sends a completion request indicating streaming mode.

  2. Server responds with text/event-stream and starts sending tokens/events.

  3. Client renders tokens incrementally and listens for the terminal event.
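Incremental rendering in step 3 has one practical wrinkle: network chunks do not have to line up with event boundaries. The sketch below buffers raw text until a blank line completes an event block. It assumes LF or CRLF line endings, and the class name SseBlockBuffer is hypothetical.

```typescript
// Buffers raw text chunks and releases only complete event blocks.
// A block is complete once the blank line that terminates it has arrived.
class SseBlockBuffer {
  private pending = "";

  // Append a chunk and return every block that is now complete.
  push(chunk: string): string[] {
    // Normalize CRLF; a trailing lone "\r" stays buffered until its "\n" arrives.
    this.pending = (this.pending + chunk).replace(/\r\n/g, "\n");
    const blocks = this.pending.split("\n\n");
    // The last piece is still incomplete (or empty), so keep it for the next chunk.
    this.pending = blocks.pop() ?? "";
    return blocks
      .map((block) => block.replace(/^\n+/, ""))
      .filter((block) => block.length > 0);
  }
}
```

Each returned block can then be handed to whatever parser the client uses, so rendering starts as soon as an event is complete rather than when the stream closes.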

Full stream of SSE Example

[Figure: SSE example]

How our SSE works

[Figure: Overview of our streaming SSE]

Our streaming communication flow consists of four main components:

  • Client

  • Streaming Service

  • Completion Service

  • AI Provider

The sequence can be broken down into several key phases:

Initial Connection Setup

  • The Client initiates a connection to the Streaming Service using the endpoint /message-stream/{convo_id}/{session_id}.

Subscribe Message Stream

GET /message-stream/{convo_id}/{session_id}

Path parameters

  • convo_id (string, required)

  • session_id (string, required)

Responses

  • 200: Successful Response (application/json; response type: any)

  • Once the connection is established, the Streaming Service responds with a connected event.
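As a minimal sketch of the connection step, the helper below fills in the path template from the text above. The base URL and the function name buildMessageStreamUrl are illustrative assumptions; only the path comes from this page.

```typescript
// Build the subscription URL for the message stream.
// `baseUrl` is a hypothetical deployment URL supplied by the caller.
function buildMessageStreamUrl(baseUrl: string, convoId: string, sessionId: string): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  // Encode both IDs so that reserved characters cannot alter the path.
  return `${root}/message-stream/${encodeURIComponent(convoId)}/${encodeURIComponent(sessionId)}`;
}
```

In a browser, the resulting URL could be passed to the standard EventSource constructor, with a listener registered for the connected event. Whether this endpoint needs additional headers is not stated here, so treat that part as unverified.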

Request Initiation

  • The Client sends a request to the Completion Service using the endpoint /cortex/completions/user-input.

Add User Messages

POST /cortex/completions/user-input

Authorizations

  • Authorization (string, required): Bearer authentication header of the form Bearer <token>.

Body

  • content (string or null, optional)

  • messageType (string enum or null, optional)

  • messageId (string or null, optional)

  • selectedStrategy (string or null, optional)

  • selectedAction (string or null, optional)

  • agentTaskId (string or null, optional)

  • targetText (string or null, optional)

  • messageIds (string[] or null, optional; default: [])

  • targetFullContent (string or null, optional)

  • model (one of three string enums, or null; optional)

  • actionType (string enum, optional; default: user)

  • convoId (string, required)

  • sessionId (string, required)

  • files (string[], optional; default: [])

  • enableStreaming (boolean or null, optional; default: true)

Responses

  • 200: Successful Response (application/json; response type: any)

  • The Completion Service forwards this request to the AI Provider with streaming enabled. The actual endpoint varies by AI Provider (OpenAI, Azure, Anthropic, Google, ...).
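The request itself might be assembled roughly as follows. This sketch covers only a subset of the body fields from the reference above; the JSON Content-Type header, the PreparedRequest shape, and the function name are assumptions made for illustration.

```typescript
// Subset of the documented body fields; the remaining optional fields are omitted.
interface UserInputBody {
  convoId: string; // required
  sessionId: string; // required
  content?: string | null;
  enableStreaming?: boolean | null; // documented default: true
}

// Transport-agnostic description of the HTTP request (hypothetical shape).
interface PreparedRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

function prepareUserInputRequest(
  token: string,
  body: UserInputBody,
  path = "/cortex/completions/user-input", // path as shown in the reference block
): PreparedRequest {
  return {
    method: "POST",
    path,
    headers: {
      Authorization: `Bearer ${token}`, // documented Bearer scheme
      "Content-Type": "application/json", // assumption: the body is sent as JSON
    },
    // Ask for streaming explicitly unless the caller set the flag.
    body: JSON.stringify({ enableStreaming: true, ...body }),
  };
}
```

Keeping the request as plain data leaves the choice of HTTP client open. The convoId and sessionId here should match the ones used to open the message stream, since they are the only link between the two calls described on this page.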

Streaming Process

  • Streaming begins with a message_start event. It propagates from the Completion Service to the Streaming Service via the streaming bus, and the Streaming Service forwards it to the Client, indicating that the service has started generating a response.

  • The AI Provider streams generated tokens incrementally to the Completion Service.

  • Each token is forwarded in a new_token event through the chain: AI Provider → Completion Service → Streaming Service → Client.

  • This step repeats until text generation is complete.
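On the client side, the streaming phase can be modeled as a small reducer over incoming events. The sketch below is an assumption-heavy illustration: the payload of new_token is not specified in this section, so readTokenField guesses a JSON object with a token string field, and that guess should be replaced with the real payload shape.

```typescript
// Client-side view of the reply being streamed.
interface ReplyState {
  text: string;
  started: boolean;
  complete: boolean;
}

const emptyReply: ReplyState = { text: "", started: false, complete: false };

// HYPOTHETICAL payload reader: assumes new_token data is JSON like {"token": "..."}.
function readTokenField(data: string): string {
  const parsed: unknown = JSON.parse(data);
  if (typeof parsed === "object" && parsed !== null && "token" in parsed) {
    const token = (parsed as { token: unknown }).token;
    return typeof token === "string" ? token : "";
  }
  return "";
}

// Fold one event into the reply state; events outside this phase leave it unchanged.
function applyReplyEvent(
  state: ReplyState,
  eventName: string,
  data: string,
  readToken: (data: string) => string = readTokenField,
): ReplyState {
  switch (eventName) {
    case "message_start":
      return { text: "", started: true, complete: false };
    case "new_token":
      return { ...state, text: state.text + readToken(data) };
    case "message_end":
      return { ...state, complete: true };
    default:
      return state;
  }
}
```

Passing the token reader as a parameter keeps the reducer independent of the payload format, so only one small function changes once the actual shape is known.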

Completion and Persistence

  • The Completion Service signals message_end when text generation is complete.

  • The response message is persisted internally by the Completion Service.

  • A message_ready event is sent to indicate the full response message is now available for further processing (copy, delete, generate text-to-speech, ...).

  • The Completion Service returns a 200 OK response to the original client request.

Connection Termination

  • The sequence ends with the Streaming Service closing the connection to the Client.
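The phases above imply an ordering of events that a client can check defensively. Here is a minimal sketch of such a check, assuming exactly the order described in this section (connected, message_start, any number of new_token, message_end, message_ready); the phase names and the nextPhase function are hypothetical.

```typescript
type StreamPhase = "connecting" | "connected" | "generating" | "ended" | "ready";

// Allowed transitions keyed by event name, following the sequence described above.
const phaseRules = new Map<string, { from: StreamPhase; to: StreamPhase }>([
  ["connected", { from: "connecting", to: "connected" }],
  ["message_start", { from: "connected", to: "generating" }],
  ["new_token", { from: "generating", to: "generating" }],
  ["message_end", { from: "generating", to: "ended" }],
  ["message_ready", { from: "ended", to: "ready" }],
]);

// Return the phase after `eventName`, or throw if the event arrives out of order.
function nextPhase(current: StreamPhase, eventName: string): StreamPhase {
  const rule = phaseRules.get(eventName);
  // Events outside the core sequence (for example user_message) do not change the phase.
  if (rule === undefined) return current;
  if (rule.from !== current) {
    throw new Error(`Unexpected ${eventName} while ${current}`);
  }
  return rule.to;
}
```

A stream that closes before the ready phase is reached can then be treated as interrupted rather than complete.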

List of SSE Events

Besides the main SSE events described above, other events provide extra information about the completion request. Details are below:
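For client code, the six event names documented in this section can be collected into one type so that unknown events are filtered early. This is only a sketch: the constant, type, and function names are illustrative, and payload types are deliberately left out.

```typescript
// Event names documented in this section.
const STREAM_EVENT_NAMES = [
  "connected",
  "user_message",
  "message_start",
  "new_token",
  "message_end",
  "message_ready",
] as const;

// Union of the literal names above.
type StreamEventName = (typeof STREAM_EVENT_NAMES)[number];

// Type guard that narrows an arbitrary string to a documented event name.
function isStreamEventName(value: string): value is StreamEventName {
  return (STREAM_EVENT_NAMES as readonly string[]).includes(value);
}
```

A dispatcher can call isStreamEventName before switching on the name, which lets the compiler flag any documented event that a handler forgot to cover.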

Event: connected

Event: connected - The stream is connected

TypeScript Interface

Example Data:

Event: user_message

Event: user_message - The input message is successfully saved and can be used for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

Example Data:

Event: message_start

Event: message_start - The reply message is about to be sent

TypeScript Interface

Example Data:

Event: new_token

Event: new_token - A new token is sent

TypeScript Interface

Example Data:

Event: message_end

Event: message_end - The reply message is complete

TypeScript Interface

Example Data:

Event: message_ready

Event: message_ready - The reply message is successfully saved and can be used for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

Example Data:
