Transport Contract

The platform exposes two real-time transports for browser clients. Both deliver the same event envelope; the only difference is the connection mechanism.

[Diagram: client transport options. The browser connects to the platform ingress either via SSE (EventSource, GET /sse/stream; one-way stream plus POST send) or via Socket.IO (io(), bidirectional events). Ingress fronts the chat relay and Redis Pub/Sub.]
Transport | Use case | Peer identifier
SSE | Standard browser apps. One-way server-to-client stream; chat requests sent via POST /sse/chat. | Client-provided stream_id query param
Socket.IO | Apps that need bidirectional messaging or already use Socket.IO. | Connection sid (assigned by server)

SSE Endpoints

Endpoint | Method | Purpose
/sse/stream | GET | Open the long-lived event stream. Requires stream_id query param.
/sse/chat | POST | Send a chat message. Returns a synchronous acknowledgement (processing_started, followup_accepted, or steer_accepted).
/sse/conv_status.get | POST | Request the current conversation status.

SSE Stream Query Parameters

Param | Required | Purpose
stream_id | Yes | Unique peer identifier for this connection
user_session_id | No | Reuse an existing authenticated session
bearer_token | No | Access token fallback when headers are unavailable
id_token | No | ID token fallback when headers are unavailable
tenant | No | Override tenant for the stream
project | No | Override project for the stream
When to choose SSE: Use SSE for standard browser apps. It works through all CDNs and proxies, uses native EventSource, and requires no extra library. Choose Socket.IO only when you need bidirectional event delivery from client to server beyond the POST /sse/chat send path.
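A minimal sketch of building the SSE stream URL from the query parameters above. The function name and base URL are illustrative; only the /sse/stream path and parameter names come from this contract:

```javascript
// Build the /sse/stream URL. Only stream_id is required; the rest are
// optional auth fallbacks and tenant/project overrides.
function buildSseStreamUrl(baseUrl, { streamId, userSessionId, bearerToken, idToken, tenant, project }) {
  const url = new URL("/sse/stream", baseUrl);
  url.searchParams.set("stream_id", streamId);
  if (userSessionId) url.searchParams.set("user_session_id", userSessionId);
  if (bearerToken) url.searchParams.set("bearer_token", bearerToken);
  if (idToken) url.searchParams.set("id_token", idToken);
  if (tenant) url.searchParams.set("tenant", tenant);
  if (project) url.searchParams.set("project", project);
  return url.toString();
}

// In the browser:
// const es = new EventSource(buildSseStreamUrl("https://platform.example.com",
//   { streamId: crypto.randomUUID() }));
```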

SSE Event Catalog

Each SSE frame carries an event name (transport route) and a JSON data payload with a semantic type. The event name tells the browser which listener fires; the payload type tells your code what the event means.

SSE Event Names

SSE event | Payload type | Purpose
ready | (none) | Stream is open and authenticated. Payload includes session_id, user_type, stream_id.
chat_start | chat.start | Turn accepted and processing started.
chat_step | chat.step or custom | Structured step update (progress, tool results, decisions).
chat_delta | chat.delta | Streaming text chunks (answer, thinking, artifacts).
chat_complete | chat.complete | Turn completed. Contains data.final_answer and optional data.followups.
chat_error | chat.error | Turn failed. Contains data.error, optional data.error_type.
chat_service | chat.service, gateway.*, rate_limit.* | Service-level events: rate limits, gateway rejections, queue status.
conv_status | conv.status | Conversation state snapshot (idle, in_progress, error).
server_shutdown | (none) | Server is draining. Reconnect after a short delay.
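Because EventSource only fires named listeners that are explicitly registered, each event name above needs its own handler. A sketch that fans every named event into one dispatch function (the dispatch shape is illustrative, not part of the contract):

```javascript
// The SSE transport routes from the catalog above.
const SSE_EVENTS = ["ready", "chat_start", "chat_step", "chat_delta",
                    "chat_complete", "chat_error", "chat_service",
                    "conv_status", "server_shutdown"];

// `source` is an EventSource (or anything with addEventListener).
function attachSseListeners(source, onEvent) {
  for (const name of SSE_EVENTS) {
    source.addEventListener(name, (e) => {
      // Some frames (e.g. server_shutdown) may carry no JSON body; guard the parse.
      let payload = null;
      try { payload = JSON.parse(e.data); } catch { /* non-JSON frame */ }
      onEvent(name, payload);
    });
  }
}
```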

Common Envelope Shape

All chat events share this JSON structure:

{
  "type": "chat.step",
  "timestamp": "2026-02-26T21:14:05.267Z",
  "ts": 1700000000000,
  "service": {
    "request_id": "...", "tenant": "...",
    "user": "...", "user_type": "registered"
  },
  "conversation": {
    "session_id": "...", "conversation_id": "...",
    "turn_id": "..."
  },
  "event": {
    "agent": "...", "step": "...",
    "status": "started|running|completed|error",
    "title": "...", "markdown": "..."
  },
  "data": { },
  "delta": { },
  "extra": { }
}

Delta Markers

Streaming chunks (chat_delta) use a marker field to fan out to different UI channels:

Marker | Meaning | Typical usage
answer | Assistant response stream | Main answer text rendered in the chat bubble
thinking | Reasoning stream | Internal analysis, shown in a collapsible panel
canvas | Artifact stream | Documents, rendered HTML/JSON content. Uses extra.artifact_name for grouping.
timeline_text | Timeline stream | Short status entries for an activity log
subsystem | Structured JSON payloads | Widgets and tools. Routed by extra.sub_type (e.g. code_exec.status, web_search.filtered_results).

Each delta chunk looks like:

{
  "delta": {
    "text": "Here is the answer.",
    "index": 0,
    "marker": "answer",
    "completed": false
  },
  "extra": {
    "format": "markdown",
    "artifact_name": "...",
    "sub_type": "..."
  }
}
Closing a stream channel: When delta.completed is true, the server has finished sending chunks for that marker/artifact. Close the corresponding UI stream.
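The marker fan-out can be sketched as a routing function that maps a chat_delta payload to a UI channel key. The channel key format is an assumption for illustration; the marker names and extra fields come from the tables above:

```javascript
// Map a chat_delta payload to a UI channel based on delta.marker.
// Channel keys ("chat:answer" etc.) are illustrative, not part of the contract.
function deltaChannel(payload) {
  const { marker } = payload.delta;
  const extra = payload.extra || {};
  switch (marker) {
    case "answer":        return "chat:answer";
    case "thinking":      return "chat:thinking";
    case "canvas":        return `canvas:${extra.artifact_name || "default"}`;  // grouped per artifact
    case "timeline_text": return "timeline";
    case "subsystem":     return `subsystem:${extra.sub_type || "unknown"}`;    // routed by sub_type
    default:              return `unknown:${marker}`;
  }
}
```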

Usage and Token Counting

After a turn completes, the server emits an accounting.usage event (on the chat_step route) containing a cost breakdown:

{
  "type": "accounting.usage",
  "data": {
    "breakdown": [ ... ],
    "cost_total_usd": 0.0042
  },
  "event": {
    "step": "accounting",
    "markdown": "Token usage: 1,240 in / 380 out"
  }
}
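If the UI tracks spend across a session, the usage events can be folded into a running total. A sketch under the payload shape above (the reducer name is illustrative):

```javascript
// Fold accounting.usage events (arriving on the chat_step route) into a
// running cost total; ignore every other payload type.
function accumulateCost(totalUsd, payload) {
  if (payload.type !== "accounting.usage") return totalUsd;
  return totalUsd + (payload.data.cost_total_usd || 0);
}
```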

Socket.IO Events

Connection Setup

Connect to the platform namespace and pass authentication fields in the auth payload:

const socket = io(baseUrl, {
  auth: {
    bearer_token: accessToken,
    id_token: idToken,
    user_session_id: sessionId,   // optional: reuse existing session
    tenant: "my-tenant",          // optional override
    project: "my-project"         // optional override
  }
});

On successful connection, the server assigns a sid that acts as the peer stream identifier for targeted event delivery (equivalent to SSE's stream_id).

Event Names and Payloads

Socket.IO events use the same semantic envelope as SSE. The event names match the SSE transport routes:

Event | Direction | Payload
ready | Server → Client | Session info: session_id, user_type, stream_id
chat_start | Server → Client | Same envelope as SSE chat_start
chat_step | Server → Client | Same envelope as SSE chat_step
chat_delta | Server → Client | Same envelope as SSE chat_delta
chat_complete | Server → Client | Same envelope as SSE chat_complete
chat_error | Server → Client | Same envelope as SSE chat_error
chat_service | Server → Client | Same envelope as SSE chat_service
conv_status | Server → Client | Same envelope as SSE conv_status
server_shutdown | Server → Client | Drain signal; reconnect with backoff

Namespace and Room Patterns

Events are scoped to the authenticated session. The server manages rooms internally based on session_id. Clients do not join or leave rooms manually. Broadcast events go to all peers in the session room; peer-targeted events go only to the specific sid.

Peer targeting from REST: If a widget or integration makes a REST call to /api/integrations/* and includes the KDC-Stream-ID header with the Socket.IO sid, bundle-emitted events will target only that peer instead of broadcasting to the entire session.
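A sketch of the headers for such a peer-targeted call; the helper name is illustrative, and the specific /api/integrations/* endpoint is left to your integration:

```javascript
// Headers for a REST call whose bundle-emitted events should reach only
// one peer: pass the Socket.IO sid (or SSE stream_id) in KDC-Stream-ID.
function peerTargetedHeaders(sid, accessToken) {
  return {
    "Authorization": `Bearer ${accessToken}`,
    "KDC-Stream-ID": sid,               // target only this peer, not the session room
    "Content-Type": "application/json"
  };
}

// Illustrative usage from the browser:
// await fetch(`${baseUrl}/api/integrations/<endpoint>`, {
//   method: "POST",
//   headers: peerTargetedHeaders(socket.id, accessToken),
//   body: JSON.stringify(payload)
// });
```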

Authentication for Clients

The server resolves credentials in a fixed priority order. The first source that provides a token wins.

1. Explicit Headers (highest priority)

Set on REST, SSE POST, and integration requests:

Header | Purpose
Authorization: Bearer <token> | Access token
X-ID-Token | ID token
User-Session-ID | Reuse an existing session

2. SSE / Socket.IO Auth Payload

When headers are unavailable (e.g. EventSource does not support custom headers), pass tokens as query params on the SSE stream URL or in the Socket.IO auth object:

Param | Purpose
bearer_token | Access token
id_token | ID token

3. Cookies (lowest priority)

Fallback for cookie-based / proxy-login deployments. The browser sends these automatically:

Cookie | Purpose
__Secure-LATC | Access token cookie
__Secure-LITC | ID token cookie

Useful Request Headers

Header | Purpose
KDC-Stream-ID | Peer identifier for targeted event delivery from REST/integration calls
X-User-Timezone | User timezone (e.g. America/New_York) for server-formatted messages
X-User-UTC-Offset | UTC offset in minutes

Response Headers to Observe

Header | Action
X-Session-ID | Store and reuse to maintain session continuity
X-User-Type | Resolved user type for the request
Retry-After | Honor on 429 and 503 responses before retrying

Token Refresh Pattern

When you receive a 401 or 403, refresh your access token through your identity provider and retry the request. For SSE streams, close the current EventSource, obtain fresh tokens, and reconnect with the new credentials. Keep stream_id stable across reconnects so the server can associate the new connection with the same peer.
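A sketch of that reconnect sequence. `refreshTokens` is an assumed helper against your identity provider, and the `ES` parameter is injectable for testing (it defaults to the browser's EventSource):

```javascript
// Rotate credentials after a 401/403: close the stale stream, refresh,
// and reopen with the SAME stream_id so the server sees the same peer.
async function reconnectSse(current, streamId, baseUrl, refreshTokens, ES = EventSource) {
  current.close();                                   // drop the stale EventSource
  const { accessToken, idToken } = await refreshTokens();
  const url = new URL("/sse/stream", baseUrl);
  url.searchParams.set("stream_id", streamId);       // stable across reconnects
  url.searchParams.set("bearer_token", accessToken); // query-param auth fallback
  url.searchParams.set("id_token", idToken);
  return new ES(url.toString());
}
```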

Error Handling & Reconnection

SSE Reconnect Strategy

Use exponential backoff with jitter. The server does not guarantee sticky connections — any replica may serve your reconnect.

delay = min(30s, 2^attempt seconds + jitter(0..1s))

Signal | Meaning | Action
server_shutdown event | Instance is draining | Close stream immediately. Reconnect after 1–2s + jitter.
Connection drop (no event) | Network issue or scaled-down replica | Reconnect with exponential backoff (start 1–2s, cap 30s).
HTTP 503 with {"status":"draining"} | Instance is draining (on REST calls) | Retry after 1–3s + jitter.
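The backoff formula can be sketched directly; the random source is injectable so the schedule can be tested deterministically:

```javascript
// delay = min(30s, 2^attempt + jitter(0..1s)), returned in milliseconds.
function reconnectDelayMs(attempt, rand = Math.random) {
  const seconds = Math.min(30, 2 ** attempt + rand()); // jitter in [0, 1) s
  return Math.round(seconds * 1000);
}
```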

Rate Limit Responses

Rate limits arrive as chat_service events and/or HTTP status codes:

HTTP Status | Meaning | Action
429 | Rate limit exceeded | Back off 2–5s + jitter. Honor Retry-After header. Max 5 retries.
503 | Backpressure or draining | Back off 1–3s + jitter. Do not retry immediately.
401 / 403 | Auth missing or invalid | Refresh tokens or redirect to login.

In-stream rate-limit events (rate_limit.denied, rate_limit.warning) include a data.rate_limit object with retry_after_sec, reset_text, and a ready-to-display user_message. Prefer showing user_message directly.
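A sketch of turning such a payload into something the UI can act on; the return shape is illustrative, while the field names come from the contract above:

```javascript
// Extract the display message and retry schedule from an in-stream
// rate_limit.* payload carried on chat_service.
function handleRateLimit(payload, nowMs = Date.now()) {
  const rl = payload.data.rate_limit;
  return {
    message: rl.user_message,                   // ready-to-display; prefer verbatim
    retryAtMs: nowMs + rl.retry_after_sec * 1000,
    resetText: rl.reset_text
  };
}
```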

Backpressure Signals

Gateway-level rejections arrive on chat_service with types such as gateway.backpressure, gateway.rate_limit, and gateway.circuit_breaker. These indicate the ingress is protecting the backend. Back off and retry.

Turn Interruption

If the processing worker dies after a turn has started, you may have already rendered partial chat_delta content. The server signals interruption with:

  • conv_status with data.completion = "interrupted"
  • chat_error with data.error_type = "turn_interrupted"

Keep partial output visible, mark the turn as failed, and offer the user a manual retry. Do not auto-resubmit.

Multi-Tab Coordination

Leader Election

Use localStorage or BroadcastChannel to elect a single leader tab. Only the leader maintains the SSE connection. Follower tabs read events from shared storage or request on demand.
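One way to sketch the localStorage variant is a leader lease: the tab holding an unexpired lease owns the SSE connection. The key name and lease length are illustrative, and localStorage writes are not atomic, so treat this as a sketch rather than a race-free election:

```javascript
// `store` is localStorage in the browser (injectable for testing).
// Returns true if this tab is (or becomes) the leader.
function tryAcquireLeadership(store, tabId, nowMs = Date.now(), leaseMs = 5000) {
  const raw = store.getItem("sse-leader");
  const lease = raw ? JSON.parse(raw) : null;
  if (lease && lease.tabId !== tabId && nowMs < lease.expires) {
    return false;                                // another tab holds a live lease
  }
  store.setItem("sse-leader", JSON.stringify({ tabId, expires: nowMs + leaseMs }));
  return true;                                   // acquired or renewed the lease
}
```

The leader renews its lease periodically; if it closes or crashes, the lease expires and a follower takes over and opens its own stream.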

Burst Control

Coalesce requests on page load (aim for fewer than 10–15 requests in the first 10 seconds). Serialize chat sends — never fire concurrent POST /sse/chat calls. If polling is unavoidable, use intervals of 5–10s minimum.
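Serializing sends can be sketched as a small promise queue; `send` stands in for your POST /sse/chat wrapper:

```javascript
// Each send waits for the previous one to settle, so concurrent
// POST /sse/chat calls can never overlap.
function makeChatQueue(send) {
  let tail = Promise.resolve();
  return (message) => {
    const result = tail.then(() => send(message));
    tail = result.catch(() => {});  // one failed send must not wedge the queue
    return result;
  };
}
```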

Draining / Maintenance Mode

When the platform enters a drain cycle, active SSE streams receive a server_shutdown event with reason: "draining". REST endpoints return 503. This is expected, not fatal. Close connections gracefully and reconnect after a short delay. The load balancer will route you to a healthy replica.

No sticky sessions required. Requests can land on any replica. Keep session_id and auth tokens consistent, and the server will associate your requests correctly regardless of which instance handles them.