Platform
Ingress (chat-ingress)
Handles all inbound traffic. Authenticates users, enforces rate limits, validates the bundle, enqueues the task, and opens an SSE stream back to the client. Your bundle rarely needs to know about this — the platform wires it up automatically.
Processor (chat-proc)
Dequeues tasks, loads your bundle singleton, and calls `execute_core()`. Also hosts the Operations API — the REST endpoint that UI widgets call directly (no SSE needed for widget interactions).
SSE Streaming Flow
Modules
Platform Architecture — Detail
Services
| Service | Port | Role | Required? |
|---|---|---|---|
| web-proxy | :443 / :80 | TLS termination, token unmasking, routing | Required |
| chat-ingress | :8010 | Auth, SSE/Socket.IO gateway, task enqueueing | Required |
| chat-proc | :8020 | Bundle execution, integrations REST API | Required |
| web-ui | :80 | SPA frontend | Required |
| kdcube-secrets | internal | In-memory secrets service (secrets never written to disk) | Required (local) |
| metrics | internal | Autoscaling metric export (CloudWatch) — not needed for single-node | Optional |
| proxylogin | internal | Delegated auth token exchange | Optional |
| clamav | internal | Antivirus scanning for file attachments | Optional |
| exec (on-demand) | — | Ephemeral Docker/Fargate container for isolated code execution | Optional |
Routing
| Path Pattern | Routes To |
|---|---|
| `/sse/*`, `/api/chat/*`, `/admin/*` | chat-ingress |
| `/api/integrations/*` | chat-proc |
| `/auth/*` | proxylogin (delegated auth only) |
| `/*` | web-ui |
Processor Architecture
The chat-proc service is the execution side of the platform. After ingress admits and enqueues a request, the processor claims it, loads the target bundle, executes the workflow, and streams results back through the relay communicator.
Task Queue Model
The processor uses Redis Lists as its task queue. Ingress pushes task payloads with LPUSH; processor workers claim them with BRPOPLPUSH, atomically moving the item from a ready queue to an inflight queue. This gives FIFO ordering within each lane.
Tasks are partitioned into user-type lanes — privileged, paid, registered, and anonymous. Workers rotate fairly across lanes so no single tier starves the others. Each claimed task is protected by a per-task Redis lock (SET NX EX) to prevent duplicate processing.
| Redis Key Pattern | Purpose |
|---|---|
| `{tenant}:{project}:...:queue:{user_type}` | Ready queue (one per user-type lane) |
| `{tenant}:{project}:...:queue:inflight:{user_type}` | Inflight queue (claimed but not yet complete) |
| `{LOCK_PREFIX}:{task_id}` | Per-task dedup lock |
| `{LOCK_PREFIX}:started:{task_id}` | Started marker — prevents auto-replay once execution begins |
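The claim-and-lock pattern above can be sketched in a few lines. This is a minimal illustration assuming redis-py-style client semantics; the key prefixes (`QUEUE`, `LOCK_PREFIX`), the payload format, and the lock TTL are placeholders, not the platform's actual schema, and the non-blocking `RPOPLPUSH` stands in for the blocking `BRPOPLPUSH` the workers use.

```python
LANES = ["privileged", "paid", "registered", "anonymous"]
QUEUE = "t:p:chat:queue"          # hypothetical key prefix
LOCK_PREFIX = "t:p:chat:lock"     # hypothetical lock prefix
LOCK_TTL = 60                     # illustrative claim-lock TTL, seconds

def claim_one(r):
    """Rotate across lanes; atomically move one task ready -> inflight.

    `r` is any client exposing redis-py's rpoplpush/set/lrem signatures.
    """
    for lane in LANES:            # fair rotation: no lane starves the others
        task_id = r.rpoplpush(f"{QUEUE}:{lane}", f"{QUEUE}:inflight:{lane}")
        if task_id is None:
            continue              # this lane is empty; try the next tier
        # Per-task dedup lock (SET NX EX): only one worker may execute it.
        if r.set(f"{LOCK_PREFIX}:{task_id}", "1", nx=True, ex=LOCK_TTL):
            return lane, task_id
        # Another worker already holds the lock; release our inflight claim.
        r.lrem(f"{QUEUE}:inflight:{lane}", 1, task_id)
    return None
```

Because the move from ready to inflight is a single Redis command, a worker crash between claim and completion leaves the task visible in the inflight queue for the reaper rather than losing it.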
Bundle Loader & Lifecycle
Processor workers load bundles through a registry + singleton cache model:
- On startup (and on bundle-update broadcasts), the worker rebuilds its in-memory bundle registry.
- At request time, the registry resolves the bundle to a concrete path. Module/singleton cache keys are based on the resolved path.
- Built-in example bundles are merged into the registry and copied to shared storage with a versioned path (`/bundles/{bundle_id}__{ref}__{sha}`).
- On update, loader caches are cleared. New requests use the new path; already-running turns continue on the previously loaded path.
This means bundle updates are zero-downtime — running work is never affected, and new work picks up the latest version automatically.
Execution Pipeline
Once a task is claimed, execution follows a fixed sequence:
- Receive — claim the task from the ready queue via `BRPOPLPUSH` and acquire the per-task lock.
- Validate — materialize `ChatTaskPayload`, build `ServiceCtx` and `ConversationCtx`.
- Load bundle — resolve the target bundle through the registry; load or reuse the singleton.
- Execute — run the bundle handler under task timeout, accounting binding, lock renewal, and started-marker renewal. On ECS, scale-in protection is enabled for the duration.
- Stream results — the bundle emits events through the `ChatCommunicator`; the relay forwards them via Redis pub/sub to ingress, which delivers them to the client over SSE.
On success the inflight claim is acked, conversation state moves to idle, and `conv_status` is emitted. On failure the conversation is set to error and the client receives a `chat.error` event.
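The success/failure bookkeeping around a claimed task can be sketched as a wrapper. This is a hedged illustration, not the platform's code: the key prefixes, the `emit` helper, the started-marker TTL, and the payload shapes are all assumptions, and timeout handling, lock renewal, and scale-in protection are elided.

```python
QUEUE = "t:p:chat:queue"          # hypothetical key prefix
LOCK_PREFIX = "t:p:chat:lock"     # hypothetical lock prefix
STARTED_TTL = 300                 # deliberately longer than the claim lock

def run_task(r, lane, task_id, handler, emit):
    # Started marker: once set, a crashed turn is never auto-replayed.
    r.set(f"{LOCK_PREFIX}:started:{task_id}", "1", ex=STARTED_TTL)
    try:
        handler(task_id)          # bundle execution (timeout/renewal elided)
        # Ack: remove the inflight claim, report the conversation as idle.
        r.lrem(f"{QUEUE}:inflight:{lane}", 1, task_id)
        emit("conv_status", {"state": "idle"})
    except Exception as exc:
        # Failure path: the conversation is marked error and the client
        # receives a chat.error event through the communicator.
        emit("chat.error", {"task_id": task_id, "message": str(exc)})
```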
State Management & Recovery
The processor distinguishes two recovery cases based on the started marker:
- Pre-start claims (lock expired, no started marker) — safe to requeue. The inflight reaper moves the item back to the ready lane.
- Started tasks (lock expired, started marker present) — not replayed. The conversation is set to `error` and the client is notified with `turn_interrupted`. This prevents duplicate side effects from partial execution.
The started marker intentionally outlives the claim lock so that a hard worker restart cannot accidentally let both leases expire and trigger an unsafe replay.
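The reaper's decision rule reduces to checking the two keys from the table above. A sketch under assumed key names (the real reaper also handles conversation-state updates and client notification, summarized here as return values):

```python
LOCK_PREFIX = "t:p:chat:lock"     # hypothetical lock prefix

def reap(r, ready_key, inflight_key, task_id):
    """Decide what to do with an inflight item whose worker may have died."""
    if r.exists(f"{LOCK_PREFIX}:{task_id}"):
        return "running"          # claim lock still live: worker is healthy
    if r.exists(f"{LOCK_PREFIX}:started:{task_id}"):
        # Execution began: never replay (partial side effects). The platform
        # sets the conversation to error and emits turn_interrupted instead.
        r.lrem(inflight_key, 1, task_id)
        return "interrupted"
    # Pre-start claim: no side effects yet, safe to requeue in the ready lane.
    r.lrem(inflight_key, 1, task_id)
    r.lpush(ready_key, task_id)
    return "requeued"
```

The `STARTED_TTL > LOCK_TTL` ordering matters here: if both keys could expire together, a started task would look like a pre-start claim and be replayed unsafely.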
Continuation Mailbox
When a user sends a message to a conversation that is already executing, ingress stores it in a per-conversation ordered mailbox in Redis rather than the main ready queue. The active workflow can inspect this mailbox via the `ConversationContinuationSource` API (`peek_next_continuation()`, `take_next_continuation()`). If the bundle does not consume the item, the processor promotes exactly one pending item back into the ready queue after the current turn completes.
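The mailbox semantics map naturally onto a Redis list per conversation. The following is an assumed shape, not the actual `ConversationContinuationSource` implementation: the key prefix, function names other than the two documented ones, and the "promote one" helper are all illustrative.

```python
MAILBOX = "t:p:chat:mailbox"      # hypothetical per-conversation key prefix

def append_continuation(r, conv_id, message):
    """Ingress side: queue a message for an already-executing conversation."""
    r.rpush(f"{MAILBOX}:{conv_id}", message)

def peek_next_continuation(r, conv_id):
    """Inspect the oldest pending message without consuming it."""
    return r.lindex(f"{MAILBOX}:{conv_id}", 0)

def take_next_continuation(r, conv_id):
    """Consume the oldest pending message inside the running turn."""
    return r.lpop(f"{MAILBOX}:{conv_id}")

def promote_one(r, conv_id, ready_key):
    """After the turn: move at most one unconsumed item to the ready queue."""
    item = r.lpop(f"{MAILBOX}:{conv_id}")
    if item is not None:
        r.lpush(ready_key, item)
    return item
```

Promoting exactly one item per completed turn keeps ordering intact while still giving the next turn a chance to drain the rest of the mailbox itself.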
Operations API Surface
The processor exposes REST endpoints that do not require SSE:
- `/integrations/bundles/{tenant}/{project}/{bundle_id}/operations/{op}` — bundle-defined operations called directly by UI widgets.
- Admin endpoints for bundle registration, property management, secrets injection, and cleanup.
These endpoints run inside the processor because they need access to the loaded bundle singleton and its runtime context.
Communicator Integration
The processor uses the relay communicator pattern: bundle events are published to Redis pub/sub channels, ingress subscribes and fans them out over SSE to connected clients. This decouples execution from delivery — the processor never holds SSE connections directly, and horizontal scaling of proc replicas does not affect client connectivity.
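Both halves of the relay can be sketched in a few lines. Channel naming, the JSON envelope, and the helper names below are assumptions for illustration; only the pattern itself (publish on the proc side, subscribe and re-frame as SSE on the ingress side) comes from the description above.

```python
import json

CHANNEL = "t:p:chat:events"       # hypothetical channel prefix

def relay_emit(r, conv_id, event_type, payload):
    """Processor side: fire-and-forget publish; no SSE socket is held here."""
    msg = json.dumps({"type": event_type, "data": payload})
    return r.publish(f"{CHANNEL}:{conv_id}", msg)

def to_sse_frame(raw):
    """Ingress side: convert a relayed pub/sub message into an SSE frame."""
    evt = json.loads(raw)
    return f"event: {evt['type']}\ndata: {json.dumps(evt['data'])}\n\n"
```

Because delivery is a subscription rather than a direct connection, any proc replica can publish for any conversation, and ingress replicas can be scaled independently of where execution happens.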
Process Topology
Each Uvicorn worker in chat-proc is an independent queue consumer with its own Redis pool, Postgres pool, bundle registry, inflight reaper, and cleanup loop. All workers across all replicas compete for the same Redis task queues. There is no sticky worker-to-conversation affinity — any worker can claim any task, and conversation ownership is per-turn, not per-instance.