Platform

Ingress (chat-ingress)

Handles all inbound traffic. Authenticates users, enforces rate limits, validates the bundle, enqueues the task, and opens an SSE stream back to the client. Your bundle rarely needs to know about this — the platform wires it up automatically.

Processor (chat-proc)

Dequeues tasks, loads your bundle singleton, and calls execute_core(). Also hosts the Operations API — the REST endpoint that UI widgets call directly (no SSE needed for widget interactions).

SSE Streaming Flow

SSE streaming flow diagram: real-time event delivery from bundle to client via Server-Sent Events. The client opens an SSE stream and sends a message; ingress authenticates, rate-limits, and enqueues the task into the Redis queue; proc dequeues the task and runs the bundle; the bundle emits events through the communicator, and the Redis pub/sub relay forwards them down the SSE stream to the client in real time.

Modules

Platform Architecture — Detail

Platform architecture detail diagram: detailed view of KDCube service modules. The client browser connects over HTTPS (:443) to web-proxy (OpenResty: TLS termination, token unmasking, routing), which forwards to chat-ingress (:8010) for auth, JWT, rate limiting, and the /api/chat/* SSE gateway, plus the /api/conversations/*, /api/resources/*, opex/*, /api/economics/*, and ctrl/* routes. Ingress enqueues into per-user_type Redis queues consumed by chat-proc (:8020), which hosts the bundle loader, LangGraph, /integrations/.../ops/{op}, and admin endpoints (bundle registration, properties, secrets, cleanup). Proc invokes your bundle (@agentic_workflow, run(), on_bundle_load(), tools/skills, ReAct agent loop) and relays communicator events over pub/sub back to SSE. Shared persistence, managed by the platform: PostgreSQL (RDS) for conversations, Redis (ElastiCache) for cache, pub/sub, and queues, and EFS/S3 for bundle storage.

Services

| Service | Port | Role | Required? |
| --- | --- | --- | --- |
| web-proxy | :443 / :80 | TLS termination, token unmasking, routing | Required |
| chat-ingress | :8010 | Auth, SSE/Socket.IO gateway, task enqueueing | Required |
| chat-proc | :8020 | Bundle execution, integrations REST API | Required |
| web-ui | :80 | SPA frontend | Required |
| kdcube-secrets | internal | In-memory secrets service (secrets never written to disk) | Required (local) |
| metrics | internal | Autoscaling metric export (CloudWatch); not needed for single-node | Optional |
| proxylogin | internal | Delegated auth token exchange | Optional |
| clamav | internal | Antivirus scanning for file attachments | Optional |
| exec | on-demand | Ephemeral Docker/Fargate container for isolated code execution | Optional |

Routing

| Path Pattern | Routes To |
| --- | --- |
| `/sse/*`, `/api/chat/*`, `/admin/*` | chat-ingress |
| `/api/integrations/*` | chat-proc |
| `/auth/*` | proxylogin (delegated auth only) |
| `/*` | web-ui |

Processor Architecture

The chat-proc service is the execution side of the platform. After ingress admits and enqueues a request, the processor claims it, loads the target bundle, executes the workflow, and streams results back through the relay communicator.

Task Queue Model

The processor uses Redis Lists as its task queue. Ingress pushes task payloads with LPUSH; processor workers claim them with BRPOPLPUSH, atomically moving the item from a ready queue to an inflight queue. This gives FIFO ordering within each lane.

Tasks are partitioned into user-type lanes: privileged, paid, registered, and anonymous. Workers rotate fairly across lanes so no single tier starves the others. Each claimed task is protected by a per-task Redis lock (SET NX EX) to prevent duplicate processing.

| Redis Key Pattern | Purpose |
| --- | --- |
| `{tenant}:{project}:...:queue:{user_type}` | Ready queue (one per user-type lane) |
| `{tenant}:{project}:...:queue:inflight:{user_type}` | Inflight queue (claimed but not yet complete) |
| `{LOCK_PREFIX}:{task_id}` | Per-task dedup lock |
| `{LOCK_PREFIX}:started:{task_id}` | Started marker; prevents auto-replay once execution begins |
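The claim path described above (LPUSH/BRPOPLPUSH plus the SET NX EX lock and fair lane rotation) can be sketched with a toy in-memory model. This is illustrative only, not the real Redis client code; apart from the lane names, every identifier here is invented:

```python
from collections import deque

LANES = ["privileged", "paid", "registered", "anonymous"]

class LaneQueues:
    """Toy in-memory stand-in for the Redis lane queues (illustrative only)."""
    def __init__(self):
        self.ready = {lane: deque() for lane in LANES}     # LPUSH target
        self.inflight = {lane: deque() for lane in LANES}  # BRPOPLPUSH destination
        self.locks = set()                                 # stand-in for SET NX EX
        self._cursor = 0                                   # fair-rotation cursor

    def enqueue(self, lane, task_id):
        self.ready[lane].appendleft(task_id)               # LPUSH pushes at the head

    def claim(self):
        """Rotate across lanes; move one task ready -> inflight and lock it."""
        n = len(LANES)
        for i in range(n):
            lane = LANES[(self._cursor + i) % n]
            if not self.ready[lane]:
                continue
            self._cursor = (self._cursor + i + 1) % n      # next claim starts past this lane
            task_id = self.ready[lane].pop()               # BRPOPLPUSH pops the tail: FIFO
            self.inflight[lane].appendleft(task_id)
            if task_id in self.locks:                      # SET NX failed: already claimed
                return None
            self.locks.add(task_id)
            return lane, task_id
        return None
```

With tasks t1 and t2 queued in the paid lane and t3 in anonymous, successive claims alternate lanes (t1, t3, t2) rather than draining paid first, while each lane individually stays FIFO.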

Bundle Loader & Lifecycle

Processor workers load bundles through a registry + singleton cache model:

  1. On startup (and on bundle-update broadcasts), the worker rebuilds its in-memory bundle registry.
  2. At request time, the registry resolves the bundle to a concrete path. Module/singleton cache keys are based on the resolved path.
  3. Built-in example bundles are merged into the registry and copied to shared storage with a versioned path (/bundles/{bundle_id}__{ref}__{sha}).
  4. On update, loader caches are cleared. New requests use the new path; already-running turns continue on the previously loaded path.

This means bundle updates are zero-downtime — running work is never affected, and new work picks up the latest version automatically.
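The registry + singleton cache behavior can be sketched as follows, assuming the versioned-path scheme above; the class and its methods are hypothetical, for illustration only:

```python
class BundleLoader:
    """Toy model: singleton cache keyed by the resolved (versioned) bundle path."""
    def __init__(self, registry):
        self.registry = registry        # bundle_id -> versioned path
        self._singletons = {}           # resolved path -> loaded bundle object

    def get(self, bundle_id):
        path = self.registry[bundle_id]                # resolve at request time
        if path not in self._singletons:               # load once per resolved path
            self._singletons[path] = f"bundle@{path}"  # stand-in for module import
        return self._singletons[path]

    def update(self, bundle_id, new_path):
        """Zero-downtime update: clear caches; running turns keep their old object."""
        self.registry[bundle_id] = new_path
        self._singletons.clear()
```

A turn that loaded the old version holds a direct reference to its bundle object, so clearing the cache never interrupts it; the next request resolves the new path and loads fresh.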

Execution Pipeline

Once a task is claimed, execution follows a fixed sequence:

  1. Receive — claim the task from the ready queue via BRPOPLPUSH and acquire the per-task lock.
  2. Validate — materialize ChatTaskPayload, build ServiceCtx and ConversationCtx.
  3. Load bundle — resolve the target bundle through the registry, load/reuse the singleton.
  4. Execute — run the bundle handler under task timeout, accounting binding, lock renewal, and started-marker renewal. On ECS, scale-in protection is enabled for the duration.
  5. Stream results — the bundle emits events through the ChatCommunicator; the relay forwards them via Redis pub/sub to ingress, which delivers them to the client over SSE.

On success the inflight claim is acked, conversation state moves to idle, and conv_status is emitted. On failure the conversation is set to error and the client receives a chat.error event.
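The ordering of steps 2 through 5, including the success and failure outcomes, can be condensed into one driver function. The components here are injected fakes and the function is a sketch, not the platform's real code:

```python
def process(task, *, validate, load_bundle, execute, publish):
    """Run one claimed task through validate -> load -> execute -> stream (sketch)."""
    ctx = validate(task)                               # 2. build contexts from the payload
    bundle = load_bundle(ctx["bundle_id"])             # 3. registry resolve + singleton reuse
    try:
        for event in execute(bundle, ctx):             # 4. run the bundle handler
            publish(event)                             # 5. relay -> Redis pub/sub -> SSE
        publish({"type": "conv_status", "state": "idle"})   # success: conversation idle
    except Exception as exc:
        publish({"type": "chat.error", "error": str(exc)})  # failure: client notified
```

The real pipeline additionally wraps step 4 in the task timeout, lock renewal, and started-marker renewal described above.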

State Management & Recovery

The processor distinguishes two recovery cases based on the started marker:

  • Pre-start claims (lock expired, no started marker) — safe to requeue. The inflight reaper moves the item back to the ready lane.
  • Started tasks (lock expired, started marker present) — not replayed. The conversation is set to error and the client is notified with turn_interrupted. This prevents duplicate side effects from partial execution.

The started marker intentionally outlives the claim lock so that a hard worker restart cannot accidentally let both leases expire and trigger an unsafe replay.
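The reaper's decision reduces to a small function; this is a sketch of the logic described above, with invented names:

```python
def recovery_action(lock_expired: bool, started: bool) -> str:
    """What the inflight reaper does with a claim, per the two cases above."""
    if not lock_expired:
        return "leave"       # worker still holds its lease; nothing to do
    if started:
        return "fail_turn"   # started marker present: never replay partial work
    return "requeue"         # pre-start claim: safe to move back to the ready lane
```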

Continuation Mailbox

When a user sends a message to a conversation that is already executing, ingress stores it in a per-conversation ordered mailbox in Redis rather than the main ready queue. The active workflow can inspect this mailbox via the ConversationContinuationSource API (peek_next_continuation(), take_next_continuation()). If the bundle does not consume the item, the processor promotes exactly one pending item back into the ready queue after the current turn completes.
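A toy model of the mailbox semantics: the two method names come from the text above, but their signatures and the surrounding class are illustrative guesses, not the real ConversationContinuationSource:

```python
from collections import deque

class ContinuationMailbox:
    """Per-conversation ordered mailbox (in-memory stand-in for the Redis structure)."""
    def __init__(self):
        self._items = deque()

    def store(self, message):                  # ingress side: message arrives mid-turn
        self._items.append(message)

    def peek_next_continuation(self):          # bundle inspects without consuming
        return self._items[0] if self._items else None

    def take_next_continuation(self):          # bundle consumes the item itself
        return self._items.popleft() if self._items else None

    def promote_one(self, ready_queue):
        """After the turn completes, promote exactly one unconsumed item."""
        if self._items:
            ready_queue.append(self._items.popleft())
```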

Operations API Surface

The processor exposes REST endpoints that do not require SSE:

  • /integrations/bundles/{tenant}/{project}/{bundle_id}/operations/{op} — bundle-defined operations called directly by UI widgets.
  • Admin endpoints for bundle registration, property management, secrets injection, and cleanup.

These endpoints run inside the processor because they need access to the loaded bundle singleton and its runtime context.

Communicator Integration

The processor uses the relay communicator pattern: bundle events are published to Redis pub/sub channels, ingress subscribes and fans them out over SSE to connected clients. This decouples execution from delivery — the processor never holds SSE connections directly, and horizontal scaling of proc replicas does not affect client connectivity.
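The decoupling can be illustrated with an in-process stand-in for the pub/sub channel. Real deployments use Redis channels; this sketch only shows the shape of the relay:

```python
class Relay:
    """Minimal pub/sub broker: proc publishes, ingress subscribes and fans out."""
    def __init__(self):
        self._subscribers = {}          # channel -> callbacks (e.g. SSE writers)

    def subscribe(self, channel, callback):
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, event):
        for cb in self._subscribers.get(channel, []):
            cb(event)

relay = Relay()
delivered = []                                       # stands in for an open SSE connection
relay.subscribe("conv:123", delivered.append)        # ingress side
relay.publish("conv:123", {"type": "chat.delta"})    # bundle/proc side
```

The publisher never sees the subscriber list, which is why proc replicas can scale horizontally without touching client connections.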

Process Topology

Each Uvicorn worker in chat-proc is an independent queue consumer with its own Redis pool, Postgres pool, bundle registry, inflight reaper, and cleanup loop. All workers across all replicas compete for the same Redis task queues. There is no sticky worker-to-conversation affinity — any worker can claim any task, and conversation ownership is per-turn, not per-instance.