KDCube Platform Documentation
Self-hosted, multi-tenant AI platform and SDK. Architecture, configuration, deployment, and SDK — from system to bundle.
Python 3.11+ · SSE Streaming · ReAct Agent · Multi-Tenant · Docker / ECS · OpenAI · Anthropic · Gemini
Core Concepts
Multi-tenant platform architecture, Ingress and Processor services, and system overview.
Quick Start
Install the CLI, configure your first bundle, and run the platform locally.
Platform Architecture
Services, modules, request lifecycle, and multi-tenant execution model.
Communication System
SSE streaming, Socket.IO, REST API, and channeled output protocol.
Client Integration
Browser transport contract, SSE events, Socket.IO, authentication, and reconnect behavior.
Configuration
Environment variables, bundle config, secrets injection, and runtime settings.
Deployment & Monitoring
Docker Compose, ECS Fargate, Kubernetes, and observability setup.
Platform Economics
Per-tenant cost accounting, budget caps, rate limiting, and usage tracking.
Security & Governance
Prompt injection mitigation, tool sandboxing, tenant isolation, and audit trails.
Application SDK
Bundle anatomy, tools, skills, widgets, storage, workflow, and deployment.
ReAct Agent
Timeline-first decision loop, execution runtime, knowledge space, and context management.
ReAct v2 — Full Reference
Interactive deep-dive: multi-channel streaming, custom tools, source management, and code examples.
Claude Code Agent
Administrator-facing local agent runtime with per-turn usage accounting, bundle-controlled workspace roots, optional git-backed session continuity, and direct integration into bundle UIs.
Claude Code Agent Support
KDCube now supports Claude Code as a bundle-usable agent runtime. This is intended primarily for administrator-controlled scenarios inside bundle applications: knowledge-base maintenance, requirements management, internal repo and document work, migration assistance, and similar operations where a trusted operator explicitly drives the task.
What is included
- Native Claude Code runner inside the SDK
- Per-turn token and spend accounting
- Optional git-backed Claude session continuity
- Bundle-level model selection and UI integration
Important safety note
Claude Code is a local agent runtime. KDCube cannot guarantee that a hostile prompt or attacker will not use that runtime to exfiltrate files from the selected local workspace root or the surrounding environment. Exposing Claude Code to non-admin end users in a bundle is therefore at your own risk.
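One way to act on this note is to refuse the request before any agent is created. The sketch below is illustrative only: `User`, `ClaudeCodeAccessDenied`, `require_admin`, and the `"admin"` role name are hypothetical and are not part of the KDCube SDK, so a real bundle would substitute its own user model and role check.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """Hypothetical user record; the real KDCube user model may differ."""
    user_id: str
    roles: frozenset[str] = field(default_factory=frozenset)


class ClaudeCodeAccessDenied(PermissionError):
    """Raised when a non-admin user requests a Claude Code turn."""


def require_admin(user: User, admin_role: str = "admin") -> None:
    # Fail before an agent or workspace root is set up, so a rejected
    # request never reaches the local runtime.
    if admin_role not in user.roles:
        raise ClaudeCodeAccessDenied(
            f"user {user.user_id} lacks role {admin_role!r}"
        )
```

A bundle handler would call `require_admin(...)` as its first step and let the exception surface as a normal permission error.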
Documentation and reference bundle
Accountability and spend visibility
Claude Code usage is accounted as normal LLM spend in KDCube. Each turn can expose the model, token usage, cache activity, request count, and total cost directly in the bundle UI.
- service_type = llm, provider = anthropic, runtime = claude_code
- usage buckets include input, output, cache read, cache write, and request count
- bundles can persist and render model, usage, and cost_usd per assistant turn
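The accounting fields listed above can be pictured as a small per-turn record. This is a minimal sketch, not the SDK's actual schema: the class names `TurnUsage` and `TurnRecord`, the bucket field names, and `to_ui_payload` are assumptions chosen to mirror the list, and only the three tag values (llm, anthropic, claude_code) come from the text.

```python
from dataclasses import asdict, dataclass


@dataclass
class TurnUsage:
    # One counter per usage bucket named in the list above (names assumed).
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0
    requests: int = 0

    def add(self, other: "TurnUsage") -> "TurnUsage":
        """Return a new record with every bucket summed."""
        return TurnUsage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
            self.cache_read_tokens + other.cache_read_tokens,
            self.cache_write_tokens + other.cache_write_tokens,
            self.requests + other.requests,
        )


@dataclass
class TurnRecord:
    model: str
    usage: TurnUsage
    cost_usd: float
    # Accounting tags as stated in the documentation.
    service_type: str = "llm"
    provider: str = "anthropic"
    runtime: str = "claude_code"


def to_ui_payload(record: TurnRecord) -> dict:
    # asdict() also converts the nested TurnUsage into a plain dict,
    # which keeps the payload JSON-serializable.
    return asdict(record)
```

Keeping the usage buckets separate from the cost lets a UI show token counts even when pricing for a model is not known.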
How bundles wire it in
The bundle chooses the local Claude root, builds a session-store config, creates the agent, and runs the turn through the runtime wrapper. The wrapper bootstraps git-backed continuity when enabled and returns the final text plus structured usage metadata.
Bundle integration pattern
# Choose the local Claude workspace root for this bundle.
workspace_root = kb_admin_workspace_root(storage)

# Describe where session state lives and which conversation it belongs to.
session_store = ClaudeCodeSessionStoreConfig(
    implementation=settings.CLAUDE_CODE_SESSION_STORE_IMPLEMENTATION,
    git_repo=settings.CLAUDE_CODE_SESSION_GIT_REPO,
    local_root=workspace_root / ".claude",
    tenant=payload.tenant,
    project=payload.project,
    user_id=str(payload.user.user_id),
    conversation_id=conversation_id,
    agent_name="kb-admin",
)

# Create the agent with the selected model and an explicit tool allowlist.
agent = ClaudeCodeAgent.from_current_context(
    agent_name="kb-admin",
    workspace_path=workspace_root,
    model=selected_model,
    allowed_tools=["Read", "Edit", "Grep", "Bash"],
    permission_mode="acceptEdits",
)

# Run one turn through the runtime wrapper.
result = await run_claude_code_turn(
    agent=agent,
    prompt=prompt,
    kind=turn_kind,
    session_store=session_store,
    refresh_support_files=refresh_workspace_support_files,
)
Streaming reaches the bundle UI through the normal communicator path: the bundle can forward deltas during execution and then store final_text, model, usage, and cost_usd for durable per-turn display.
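The forward-then-store flow described above might look roughly like the following. This is a sketch under stated assumptions: `forward_and_collect`, `emit`, and the fake delta source are hypothetical stand-ins for the communicator and the runtime, and the record values are placeholders. In a real bundle the final text and usage metadata would come from the wrapper's result rather than from joining deltas.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def forward_and_collect(
    deltas: AsyncIterator[str],
    emit: Callable[[str], Awaitable[None]],
) -> str:
    """Forward each delta to the UI and return the concatenated text."""
    parts: list[str] = []
    async for delta in deltas:
        await emit(delta)    # live update during execution
        parts.append(delta)  # retained for the durable per-turn record
    return "".join(parts)


async def demo() -> tuple[list[str], dict]:
    async def fake_deltas() -> AsyncIterator[str]:
        # Stand-in for the text chunks produced by a running turn.
        for chunk in ("Updated ", "3 ", "pages."):
            yield chunk

    sent: list[str] = []

    async def emit(delta: str) -> None:
        # Stand-in for the bundle's communicator call.
        sent.append(delta)

    final_text = await forward_and_collect(fake_deltas(), emit)
    # Placeholder values; only the four field names come from the text above.
    record = {
        "final_text": final_text,
        "model": "example-model",
        "usage": {"input": 0, "output": 0},
        "cost_usd": 0.0,
    }
    return sent, record
```

Running `asyncio.run(demo())` returns the list of forwarded deltas together with the record that would be stored for the turn.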