The runtime control plane for AI agents.
Between your agent logic and the systems it touches — constraints enforced, decisions logged, unsafe actions blocked.
How components fit together
Every agent request passes through five layers. Budget caps, rate limits, and tenant boundaries are enforced before any external system is reached. Full semantic action-level enforcement (Policy DSL) is on the roadmap.
1. Agent / Copilot
Any framework or custom Python. Sends a message or tool call into KDCube.
2. Semantic Runtime
Routes requests through the Firewall; manages orchestration and streaming.
3. Execution Firewall
Evaluates budget, rate, and tenant constraints before external calls. Issues allow or deny. Semantic per-action policy enforcement is on the roadmap.
4. Sandbox Execution
Subprocess-isolated executor within Docker Compose. No network, no env vars. Workdir filesystem only.
5. External Systems
DBs, APIs, LLMs. Reached only after Firewall allow.
Decision Log — Every allow/deny decision recorded: timestamp, agent, tenant, action. Runs alongside every layer. Self-hosted on your infrastructure.
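The allow/deny step and the Decision Log can be pictured with a small sketch. This is an illustration only, not KDCube's implementation: the `Firewall` and `Decision` classes, their fields, and the order of the checks are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Decision:
    # Fields mirror the log description: timestamp, agent, tenant, action.
    timestamp: str
    agent: str
    tenant: str
    action: str
    outcome: str      # "allow" or "deny"
    constraint: str   # which check decided the outcome ("none" on allow)


@dataclass
class Firewall:
    # Hypothetical state: remaining spend per tenant and a simple call counter.
    budget_left: dict[str, float]
    rate_limit: int
    calls: dict[str, int] = field(default_factory=dict)
    log: list[Decision] = field(default_factory=list)

    def check(self, agent, tenant, action, cost, resource_tenant):
        """Evaluate tenant, rate, and budget constraints; record the decision."""
        if resource_tenant != tenant:
            outcome, constraint = "deny", "tenant"
        elif self.calls.get(tenant, 0) >= self.rate_limit:
            outcome, constraint = "deny", "rate"
        elif cost > self.budget_left.get(tenant, 0.0):
            outcome, constraint = "deny", "budget"
        else:
            outcome, constraint = "allow", "none"
            self.calls[tenant] = self.calls.get(tenant, 0) + 1
            self.budget_left[tenant] -= cost
        # Every decision is logged, whether it is an allow or a deny.
        self.log.append(Decision(
            datetime.now(timezone.utc).isoformat(),
            agent, tenant, action, outcome, constraint,
        ))
        return outcome == "allow"


fw = Firewall(budget_left={"acme": 1.0}, rate_limit=2)
results = [
    fw.check("agent-1", "acme", "db.query", 0.4, "acme"),    # within all limits
    fw.check("agent-1", "acme", "db.query", 0.4, "globex"),  # crosses tenant boundary
    fw.check("agent-1", "acme", "llm.call", 0.9, "acme"),    # exceeds remaining budget
    fw.check("agent-1", "acme", "db.query", 0.4, "acme"),    # second allowed call
    fw.check("agent-1", "acme", "db.query", 0.1, "acme"),    # rate limit reached
]
```

Only an allow lets the call continue toward an external system; denied calls still leave a log entry naming the constraint that blocked them.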
Six runtime components
Each enforces a specific constraint category.
Semantic Runtime
Orchestration layer: request routing, streaming, and per-tenant context lifecycle.
Execution Firewall
Enforces budget, rate, and tenant constraints before execution; issues allow or deny. Semantic per-action enforcement is on the roadmap.
Sandbox Execution
Subprocess-isolated executor — no network access, no environment variables, workdir filesystem only.
Budget & Cost Controls
Hard spending caps at user, project, and org level — enforced, not just monitored.
Audit Trail
Timestamped log of every decision: agent, tenant, action, constraint, outcome. Self-hosted on your infrastructure.
Auth & Access Control
OIDC identity validation, with a 4-tier user priority queue enforced at runtime.
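To make "enforced, not just monitored" concrete for the Budget & Cost Controls component, here is a minimal sketch of caps checked at user, project, and org level before a charge is accepted. The `Cap` and `BudgetLedger` names, and the all-or-nothing charging rule, are assumptions for illustration rather than KDCube's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Cap:
    limit: float
    spent: float = 0.0

    def fits(self, cost):
        return self.spent + cost <= self.limit


class BudgetLedger:
    """Hypothetical ledger keyed by (level, id), e.g. ("user", "u1")."""

    def __init__(self, caps):
        self.caps = caps

    def charge(self, user, project, org, cost):
        scopes = [("user", user), ("project", project), ("org", org)]
        caps = [self.caps[s] for s in scopes if s in self.caps]
        # Enforced: if any level would exceed its cap, nothing is charged.
        if not all(cap.fits(cost) for cap in caps):
            return False
        for cap in caps:
            cap.spent += cost
        return True


ledger = BudgetLedger({
    ("user", "u1"): Cap(limit=5),
    ("project", "p1"): Cap(limit=8),
    ("org", "o1"): Cap(limit=100),
})
first = ledger.charge("u1", "p1", "o1", 4)   # fits every level
second = ledger.charge("u1", "p1", "o1", 2)  # user cap of 5 would be exceeded
third = ledger.charge("u2", "p1", "o1", 4)   # u2 has no user cap; project reaches 8
fourth = ledger.charge("u2", "p1", "o1", 1)  # project cap of 8 would be exceeded
```

Checking every level before recording any spend means a rejected request never leaves a partial charge behind.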
Drop-in integration
No rewrite required. Route your existing agent logic through the KDCube runtime.
REST API
Synchronous request/response.
POST /api/v1/chat/send
✅ Available
SSE
Token-by-token streaming over standard HTTP.
GET /api/v1/chat/stream
✅ Available
Socket.IO
Full-duplex real-time streaming.
ws://your-host:8000
✅ Available
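A rough client-side sketch for the REST and SSE endpoints listed above, using only the Python standard library. The endpoint paths come from this page; the base URL, the JSON field names (`message`, `session_id`), and the idea that stream events carry text in `data:` lines are assumptions, so check the actual API reference before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://your-host:8000"  # assumption: same host and port as the Socket.IO example


def build_send_request(message, session_id):
    """Build (but do not send) a POST to /api/v1/chat/send.

    The payload shape is hypothetical; only the path is taken from the page.
    """
    body = json.dumps({"message": message, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v1/chat/send",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_sse_data(lines):
    """Yield the data payload of each event from an iterable of SSE text lines.

    Follows the standard SSE wire format: "data:" lines accumulate,
    a blank line ends the event, and lines starting with ":" are comments.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment or keep-alive
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            buffer.append(value[1:] if value.startswith(" ") else value)


request = build_send_request("hello", "session-1")
chunks = list(iter_sse_data(["data: Hel\n", "\n", ": ping\n", "data: lo\n", "data: !\n", "\n"]))
```

To send the request, pass it to `urllib.request.urlopen`; to consume a live stream, decode the response lines from `GET /api/v1/chat/stream` and feed them to `iter_sse_data`.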
Self-hosted: Agent traffic stays on your infrastructure. You control which LLMs, databases, and APIs agents can reach.
Not a framework replacement. LangGraph, LangChain, CrewAI, AutoGen, or custom Python — route through KDCube without modification.
KDCube vs. alternatives
Complementary to your orchestration layer — fills the enforcement gap every other tool leaves open.
Execution Runtime
Agent code runs in the Executor — isolated, no network, no env vars, workdir-only. Tool calls go to the Supervisor, which intercepts via socket proxy and enforces policy before forwarding.
Executor (isolated)
- Runs agent-generated code
- No network, no env vars
- Filesystem: workdir only
- Cannot call tools directly
Supervisor (trusted tools)
- All tool dispatch routes here
- Socket-proxy interception, policy enforced
- Internet access limited by policy
- Returns string, bytes, or structured output
Guarantee: The executor never calls tools directly. Every tool call is intercepted and proxied through the supervisor.
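The Executor/Supervisor split can be sketched in a few lines. This is a simplified in-process model, not the real mechanism: KDCube describes subprocess isolation and a socket proxy, whereas here `exec` and a plain object stand in for both, so the sketch shows the control flow and provides no actual isolation. All class and function names are hypothetical.

```python
class PolicyDenied(Exception):
    """Raised when a tool call is not permitted by policy."""


class Supervisor:
    """Trusted side: owns the tools and checks policy before dispatch."""

    def __init__(self, tools, allowed):
        self._tools = tools
        self._allowed = set(allowed)

    def dispatch(self, name, **kwargs):
        if name not in self._allowed:
            raise PolicyDenied(name)
        return self._tools[name](**kwargs)


class ToolProxy:
    """The only tool handle handed to agent code; it forwards every call."""

    def __init__(self, dispatch):
        self._dispatch = dispatch

    def call(self, name, **kwargs):
        return self._dispatch(name, **kwargs)


def run_in_executor(agent_code, proxy):
    # Stand-in for the isolated Executor: agent code sees only the proxy.
    scope = {"tools": proxy}
    exec(agent_code, scope)
    return scope.get("result")


supervisor = Supervisor(
    tools={"add": lambda a, b: a + b, "fetch": lambda url: "response body"},
    allowed={"add"},
)
proxy = ToolProxy(supervisor.dispatch)

allowed_result = run_in_executor("result = tools.call('add', a=2, b=3)", proxy)

try:
    run_in_executor("result = tools.call('fetch', url='https://example.com')", proxy)
    denied = False
except PolicyDenied:
    denied = True
```

The point of the design is that the tool implementations live only on the Supervisor side, so the policy check cannot be skipped by the code that requests the call.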
Context Lifecycle & Timeline
Ordered event timeline with shifting cache points, flagged-not-deleted history, and on-demand artifact re-fetch — keeps context windows lean at scale.
Timeline
Validated event sequence; failed steps roll back.
Cache points
Shift forward as conversation grows; preserve context efficiently.
Hidden items
Flagged, not deleted. Replaced with summary text; re-fetchable on demand.
Lazy fetching: Fetch artifacts by name on demand, or pre-pull via scan. Both keep the active context window lean.
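A minimal sketch of "flagged, not deleted" with on-demand re-fetch. The `Timeline` class below and its method names are hypothetical; it only illustrates how a hidden item can drop out of the active context while its full content stays retrievable by name.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    content: str
    hidden: bool = False
    summary: str = ""


class Timeline:
    def __init__(self):
        self._events = []

    def append(self, name, content):
        self._events.append(Event(name, content))

    def hide(self, name, summary):
        """Flag an item as hidden and attach summary text; nothing is deleted."""
        for event in self._events:
            if event.name == name:
                event.hidden = True
                event.summary = summary

    def context_window(self):
        """What the model would see: summaries in place of hidden items."""
        return [
            f"[hidden: {e.summary}]" if e.hidden else e.content
            for e in self._events
        ]

    def fetch(self, name):
        """Re-fetch the full content by name, hidden or not."""
        for event in self._events:
            if event.name == name:
                return event.content
        raise KeyError(name)


timeline = Timeline()
timeline.append("report.csv", "a,b\n1,2\n" * 3)
timeline.append("msg-1", "Summarize the report")
timeline.hide("report.csv", "CSV report, 3 rows")
window = timeline.context_window()
full = timeline.fetch("report.csv")
```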
Execution Outputs
Two reporting modes. Both produce canonical artifact shapes that integrate with the timeline.
Contract-based
- Declare expected outputs upfront (path, MIME, description)
- Output verified against contract
- Missing or mismatched outputs are errors
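Contract-based verification might look roughly like the sketch below. The `ExpectedOutput` record follows the three declared fields (path, MIME, description); checking the MIME type by file extension through `mimetypes` is a simplification chosen for the example, not a statement of how KDCube verifies outputs.

```python
import mimetypes
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExpectedOutput:
    path: str
    mime: str
    description: str


def verify_outputs(workdir, contract):
    """Return a list of error strings; an empty list means the contract holds."""
    errors = []
    for expected in contract:
        target = Path(workdir) / expected.path
        if not target.is_file():
            errors.append(f"missing: {expected.path}")
            continue
        # Assumption: MIME is inferred from the file extension only.
        guessed, _ = mimetypes.guess_type(target.name)
        if guessed != expected.mime:
            errors.append(
                f"mime mismatch: {expected.path} "
                f"(expected {expected.mime}, got {guessed})"
            )
    return errors


with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "summary.txt").write_text("done", encoding="utf-8")
    contract = [
        ExpectedOutput("summary.txt", "text/plain", "Run summary"),
        ExpectedOutput("data.json", "application/json", "Extracted records"),
        ExpectedOutput("summary.txt", "application/json", "Deliberately wrong MIME"),
    ]
    errors = verify_outputs(workdir, contract)
```

Here the first entry passes, the second is reported as missing, and the third as a MIME mismatch.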
Side-effect-based
- All created/updated files reported automatically
- Text → string; binary → base64
- Runtime vs. program errors reported separately
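Side-effect-based reporting can be approximated by comparing the workdir before and after a run. In this sketch, change detection by content hash and the `encoding`/`payload` keys are assumptions; the text-to-string and binary-to-base64 rule is the one stated above.

```python
import base64
import hashlib
import tempfile
from pathlib import Path


def snapshot(workdir):
    """Map each file's relative path to a SHA-256 digest of its contents."""
    root = Path(workdir)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def report_changes(workdir, before):
    """Report every file created or updated since the `before` snapshot."""
    root = Path(workdir)
    report = {}
    for rel, digest in sorted(snapshot(root).items()):
        if before.get(rel) == digest:
            continue  # unchanged file, not reported
        data = (root / rel).read_bytes()
        try:
            report[rel] = {"encoding": "text", "payload": data.decode("utf-8")}
        except UnicodeDecodeError:
            # Not valid UTF-8, so treat it as binary and base64-encode it.
            report[rel] = {
                "encoding": "base64",
                "payload": base64.b64encode(data).decode("ascii"),
            }
    return report


with tempfile.TemporaryDirectory() as workdir:
    root = Path(workdir)
    (root / "keep.txt").write_text("unchanged", encoding="utf-8")
    (root / "notes.txt").write_text("v1", encoding="utf-8")
    before = snapshot(root)
    # Simulated side effects of a run: one update, one new binary file.
    (root / "notes.txt").write_text("v2", encoding="utf-8")
    (root / "blob.bin").write_bytes(b"\xff\x00")
    changes = report_changes(root, before)
```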
Artifact shape
Every artifact: name, MIME, size, description, readable payload. Integrates with the timeline; triggers downloadable-artifact UI events.
Runtime Internals
Tool interception, error separation, and artifact reporting — the three enforcement mechanics.
Tool interception & proxying
All tool calls from agent code are intercepted via a socket proxy. The Executor never calls tools directly; the Supervisor enforces policy before forwarding, unconditionally.
Error separation
- Runtime errors — crash, timeout, resource exhaustion
- Program errors — exceptions in agent-generated code with traceback
Both are surfaced in a structured format, so agents can distinguish infrastructure failures from code-level failures.
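One way to picture the runtime/program split is a small subprocess runner. The result dictionary, the `run_agent_code` name, and the heuristic for telling the two classes apart (a Python traceback on stderr versus a timeout or bare non-zero exit) are assumptions for illustration, not KDCube's actual format.

```python
import subprocess
import sys


def run_agent_code(source, timeout_s=5.0):
    """Run code in a child interpreter and classify any failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The infrastructure stopped the run: a runtime error.
        return {"ok": False, "kind": "runtime", "reason": "timeout", "traceback": None}

    if proc.returncode == 0:
        return {"ok": True, "kind": None, "reason": None, "traceback": None}

    if proc.returncode > 0 and "Traceback (most recent call last)" in proc.stderr:
        # The code itself raised: a program error, reported with its traceback.
        return {"ok": False, "kind": "program", "reason": "exception",
                "traceback": proc.stderr}

    # Killed by a signal or exited abnormally without a traceback.
    return {"ok": False, "kind": "runtime",
            "reason": f"exit code {proc.returncode}", "traceback": None}


success = run_agent_code("print('ok')")
program_error = run_agent_code("raise ValueError('bad input')")
timed_out = run_agent_code("import time; time.sleep(10)", timeout_s=0.5)
crashed = run_agent_code("import os; os._exit(3)")
```

An agent receiving `kind == "program"` can try to fix its own code using the traceback, while `kind == "runtime"` points at the environment and is usually a case for retrying or escalating.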
Artifact reporting
Canonical shape: name, MIME, size, description, readable payload. Integrates with the timeline; triggers downloadable-artifact UI events.
Isolation scope: The Executor has zero outbound connections. Network tool calls are permitted only through the Supervisor, and only when explicitly allowed by policy. The Supervisor operates with configurable permissions.