The enforcement layer between AI agents and production systems.
KDCube is a self-hosted agent enforcement runtime that blocks unsafe actions before execution with hard budget caps, tenant isolation, and subprocess sandboxing at the infrastructure layer.
For engineering teams shipping AI agents to production.
- Subprocess-isolated execution: no external network access by default, sensitive env vars stripped. All tool calls are proxied through a UID-verified supervisor, so agent code cannot reach external systems directly.
- Hard budget caps per user, project, and org — enforced outside agent reasoning via ledger-backed reservation. Prompts cannot override them.
- Tenant namespacing across gateway keys, storage, database, and accounting — cross-tenant access blocked at the infrastructure layer before execution.
Governed Execution #
Budget, rate, and tenant constraints checked before any external system is reached. Wrong actions are blocked — not logged after the fact.
Tenant Isolation #
Gateway keys, storage, database, and accounting all scoped by validated tenant. Cross-tenant access blocked before it reaches any backend.
Budget Enforcement #
Hard spending caps per user, project, and org — implemented via ledger-backed reservation. When a limit is hit, execution stops. No prompt overrides.
Self-Hosted & Auditable #
MIT licensed. Runs on your infrastructure. No vendor access to agent traffic. Decision log with every allow/deny stored on your systems.
Frameworks build agents. Observability logs them. Neither enforces what agents are allowed to do. #
Agent frameworks handle orchestration. Observability tools record what happened. Guardrails filter output text. Sandboxes isolate compute. Each solves one piece.
None of them enforce budget limits, tenant boundaries, or tool-call policy before execution. That gap is what blocks enterprise teams from shipping AI to customers.
Subprocess-isolated code execution #
- Isolated subprocess within Docker Compose deployment
- No external network access by default
- Sensitive env vars stripped; minimal filtered environment; workdir filesystem only
✅ Available
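The environment filtering described above can be illustrated with a minimal sketch. This is not KDCube's implementation: `SAFE_ENV_KEYS`, `filtered_env`, and `run_isolated` are hypothetical names, and the sketch only shows env stripping and a workdir as the starting directory. It does not block network access or confine filesystem access, which would need OS-level mechanisms such as containers or network namespaces.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: only these variables are passed to the child process.
SAFE_ENV_KEYS = ("PATH", "LANG", "LC_ALL")


def filtered_env(parent_env: dict[str, str]) -> dict[str, str]:
    """Return a minimal environment containing only allowlisted keys."""
    return {k: v for k, v in parent_env.items() if k in SAFE_ENV_KEYS}


def run_isolated(code: str, workdir: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run Python code in a child process with a filtered env, starting in workdir."""
    return subprocess.run(
        # -I is Python's isolated mode: it ignores PYTHON* variables and the user site dir.
        [sys.executable, "-I", "-c", code],
        env=filtered_env(dict(os.environ)),  # API keys and other secrets are dropped here
        cwd=workdir,                         # starting directory only, not a filesystem jail
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

With this sketch, a child process that reads `os.environ.get("ANTHROPIC_API_KEY")` sees `None` even when the parent has the key set.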
Budget controls and rate limiting #
- Per-user, per-project, and per-org spending caps with multi-tier accounting rollups
- Reservation/commit economics checks at admission with hard-limit enforcement
- Request frequency and token throughput constraints
✅ Available
Multi-tenant isolation patterns #
- Tenant boundary validated on every request
- Cross-tenant DB access blocked at the gateway layer
- Knowledge base and memory access scoped by tenant to preserve isolation boundaries
✅ Available
Observability Tool
Does: Logs and traces agent behavior
Misses: Read-only — records damage after it occurs, cannot block
KDCube
Does: Enforces budget, tenant, and execution constraints before any external system is reached
Outcome: Unsafe actions are blocked before execution, not logged after the fact
KDCube is the control layer between your agent code and production systems.
What KDCube enforces #
Five runtime controls available today for shipping AI agents to customers, plus one roadmap item.
Budget Enforcement #
Hard spending caps at user, project, and org level. Implemented via ledger-backed reservation. When a limit is hit, execution stops. No prompt can override a hard cap.
✅ Available · Prevents runaway LLM spend
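As an illustration of ledger-backed reservation, here is a minimal in-memory sketch. It assumes amounts are tracked in integer cents and that one request is checked against every scope it belongs to (user, project, org). `BudgetLedger`, `BudgetExceeded`, and the scope names are hypothetical, not KDCube's API.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a reservation or commit would push a scope past its hard cap."""


@dataclass
class BudgetLedger:
    caps: dict[str, int]  # scope -> hard cap, in cents
    committed: dict[str, int] = field(default_factory=dict)  # scope -> spend already charged
    holds: dict[str, tuple[tuple[str, ...], int]] = field(default_factory=dict)  # id -> (scopes, amount)
    _counter: int = 0

    def _held(self, scope: str) -> int:
        """Total amount currently reserved (not yet committed) against a scope."""
        return sum(amount for scopes, amount in self.holds.values() if scope in scopes)

    def reserve(self, scopes: list[str], estimate: int) -> str:
        """Hold `estimate` cents against every scope, or raise if any cap would be exceeded."""
        for scope in scopes:
            in_use = self.committed.get(scope, 0) + self._held(scope)
            if in_use + estimate > self.caps[scope]:
                raise BudgetExceeded(f"{scope}: {in_use} + {estimate} > {self.caps[scope]}")
        self._counter += 1
        hold_id = f"hold-{self._counter}"
        self.holds[hold_id] = (tuple(scopes), estimate)
        return hold_id

    def commit(self, hold_id: str, actual: int) -> None:
        """Convert a hold into real spend; the actual cost may not exceed the hold."""
        scopes, amount = self.holds[hold_id]
        if actual > amount:
            raise BudgetExceeded(f"{hold_id}: actual {actual} exceeds reserved {amount}")
        del self.holds[hold_id]
        for scope in scopes:
            self.committed[scope] = self.committed.get(scope, 0) + actual

    def release(self, hold_id: str) -> None:
        """Drop a hold without charging anything (for example, when the call fails)."""
        self.holds.pop(hold_id, None)
```

Because the check runs at reservation time, before any model or tool call, a request that would breach a cap never starts. A production ledger would also need durable storage and atomic updates, which this sketch omits.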
Tenant Isolation #
Every request scoped to a validated tenant. Gateway keys, storage paths, database queries, and cost accounting are all namespaced. Cross-tenant access blocked before it reaches any backend.
✅ Available · Prevents cross-tenant data leakage
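One way to picture tenant namespacing is a pair of helpers that derive every cache key and storage path from a validated tenant ID. The sketch below is an assumption-laden illustration: the ID pattern, the `tenant:` key prefix, and the function names are hypothetical and do not reflect KDCube's actual schema.

```python
import posixpath
import re

# Hypothetical tenant ID format: lowercase letters, digits, "_" and "-", 1 to 63 chars.
_TENANT_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,62}")


class TenantError(Exception):
    """Raised when a tenant ID is invalid or a path escapes its tenant namespace."""


def validate_tenant(tenant_id: str, known_tenants: set[str]) -> str:
    """Accept only well-formed IDs that belong to a registered tenant."""
    if not _TENANT_RE.fullmatch(tenant_id) or tenant_id not in known_tenants:
        raise TenantError(f"unknown or malformed tenant: {tenant_id!r}")
    return tenant_id


def cache_key(tenant_id: str, *parts: str) -> str:
    """Build a key that always carries the tenant prefix."""
    return ":".join(("tenant", tenant_id, *parts))


def storage_path(root: str, tenant_id: str, relative: str) -> str:
    """Resolve a path under the tenant's directory and reject any escape attempt."""
    base = posixpath.join(root, tenant_id)
    full = posixpath.normpath(posixpath.join(base, relative))
    if full != base and not full.startswith(base + "/"):
        raise TenantError(f"path escapes tenant namespace: {relative!r}")
    return full
```

A request for `../globex/secrets.txt` under tenant `acme` normalizes to a path outside `/data/acme` and is rejected before any storage backend is touched.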
Subprocess Isolation #
Code executes in an isolated process with no external network access by default, sensitive env vars stripped, and filesystem access limited to the workdir. All tool calls route through a supervisor proxy over a Unix socket, which verifies the caller's UID via SO_PEERCRED.
✅ Available · Prevents data exfiltration
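For readers unfamiliar with SO_PEERCRED, the sketch below shows how a process on Linux can ask the kernel for the credentials of whoever is on the other end of a Unix socket. It is a generic illustration of the mechanism, not KDCube's supervisor code. `peer_credentials` and `is_trusted_peer` are hypothetical names, and the option is Linux-specific.

```python
import socket
import struct

# Layout of Linux's struct ucred: pid_t pid; uid_t uid; gid_t gid.
_UCRED_FORMAT = "iII"


def peer_credentials(conn: socket.socket) -> tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer on a connected Unix socket (Linux only)."""
    raw = conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(_UCRED_FORMAT)
    )
    return struct.unpack(_UCRED_FORMAT, raw)


def is_trusted_peer(conn: socket.socket, allowed_uids: set[int]) -> bool:
    """Accept the connection only if the kernel-reported UID is on the allowlist."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid in allowed_uids
```

The credentials come from the kernel rather than from the message payload, so a caller cannot claim a different UID by crafting its request.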
Rate Limiting #
Per-user request frequency and token throughput constraints enforced in Redis. Burst and hourly windows. Rate-limit events emitted and logged.
✅ Available · Prevents API abuse
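The burst and hourly windows can be sketched as a sliding-window limiter. To stay self-contained, this example keeps timestamps in process memory rather than Redis and covers only request frequency, not token throughput. The class name, the limits, and passing `now` explicitly are all assumptions made for illustration.

```python
from collections import defaultdict, deque

HOUR_S = 3600.0


class RateLimiter:
    """Sliding-window limiter with a short burst window and an hourly window."""

    def __init__(self, burst_limit: int, burst_window_s: float, hourly_limit: int) -> None:
        # Assumes burst_window_s <= HOUR_S so hourly pruning never drops burst data.
        self.windows = ((burst_window_s, burst_limit), (HOUR_S, hourly_limit))
        self.events: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, now: float) -> bool:
        """Record and allow the request unless either window is already full."""
        q = self.events[user_id]
        while q and now - q[0] >= HOUR_S:  # forget requests older than one hour
            q.popleft()
        for window_s, limit in self.windows:
            recent = sum(1 for t in q if now - t < window_s)
            if recent >= limit:
                return False  # denied requests are not recorded
        q.append(now)
        return True
```

In a multi-process deployment the per-user timestamps would live in shared storage such as Redis, with the check and the write performed atomically.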
Decision Logging #
Every enforcement decision — budget check, rate limit, tenant validation — timestamped and stored. Monitoring API exposes circuit breaker state, queue stats, and throttling events.
✅ Available · Provides audit evidence
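A decision log of this kind can be pictured as one timestamped record per check, appended as a JSON line. The field names and the `record_decision` helper below are hypothetical. They sketch the idea of an append-only allow/deny trail rather than KDCube's actual log format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import TextIO


@dataclass(frozen=True)
class Decision:
    timestamp: str  # ISO 8601, UTC
    tenant: str
    user: str
    check: str      # e.g. "budget", "rate_limit", "tenant"
    outcome: str    # "allow" or "deny"
    reason: str


def record_decision(
    sink: TextIO, *, tenant: str, user: str, check: str, allowed: bool, reason: str = ""
) -> Decision:
    """Append one enforcement decision to the sink as a single JSON line."""
    decision = Decision(
        timestamp=datetime.now(timezone.utc).isoformat(),
        tenant=tenant,
        user=user,
        check=check,
        outcome="allow" if allowed else "deny",
        reason=reason,
    )
    sink.write(json.dumps(asdict(decision), sort_keys=True) + "\n")
    return decision
```

Writing both allows and denies, rather than only failures, is what lets the log serve as audit evidence for what an agent was permitted to do.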
Policy DSL & Approval Gates (Roadmap) #
Declarative constraint definitions per agent role. Workflow invariants. Cross-agent approval gates. Deterministic pre-execution enforcement.
🔮 Roadmap
Built for teams shipping AI to customers #
B2B SaaS with Embedded AI
You're adding AI capabilities to your product. Your customers expect tenant isolation, cost controls, and auditability. KDCube provides the runtime controls your security review requires — without building them yourself.
CRM Access Boundaries
An agent assists sales reps. A prompt injection attempt asks the agent to retrieve data from a different tenant. KDCube enforces tenant boundaries at the gateway layer, so cross-tenant access is blocked before it reaches the database or storage layers.
Platform Engineering Teams
You're building internal AI infrastructure for multiple product teams. KDCube gives each team tenant-scoped execution with centralized budget controls and monitoring.
Data Boundary Enforcement
KDCube restricts outbound API calls to an approved allowlist. Unapproved endpoints are blocked, and execution has no external network access by default.
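An outbound allowlist check can be as small as comparing the parsed hostname of each request against an approved set. The sketch below is an illustration under assumptions: the allowlisted host, the HTTPS-only rule, and the function name are hypothetical, not KDCube configuration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = frozenset({"api.anthropic.com"})


def is_allowed(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # Compare the parsed hostname, not a substring of the URL, so lookalike
    # domains and credentials-in-URL tricks do not match.
    return (parts.hostname or "") in allowed_hosts
```

Matching on the exact parsed hostname means `https://api.anthropic.com.evil.example/` and `https://api.anthropic.com@evil.example/` are both rejected.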
Cost Containment
An agent runs complex analytical tasks. Left unconstrained, a single poorly-scoped prompt can trigger thousands of dollars in LLM token consumption. KDCube enforces per-user, per-project, and per-org spending caps. Hard limits cannot be exceeded.
Secure Code Execution
KDCube runs agent-generated code in an isolated subprocess with no external network access and only a minimal, filtered environment by default.
Why KDCube #
What sets the platform apart.
Full stack #
UI + backend + SDK + ops tooling shipped together. One cohesive platform, not a stitched pipeline.
Agent-first #
Built around agent execution, not chat completion. Bring any framework — no tool-calling lock-in.
Multi-tenant #
Isolation enforced across request routing, storage, and economic accounting. Serve multiple customers from one deployment.
Provenance by default #
Every source, tool call, and citation is tracked in the timeline. Perplexity-style traceability built in.
Channeled streaming #
SSE/Socket.IO fan-out with typed event channels. Live bundle UIs and role-based event filtering.
Feedback-aware #
User and system feedback events captured in the timeline for evaluation and model improvement loops.
Self-hosted. Open source. MIT licensed. #
Deploy on your infrastructure. No vendor access to your agent traffic. No per-seat fees. Infrastructure costs only.
Quick start #
git clone https://github.com/kdcube/kdcube-ai-app
cd kdcube-ai-app/app/ai-app/deployment/docker/all_in_one
# Setup
mkdir -p data/{kdcube-storage,exec-workspace,postgres,redis}
docker build -t py-code-exec:latest -f Dockerfile_Exec ../../..
docker compose build postgres-setup
docker compose run --rm postgres-setup
# Configure: add ANTHROPIC_API_KEY to .env.backend
docker compose build && docker compose up -d
# Runtime ready at localhost:8000
CLI installer: pipx install kdcube-apps-cli && kdcube-apps-cli
What you get #
- Deploy in under 30 minutes — Docker Compose, single command
- Predictable pricing model — no per-seat fees, no usage-based platform fees, and full self-hosted cost control
- Your data stays on your infrastructure — no calls home, no telemetry you did not configure
No usage-based fees. No per-seat pricing. The runtime is free (MIT). You pay only for your infrastructure.
Where we are headed #
The current runtime already provides economic controls, isolation, and observability signals (monitoring endpoints, queue pressure, circuit-breaker state) used for scaling decisions. The following items are not yet shipped.
Policy DSL (Roadmap)
Declare agent permissions in a human-readable policy language. Define what actions, data scopes, and spend limits are permitted per agent role.
Deterministic Enforcement Engine (Roadmap)
Pre-execution evaluation that produces a verifiable allow/deny decision before any tool call fires, with no probabilistic components.
Workflow Invariants (Roadmap)
Declare required steps in a workflow. Prevent agents from skipping approval gates, notification steps, or compliance checkpoints.
Cross-Agent Approval Gates (Roadmap)
Require a second agent, human-in-the-loop confirmation, or external approval before high-impact actions execute.
Follow progress: github.com/kdcube/kdcube-ai-app
KDCube vs. the alternatives #
KDCube is complementary to agent frameworks and observability tools. It adds the enforcement layer they don't provide.
| Category | What it does | What it does NOT do |
|---|---|---|
| Infrastructure sandbox (Docker, gVisor) | Isolates compute environment | No knowledge of agent intent; no budget, tenant, or semantic constraints |
| Guardrails wrapper (NeMo Guardrails, etc.) | Filters LLM output text | Does not intercept tool calls or API actions; fires after generation |
| Observability tool (LangSmith, Helicone) | Logs and traces agent behavior | Read-only; cannot block actions; logs damage after it occurs |
| Agent framework (LangGraph, LangChain, CrewAI) | Orchestrates agent logic and tool calling | No enforcement layer; delegates execution control to the agent |
| Managed platform (Bedrock Agents, Vertex AI) | Hosted agent execution with vendor governance | Vendor lock-in; data leaves your infrastructure; limited customization |
| KDCube | Enforces budget, tenant, and execution constraints before external calls. Self-hosted. Open source. | Not a framework, not an LLM proxy, not a log aggregator |
Detailed feature comparison matrix:
| Feature | KDCube (self-hosted runtime) | LangGraph Platform (stateful graph execution) | CrewAI (multi-agent orchestration) | AutoGen / AG2 (Microsoft multi-agent) | OpenAI Assistants (cloud-hosted runtime) | AWS Bedrock Agents (cloud-hosted runtime) | Vertex AI Agents (GCP cloud runtime) | AgentScope Runtime (distributed multi-agent) |
|---|---|---|---|---|---|---|---|---|
| Pre-execution policy gate | ✅ | ✗ | ✗ | ✗ | ✗ | Partial | Partial | ✗ |
| Per-tenant budget caps & rate limits | ✅ | ✗ | ✗ | ✗ | Partial | ✗ | ✗ | ✗ |
| Tenant boundary isolation | ✅ | Partial | ✗ | ✗ | ✗ | Partial | Partial | ✗ |
| Subprocess / sandbox isolation | ✅ | ✗ | ✗ | Partial | Partial | Partial | Partial | Partial |
| Audit trail & decision logging | ✅ | Partial | ✗ | ✗ | Partial | ✅ | ✅ | Partial |
| Real-time streaming (SSE + WebSocket) | ✅ | Partial | Partial | Partial | ✅ | Partial | Partial | Partial |
| Self-hosted / on-premises | ✅ | Partial | ✅ | ✅ | ✗ | ✗ | ✗ | ✅ |
| Open-source / auditable | ✅ | ✅ | ✅ | ✅ | ✗ | ✗ | ✗ | ✅ |
| Multi-protocol clients (REST + SSE + WS) | ✅ | ✗ | ✗ | ✗ | Partial | ✗ | ✗ | ✗ |
| Built-in knowledge base / RAG | ✅ | ✗ | Partial | Partial | ✅ | ✅ | ✅ | Partial |
| Token / cost accounting per tenant | ✅ | ✗ | ✗ | ✗ | Partial | Partial | Partial | ✗ |
| Multi-model routing (OpenAI + Anthropic + Gemini) | ✅ | ✅ | ✅ | ✅ | ✗ | Partial | Partial | ✅ |
| MCP tool integration | ✅ | Partial | Partial | Partial | ✗ | ✗ | ✗ | ❓ |
| Citation / provenance tracking | ✅ | ✗ | ✗ | ✗ | Partial | Partial | Partial | ✗ |
| Structured feedback & quality signals (user reactions + machine Gate Agent, confidence ≥ 0.70) | ✅ | ✗ | ✗ | ✗ | Partial | ✗ | ✗ | ✗ |
| Skills system (Anthropic-compatible SKILL.md, namespaced reusable behaviors) | ✅ | ✗ | Partial | Partial | Partial | ✗ | ✗ | ✗ |
| Hot-loadable bundle plugins (no platform restart required) | ✅ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Channeled multi-stream output (thinking / answer / followup channels over single token stream) | ✅ | ✗ | ✗ | ✗ | Partial | ✗ | ✗ | ✗ |
We're looking for design partners shipping customer-facing AI. If you're building multi-tenant AI agents and need runtime controls, we'd like to talk.
MIT Licensed · Self-Hosted · Open Source