KDCube — safer way to build AI

Empower your customers with an AI assistant.
A self-hosted, multi-tenant platform with an SDK for building trustworthy AI assistants, copilots, and agentic apps with control, provenance, and auditability.

ReAct v2 Timeline-First Agent — structured turn memory with provenance by default
Multi-tenant by design — tenant isolation enforced across gateway, storage, and budget accounting
Channeled streaming + live widgets — SSE/Socket.IO fan-out with dynamic bundle UIs

For architecture and control details, see Docs and Security.

Get Started on GitHub Schedule demo

See Product Overview → See Runtime Demo →

Star on GitHub • MIT License • Self-Hosted • Open Source

Agent-First #

ReAct v2 Timeline-First Agent is the KDCube signature. No tool-calling framework lock-in — bring your own tools or use the SDK.

Economic Accounting #

Usage, budgets, and rate limits tracked per user, project, and organization. Hard limits enforced outside prompt logic.

Multi-Tenant Platform #

Tenant separation enforced across request routing, storage, and accounting. Host copilots for multiple customers on one stack.

Self-Hosted & Open Source #

Docker Compose quickstart. Deploy on EC2, ECS/Fargate, or Kubernetes. MIT License. No vendor data path required.

KDCube: When infrastructure sandboxing is not enough #

Compute isolation is necessary, but companies also need policy, spend, and tenant controls before actions are executed.

Infrastructure sandboxing

Good at isolating compute and limiting process-level access.

Not enough for business controls like approval policy, tenant scope, and cost containment.

What KDCube adds

Action mediation: privileged operations pass through controlled runtime boundaries.
Cost governance: spend and rate controls are enforced independently of prompt output.
Tenant enforcement: request and data access paths are scoped by tenant and project.
Auditability: decisions are traceable on infrastructure you control.

The enterprise risk is policy failure, not only process escape. KDCube focuses on preventing unsafe or out-of-scope actions before they execute.

How KDCube works #

Admit, enforce, execute, and audit. Implementation details are in the Docs.

Subprocess-isolated code execution #

Isolated subprocess within the Docker Compose deployment
No external network access by default
Sensitive env vars stripped; minimal filtered environment; workdir filesystem only

✅ Available
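As a rough illustration of the environment and workdir restrictions listed above, the following is a minimal sketch and not KDCube's actual executor. The allowlist `SAFE_ENV_KEYS` and the helpers `filtered_env` and `run_untrusted` are hypothetical names. Network isolation is assumed to be handled at the container level and is not shown.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical allowlist: only these variables reach the child process.
SAFE_ENV_KEYS = ("PATH", "LANG", "LC_ALL", "SYSTEMROOT")


def filtered_env(parent_env):
    """Return a minimal environment containing only allowlisted keys."""
    return {k: v for k, v in parent_env.items() if k in SAFE_ENV_KEYS}


def run_untrusted(code, timeout_s=5.0):
    """Run a Python snippet in a child process with a filtered environment.

    The child starts in a throwaway working directory that is deleted
    when the call returns.
    """
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            # -I: isolated mode, ignores PYTHON* variables and the user site dir
            [sys.executable, "-I", "-c", code],
            env=filtered_env(os.environ),
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
```

Calling `run_untrusted("import os; print(sorted(os.environ))")` lists only the variables that survived the filter, which is a quick way to check what the child can see.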

Budget controls and rate limiting #

Per-user, per-project, and per-org spending caps with multi-tier accounting rollups
Reservation/commit economics checks at admission with hard-limit enforcement
Request frequency and token throughput constraints

✅ Available
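The reservation/commit pattern can be sketched as follows. This is an illustrative model under stated assumptions, not KDCube's accounting code: the `BudgetLedger` class, the scope keys such as `"user:alice"`, and the use of integer cents are all hypothetical. An estimate is reserved against every tier at admission, and the actual cost replaces it once the request finishes.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised at admission when a reservation would cross a hard limit."""


@dataclass
class Account:
    limit_cents: int
    committed_cents: int = 0
    reserved_cents: int = 0

    def available(self):
        return self.limit_cents - self.committed_cents - self.reserved_cents


class BudgetLedger:
    def __init__(self, limits_cents):
        # Maps a scope key (user, project, or org) to its hard cap in cents.
        self.accounts = {s: Account(limit) for s, limit in limits_cents.items()}

    def reserve(self, scopes, estimate_cents):
        # Check every tier first so a rejection leaves no partial reservation.
        for scope in scopes:
            if self.accounts[scope].available() < estimate_cents:
                raise BudgetExceeded(scope)
        for scope in scopes:
            self.accounts[scope].reserved_cents += estimate_cents

    def commit(self, scopes, estimate_cents, actual_cents):
        # Release the reservation and record what was actually spent.
        for scope in scopes:
            account = self.accounts[scope]
            account.reserved_cents -= estimate_cents
            account.committed_cents += actual_cents
```

Because the check runs before any model call, the limit holds regardless of what the prompt or the model output says.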

Multi-tenant isolation patterns #

Tenant boundary validated on every request
Cross-tenant DB access blocked at the gateway layer
Knowledge base scoping per tenant (validate KB isolation boundaries during setup)

✅ Available
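A per-request tenant check of the kind listed above might look like this. It is a simplified sketch: `RequestContext`, `TenantBoundaryError`, and `assert_in_scope` are hypothetical names, and a real gateway would derive the context from authenticated credentials rather than accept it as plain arguments.

```python
from dataclasses import dataclass


class TenantBoundaryError(PermissionError):
    """Raised when a request reaches outside its tenant or project scope."""


@dataclass(frozen=True)
class RequestContext:
    tenant_id: str
    project_id: str


def assert_in_scope(ctx, resource_tenant, resource_project):
    """Reject the request before it reaches storage if scopes do not match."""
    if ctx.tenant_id != resource_tenant:
        raise TenantBoundaryError(
            f"tenant {ctx.tenant_id!r} cannot access tenant {resource_tenant!r}"
        )
    if ctx.project_id != resource_project:
        raise TenantBoundaryError(
            f"project {ctx.project_id!r} cannot access project {resource_project!r}"
        )
```

Running the check in the gateway, ahead of any query, means a bug in downstream query construction cannot widen access on its own.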

Streaming runtime with tool orchestration #

REST, SSE (Server-Sent Events), Socket.IO
Token-by-token streaming responses
Composable Skills and tool namespaces (local + MCP) via dynamic bundles

✅ Available

Decision logging (request/response) #

Every allow/block decision timestamped
Self-hosted on your infrastructure
Control plane monitoring dashboard included

✅ Available
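To make the idea of a timestamped allow/block record concrete, here is one possible shape for a log entry, written as a JSON line. The field names and the `decision_record` helper are assumptions for illustration and do not reflect KDCube's actual log schema.

```python
import json
from datetime import datetime, timezone


def decision_record(tenant_id, action, allowed, reason, now=None):
    """Serialize one allow/block decision as a single JSON line."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {
            "ts": timestamp,
            "tenant_id": tenant_id,
            "action": action,
            "decision": "allow" if allowed else "block",
            "reason": reason,
        },
        sort_keys=True,
    )
```

One record per line keeps the log append-only and easy to ship to whatever storage you already operate.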

Policy DSL & deterministic enforcement (Roadmap) #

Declarative constraint definitions per agent role
Verifiable allow/deny before any tool call fires
Workflow invariants and cross-agent approval gates

🔮 Roadmap

Where runtime enforcement matters #

Enterprise scenarios where pre-execution control is required.

Refund Authorization

KDCube enforces a maximum refund per transaction. Requests above the limit require human approval.

$500 hard cap enforced
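The refund rule can be expressed as a small pre-execution check. The sketch below is illustrative only: `check_refund`, `Verdict`, and `REFUND_CAP` are hypothetical names, and the $500 figure is taken from the scenario above.

```python
from decimal import Decimal
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


# Cap from the scenario above; Decimal avoids float rounding on money.
REFUND_CAP = Decimal("500.00")


def check_refund(amount, approved_by_human=False):
    """Decide on a refund before the tool call that issues it runs."""
    if amount <= 0:
        return Verdict.BLOCK
    if amount <= REFUND_CAP:
        return Verdict.ALLOW
    # Above the cap, only an explicit human approval lets the action through.
    return Verdict.ALLOW if approved_by_human else Verdict.NEEDS_APPROVAL
```

The check takes no input from the model other than the requested amount, so prompt content cannot raise the cap.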

CRM Access Boundaries

KDCube enforces tenant boundaries at the gateway layer, blocking cross-tenant access before it reaches the database.

Zero cross-tenant leaks

Approval Workflows

Agents can draft contracts, while approval-gated final actions remain controlled. Explicit workflow-step invariant gating is on the roadmap.

Enforced approval gates (Roadmap)

Data Boundary Enforcement

KDCube restricts outbound API calls to an approved allowlist. Unapproved endpoints are blocked, and execution has no external network access by default.

Allowlist-only outbound calls
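An allowlist check for outbound calls could be sketched as below. The hosts in `ALLOWED_HOSTS` are placeholders and `is_allowed` is a hypothetical helper; the HTTPS-only rule is an added assumption, not something the scenario states.

```python
from urllib.parse import urlsplit

# Placeholder hosts for illustration; a deployment would configure its own.
ALLOWED_HOSTS = frozenset({"api.example.com", "billing.example.com"})


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose exact host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = parts.hostname  # lowercased, without port or userinfo
    return host is not None and host in allowed_hosts
```

Matching the parsed hostname exactly, rather than searching the URL string, avoids lookalike hosts such as `api.example.com.evil.test`.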

Cost Containment

KDCube enforces per-user, per-project, and per-org spending caps so hard limits cannot be exceeded.

Potential 60–80% infra savings in self-hosted scenarios*

Secure Code Execution

KDCube runs agent-generated code in an isolated subprocess with no external network access by default and a minimal, filtered environment from which sensitive variables are stripped.

Zero network egress by default

*Estimate range depends on workload profile, model mix, traffic shape, and infrastructure/operations choices.

Why KDCube #

Seven differentiators that define the platform — sourced from README Highlights.

Full stack #

UI + backend + SDK + ops tooling shipped together. One cohesive platform, not a stitched pipeline.

Agent-first #

ReAct v2 Timeline-First Agent is the KDCube signature. No tool-calling framework lock-in.

Multi-tenant #

Isolation enforced across request routing, storage, and economic accounting. Host copilots for multiple customers.

Provenance by default #

Every source, tool call, and citation is tracked in the timeline. Perplexity-style traceability is built in.

Channeled streaming #

SSE/Socket.IO fan-out with typed event channels. Live bundle UIs and role-based event filtering.
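Typed channels with role-based filtering can be pictured with the sketch below, which formats events as Server-Sent Events frames. The channel names follow the thinking, answer, and followup channels mentioned elsewhere on this page. The `CHANNEL_ROLES` mapping and both helper functions are assumptions for illustration, not the KDCube SDK.

```python
import json

# Hypothetical visibility rules: which roles may receive each channel.
CHANNEL_ROLES = {
    "thinking": {"admin"},
    "answer": {"user", "admin"},
    "followup": {"user", "admin"},
}


def sse_frame(channel, payload):
    """Format one event as an SSE frame, using the channel as the event type."""
    return f"event: {channel}\ndata: {json.dumps(payload)}\n\n"


def frames_for_role(events, role):
    """Yield SSE frames for the channels the given role is allowed to see."""
    for channel, payload in events:
        if role in CHANNEL_ROLES.get(channel, set()):
            yield sse_frame(channel, payload)
```

A browser client can then attach one `EventSource` listener per event type and render each channel in its own part of the UI.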

Feedback-aware #

User and system feedback events captured in the timeline for evaluation and model improvement loops.

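Open-source and self-hosted #

MIT licensed and open source on GitHub. Runs on infrastructure you control, with no vendor data path required.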

Fast evaluation path #

Use the all-in-one deployment to validate fit quickly, then move to managed infra patterns for production.

Typical first environment in under an hour
Clear migration path to production topology
Full setup steps in Deployment Model

What you get #

Business-safe defaults for spend, tenant separation, and controlled execution
Deployment flexibility across VPS, Docker Compose, and Kubernetes
Data stays in your environment by default

KDCube is MIT licensed and fully open source on GitHub. Deploy on your own infrastructure and operate with your own controls.

No runtime license fee. Your costs are infrastructure and model usage, which vary by deployment size and provider.

View on GitHub →

Where we are headed #

The current runtime already provides economic controls, isolation, and observability signals (monitoring endpoints, queue pressure, circuit-breaker state) used for scaling decisions. The following items are additional roadmap capabilities and are not yet shipped.

Policy DSL (Roadmap)

Declare agent permissions in a human-readable policy language. Define what actions, data scopes, and spend limits are permitted per agent role.

Deterministic Enforcement Engine (Roadmap)

Pre-execution evaluation that produces a verifiable allow/deny decision before any tool call fires, with no probabilistic components.

Workflow Invariants (Roadmap)

Declare required steps in a workflow. Prevent agents from skipping approval gates, notification steps, or compliance checkpoints.

Cross-Agent Approval Gates (Roadmap)

Require a second agent, human-in-the-loop confirmation, or external approval before high-impact actions execute.

Follow progress: github.com/kdcube/kdcube-ai-app

KDCube vs. the Alternatives #

Among the runtimes compared below, KDCube is the only self-hosted agentic runtime with built-in multi-tenant policy enforcement, per-tenant economics, and dual-protocol streaming, with no cloud lock-in required.

Feature comparison: KDCube vs. LangGraph Platform vs. CrewAI vs. AutoGen vs. OpenAI Assistants API vs. AWS Bedrock Agents vs. Vertex AI Agent Engine vs. AgentScope Runtime (March 2026). ✅ = native built-in; Partial = partially available or cloud-only; ✗ = not natively available; ❓ = unclear.
Feature	KDCube (self-hosted runtime)	LangGraph Platform (stateful graph execution)	CrewAI (multi-agent orchestration)	AutoGen / AG2 (Microsoft multi-agent)	OpenAI Assistants API (cloud-hosted runtime)	AWS Bedrock Agents (cloud-hosted runtime)	Vertex AI Agent Engine (GCP cloud runtime)	AgentScope Runtime (distributed multi-agent)
Pre-execution policy gate	✅	✗	✗	✗	✗	Partial	Partial	✗
Per-tenant budget caps & rate limits	✅	✗	✗	✗	Partial	✗	✗	✗
Tenant boundary isolation	✅	Partial	✗	✗	✗	Partial	Partial	✗
Subprocess / sandbox isolation	✅	✗	✗	Partial	Partial	Partial	Partial	Partial
Audit trail & decision logging	✅	Partial	✗	✗	Partial	✅	✅	Partial
Real-time streaming (SSE + WebSocket)	✅	Partial	Partial	Partial	✅	Partial	Partial	Partial
Self-hosted / on-premises	✅	Partial	✅	✅	✗	✗	✗	✅
Open-source / auditable	✅	✅	✅	✅	✗	✗	✗	✅
Multi-protocol clients (REST + SSE + WS)	✅	✗	✗	✗	Partial	✗	✗	✗
Built-in knowledge base / RAG	✅	✗	Partial	Partial	✅	✅	✅	Partial
Token / cost accounting per tenant	✅	✗	✗	✗	Partial	Partial	Partial	✗
Multi-model routing (OpenAI + Anthropic + Gemini)	✅	✅	✅	✅	✗	Partial	Partial	✅
MCP tool integration	✅	Partial	Partial	Partial	✗	✗	✗	❓
Citation / provenance tracking	✅	✗	✗	✗	Partial	Partial	Partial	✗
Structured feedback & quality signals (user reactions + machine Gate Agent, confidence ≥ 0.70)	✅	✗	✗	✗	Partial	✗	✗	✗
Skills system (Anthropic-compatible SKILL.md, namespaced reusable behaviors)	✅	✗	Partial	Partial	Partial	✗	✗	✗
Hot-loadable bundle plugins (no platform restart required)	✅	✗	✗	✗	✗	✗	✗	✗
Channeled multi-stream output (thinking / answer / followup channels over single token stream)	✅	✗	✗	✗	Partial	✗	✗	✗

Build AI that doesn't break trust

Deploy runtime controls in under an hour. Review the code, run it in your environment, and evaluate the enforcement layer directly.
