KDCube Agentic Prototyping Platform
Self-hosted platform + SDK for building and operating agentic chat applications and copilots. Production-ready runtime with streaming, tool execution, memory, economics, and enterprise governance—all out of the box.
Introduction
KDCube APP (Agentic Prototyping Platform) is a self-hosted platform and SDK for building and operating agentic chat applications and copilots. It ships the full stack: streaming runtime, tool execution, memory/context, economics, and a hosting platform for multi-tenant production deployment.
What You Get Out of the Box
- Streaming Chat: REST, SSE, and Socket.IO with step, delta, and status events and role-based stream filtering
- Tool Execution: Local tools, MCP tools, and skills (built-in and custom) with easy wiring
- Code Execution: Isolated Docker runtime for generating and executing untrusted code
- ReAct Strategic Solver: Skills acquisition and exploitation, adaptive agent selection, planning, and tool-first/code-first flows
- Web Search: Multi-backend search agent with Brave and DuckDuckGo integration
- Knowledge Base: Ingestion, hybrid search, and citations with pgvector integration
- Memory & Context: Retrieval, turn memories, conversation memories, user-level memories, signals, and reconciliation
- Economics & Accounting: Budgets, rate limits, usage reporting, and cost tracking per tenant/project/user
Framework-Agnostic: Build your workflows in LangGraph, LangChain, CrewAI, AutoGen, or custom Python. KDCube provides the runtime infrastructure to host, scale, and govern them.
Architecture & Components
KDCube APP consists of two main layers: the SDK for building agent applications and the Platform for hosting and scaling them in production.
System Architecture
SDK Components (Build)
Agent Runtime
- ReAct agent patterns
- Planning & orchestration
- Tool-first and code-first flows
- Adaptive agent selection
Streaming Channels
- REST + SSE + Socket.IO
- Token/step events
- Role-based filtering
- Redis Pub/Sub relay
Tools & Skills
- Local tools integration
- MCP tools support
- Built-in and custom skills
- Easy custom wiring
Memory & Context
- Turn memories
- Conversation memories
- User-level memories
- Signals and reconciliation
Code Execution
- Isolated Docker runtime
- Untrusted code support
- Resource limits
- Security sandboxing
Attachments & Artifacts
- File upload handling
- Generated files
- Security checks
- Storage integration (S3)
Economics & Accounting
- Usage tracking
- Budget enforcement
- Rate limits
- Cost reporting
Bundle API
- LangGraph workflows
- LangChain integration
- Custom Python apps
- Dynamic registration
Platform Components (Host & Scale)
Multi-Tenant Isolation
- Per-tenant/project schemas
- Storage segregation
- Namespace separation
- Policy enforcement
Gateway
- Authentication (Cognito/SimpleIDP)
- Rate limiting
- Backpressure control
- Circuit breakers
Knowledge Base
- Document ingestion
- pgvector embeddings
- Hybrid search
- Citation tracking
Dynamic UI Widgets
- Monitoring dashboards
- Control plane UI
- Conversation browser
- Spending reports
Horizontal Scaling
- Stateless web service
- Queue/processor model
- Redis relay fan-out
- Load balancing
Storage Layer
- Postgres (RDS)
- Redis (ElastiCache)
- S3 (artifacts)
- Neo4j (optional)
Bundles: Deployable agent apps. Multiple bundles can be registered and selected per message. One bundle executes per request; different requests can target different bundles.
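The per-message selection described above can be pictured as a lookup keyed by bundle ID. The following is a minimal sketch and not KDCube's registry: `BundleRegistry`, `register`, `resolve`, and the default-bundle fallback are hypothetical names and behaviors chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical handler type: takes the user message, returns a reply.
Handler = Callable[[str], str]


@dataclass
class BundleRegistry:
    """Toy registry: many bundles registered, exactly one chosen per request."""

    bundles: Dict[str, Handler] = field(default_factory=dict)
    default_id: Optional[str] = None

    def register(self, bundle_id: str, handler: Handler, default: bool = False) -> None:
        self.bundles[bundle_id] = handler
        # The first registered bundle becomes the default unless overridden.
        if default or self.default_id is None:
            self.default_id = bundle_id

    def resolve(self, bundle_id: Optional[str]) -> Handler:
        # Fall back to the default bundle when the message names none.
        chosen = bundle_id or self.default_id
        if chosen is None or chosen not in self.bundles:
            raise KeyError(f"unknown bundle: {chosen!r}")
        return self.bundles[chosen]


registry = BundleRegistry()
registry.register("echo", lambda msg: msg)
registry.register("shout", lambda msg: msg.upper())

# Two requests, each executed by exactly one bundle.
reply_a = registry.resolve("shout")("hello")
reply_b = registry.resolve(None)("hello")  # no bundle_id given, so the default ("echo") runs
```

Keeping resolution separate from execution is what lets different requests target different bundles without any shared state between them.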
Key Features
Streaming Runtime
KDCube supports three client transports for maximum flexibility:
- SSE (Server-Sent Events): Primary streaming transport, default in current UI
- Socket.IO: Fully supported alternative with bidirectional communication
- REST: Non-streaming endpoints for profile, admin, and monitoring operations
Streaming Flow
Multi-Tenancy & Storage
Production-grade isolation for enterprise deployments:
- Postgres: Per-tenant and per-project schemas (prod/dev separated) with control_plane schema for policies and quotas
- S3: Bucket per tenant/project or shared bucket with prefix segmentation for artifacts and documents
- Redis: Cache, messaging (Pub/Sub), and rate-limit counters
- Neo4j: Optional graph database (currently disabled)
Economics & Cost Management
Enterprise-grade cost control and reporting:
- Gateway rate limiting: Protect system resources with configurable limits
- Economics rate limiting: Tier policies, per-user quotas, and concurrency locks
- Budget enforcement: Per-tenant, per-project, and per-pipeline budget controls
- Usage tracking: Token-level granularity with real-time dashboards
- Cost reporting: Comprehensive spending analysis and aggregations
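To make budget enforcement and token-level usage tracking concrete, here is a small sketch of a ledger that prices each call and refuses charges that would exceed a hard cap. Everything in it is assumed for illustration: the `BudgetLedger` class, the `(tenant, project)` scope key, and the prices (expressed in integer micro-dollars per token to avoid floating-point drift) are not taken from KDCube.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical prices in micro-dollars per token; real prices depend on the provider.
PRICE_PER_TOKEN = {"input": 1, "output": 3}

Scope = Tuple[str, str]  # (tenant, project)


class BudgetExceeded(Exception):
    """Raised when a charge would push a scope past its hard cap."""


@dataclass
class BudgetLedger:
    limits: Dict[Scope, int]  # hard caps in micro-dollars
    spent: Dict[Scope, int] = field(default_factory=lambda: defaultdict(int))

    def charge(self, scope: Scope, input_tokens: int, output_tokens: int) -> int:
        cost = (input_tokens * PRICE_PER_TOKEN["input"]
                + output_tokens * PRICE_PER_TOKEN["output"])
        limit = self.limits.get(scope, 0)  # unknown scopes get no budget
        if self.spent[scope] + cost > limit:
            raise BudgetExceeded(f"{scope}: {self.spent[scope]} + {cost} > {limit}")
        self.spent[scope] += cost
        return cost


scope = ("acme", "support")
ledger = BudgetLedger(limits={scope: 10_000})
first = ledger.charge(scope, input_tokens=2_000, output_tokens=1_000)   # 5_000
second = ledger.charge(scope, input_tokens=1_000, output_tokens=1_000)  # 4_000
try:
    ledger.charge(scope, input_tokens=1_000, output_tokens=1_000)       # would reach 13_000
    blocked = False
except BudgetExceeded:
    blocked = True
```

Because the rejected charge is never recorded, the running total stays at the last accepted value, which is what a spending report would aggregate.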
Security & Sandboxing
Security controls covering authentication, code execution, and auditing:
- Delegated authentication: ProxyLogin with hosted 2FA support
- Cookie-based auth: Secure token exchange with masked cookies
- Docker isolation: Ephemeral containers for code execution
- Network isolation: Tool calls proxied through supervisor
- Resource limits: CPU, memory, and execution time constraints
- Audit trails: Comprehensive logging for compliance
Knowledge Base & RAG
Production-ready retrieval-augmented generation:
- Document ingestion: Support for multiple file formats (PDF, DOCX, TXT, etc.)
- Hybrid search: Combine vector similarity with keyword search
- pgvector integration: Scalable vector database on Postgres
- Citation tracking: Structured citations with source tracking and in-stream rendering
- Reranking: Improve retrieval quality with reranking models
- Multi-tenant: Isolated knowledge bases per tenant/project
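Hybrid search needs some rule for merging the vector ranking with the keyword ranking. One widely used rule is reciprocal rank fusion (RRF), sketched below. The source does not say which fusion method KDCube uses, so treat this as an illustration of the general idea; the document IDs and the constant `k = 60` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. nearest neighbours by embedding
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. full-text search results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above those found by only one retriever, which is the behavior hybrid search is after. A reranking model, if used, would then reorder the top of this fused list.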
Getting Started
Get up and running with KDCube APP in minutes using one of these quickstart options:
Option 1: CLI Installer (Recommended)
The fastest way to get started with guided installation:
# Install the CLI
pip install kdcube-ai-app
# Run the interactive installer
kdcube-install
# Follow the prompts to configure:
# - Database connection (Postgres)
# - Redis connection
# - S3 storage (optional)
# - LLM API keys
# - Authentication provider
The CLI installer handles:
- Database schema deployment
- Environment configuration
- Service initialization
- Health checks and validation
Option 2: Docker Compose
Run the complete stack locally with Docker:
# Clone the repository
git clone https://github.com/elenaviter/kdcube-ai-app.git
cd kdcube-ai-app
# Copy environment template
cp .env.example .env
# Edit .env with your configuration
# - Set database credentials
# - Add LLM API keys (OpenAI, Anthropic, etc.)
# - Configure storage settings
# Start all services
docker-compose -f app/ai-app/deployment/docker/all_in_one/docker-compose.yml up -d
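# View logs (pass the same compose file used to start the stack)
docker-compose -f app/ai-app/deployment/docker/all_in_one/docker-compose.yml logs -f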
# Access the UI at http://localhost:8080
Initial Configuration
After installation, configure the platform:
- Set up LLM providers: Add API keys for OpenAI, Anthropic, or Gemini
- Configure authentication: Set up Cognito or use SimpleIDP for development
- Create first project: Use the CLI or admin API to create a tenant/project
- Upload knowledge: Ingest documents into the knowledge base
- Register bundles: Deploy your first agent workflow
Development vs. Production: Use local Postgres and Redis for development. For production, use managed services (RDS, ElastiCache, S3) for better reliability and scaling.
Deployment Options
KDCube APP supports multiple deployment models to fit your infrastructure requirements:
Docker Compose (Development)
Perfect for local development and testing:
- All services run in containers
- Local Postgres and Redis
- File system storage
- SimpleIDP authentication
- Ideal for: Development, testing, demos
Kubernetes (Production)
Enterprise-grade production deployment:
- Horizontal scaling of workers and API
- High availability and fault tolerance
- Health checks and auto-healing
- Rolling updates with zero downtime
- Ideal for: Production, multi-tenant SaaS
AWS Deployment
Fully managed AWS infrastructure:
- Compute: ECS Fargate for stateless services
- Database: RDS Postgres with Multi-AZ
- Cache: ElastiCache Redis
- Storage: S3 for artifacts
- Auth: Cognito for user management
- Networking: VPC with private subnets
- Ideal for: Enterprise production, regulated industries
Storage Configuration
| Component | Development | Production |
|---|---|---|
| Database | Local Postgres | RDS Postgres (Multi-AZ) |
| Cache | Local Redis | ElastiCache Redis (Cluster mode) |
| Storage | File system | S3 (versioning enabled) |
| Auth | SimpleIDP | Cognito with MFA |
| Graph DB | Optional (disabled) | Neo4j (optional) |
Scaling Considerations
Design for horizontal scaling:
- Stateless API: Scale web tier independently based on traffic
- Worker pools: Scale processor workers based on queue depth
- Redis relay: Fan-out pattern supports distributed connections
- Database: Read replicas for read-heavy workloads
- Storage: S3 automatically scales with demand
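Scaling worker pools on queue depth comes down to a small calculation. The sketch below assumes a target number of queued tasks per worker plus lower and upper bounds; the function name, parameters, and default bounds are hypothetical and would be tuned per deployment.

```python
import math


def desired_workers(queue_depth: int, per_worker: int,
                    min_workers: int = 1, max_workers: int = 20) -> int:
    """Return how many processor workers to run for the current queue depth."""
    if per_worker <= 0:
        raise ValueError("per_worker must be positive")
    needed = math.ceil(queue_depth / per_worker)
    # Clamp so the pool never drops to zero or grows without bound.
    return max(min_workers, min(max_workers, needed))


idle = desired_workers(queue_depth=0, per_worker=10)
busy = desired_workers(queue_depth=95, per_worker=10)
flooded = desired_workers(queue_depth=1_000, per_worker=10)
```

An autoscaler (for example a Kubernetes or ECS scaling policy) would evaluate something like this periodically and adjust the replica count toward the result.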
Agent Definitions
KDCube includes reference implementations demonstrating platform capabilities. These agents can be forked and customized for your use cases:
- Conversation Mapping: Analyze and visualize conversation flows with turn-by-turn mapping and pattern detection.
- Error Tracking: Monitor, analyze, and report system errors with automatic categorization and alerting.
- Identifying Focus Areas: Extract key topics and focus areas from conversations using semantic analysis.
- Benchmark Builder: Create and manage evaluation benchmarks for model performance testing.
- Model Benchmarking: Compare LLM performance across metrics with automated testing and reporting.
- LLM Distillation: Train smaller models from larger ones using knowledge distillation techniques.
- Continuous Refinement: Iteratively improve agent responses using feedback loops and learning.
- Knowledge Analyst: Enterprise RAG agent with hybrid search, citations, and audit trails.
- Code Execution: Execute Python code in sandboxed Docker containers with resource limits.
- Web Research: Multi-backend web search with Brave and DuckDuckGo integration.
- Strategic Solver: ReAct agent with planning, tool selection, and adaptive reasoning.
- Document Processing: Extract, analyze, and process documents with multi-format support.
- Marketing Writer: Generate marketing content with brand compliance and output validation.
Customization: All agent implementations are open source. Fork them to create specialized agents for your domain, or use them as templates for building new agents from scratch.
API & Integration
Client Transports
Connect to KDCube using multiple transport protocols:
SSE (Server-Sent Events)
// Connect to SSE stream
const eventSource = new EventSource('/sse/stream?session_id=xxx');
eventSource.addEventListener('chat_start', (e) => {
  const data = JSON.parse(e.data);
  console.log('Chat started:', data);
});
eventSource.addEventListener('chat_delta', (e) => {
  const data = JSON.parse(e.data);
  console.log('Token:', data.content);
});
eventSource.addEventListener('chat_complete', (e) => {
  const data = JSON.parse(e.data);
  console.log('Chat completed:', data);
});
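For clients that are not browsers, the same stream can be consumed by parsing the SSE wire format directly. The sketch below is a standard-library parser for `event:` and `data:` lines, fed with a hand-written sample. The event names and the `content` field come from the example above; the other payload fields and the `parse_sse` helper are assumptions for illustration.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, json_payload) pairs from raw SSE lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


# Hand-written sample of what a stream might look like (payloads are hypothetical).
sample: List[str] = [
    "event: chat_start\n", 'data: {"session_id": "xxx"}\n', "\n",
    ": keep-alive\n",
    "event: chat_delta\n", 'data: {"content": "Hel"}\n', "\n",
    "event: chat_delta\n", 'data: {"content": "lo"}\n', "\n",
    "event: chat_complete\n", "data: {}\n", "\n",
]

events = list(parse_sse(sample))
text = "".join(payload["content"] for name, payload in events if name == "chat_delta")
```

In practice the `lines` iterable would come from a streaming HTTP response rather than a list, but the parsing logic stays the same.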
Socket.IO
// Connect to Socket.IO
const socket = io('http://localhost:8080', {
  auth: { token: 'your-auth-token' }
});
socket.on('connect', () => {
  console.log('Connected');
});
socket.emit('chat_message', {
  content: 'Hello!',
  bundle_id: 'my-agent'
});
socket.on('chat_start', (data) => {
  console.log('Chat started:', data);
});
socket.on('chat_delta', (data) => {
  console.log('Token:', data.content);
});
socket.on('chat_complete', (data) => {
  console.log('Chat completed:', data);
});
REST API
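// Send chat message via REST
fetch('/api/chat/send', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-token'
  },
  body: JSON.stringify({
    content: 'Hello!',
    bundle_id: 'my-agent',
    project_id: 'my-project'
  })
})
  .then(response => {
    // fetch() resolves even on HTTP error statuses, so check explicitly
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  })
  .then(data => console.log(data))
  .catch(err => console.error('Request failed:', err));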
Authentication
KDCube supports multiple authentication methods:
- Header-based: Authorization: Bearer <token>
- Cookie-based: Secure cookies with infra exchange
- SSE query params: ?token=<token> for compatibility
- Socket.IO auth: Auth payload during handshake
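On the server side, supporting several credential carriers usually means checking them in a fixed order. The sketch below shows one such resolution order. It is a simplification under stated assumptions: the precedence, the cookie name `kdcube_auth`, and the `extract_token` helper are hypothetical, and it treats the cookie value as the token itself, whereas the platform describes a masked-cookie exchange.

```python
from typing import Mapping, Optional

# Hypothetical cookie name; the source does not name the cookie.
AUTH_COOKIE = "kdcube_auth"


def extract_token(headers: Mapping[str, str],
                  cookies: Optional[Mapping[str, str]] = None,
                  query: Optional[Mapping[str, str]] = None,
                  socket_auth: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first credential found, or None if the request carries none."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, value = normalized.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    for source, key in ((cookies, AUTH_COOKIE), (query, "token"), (socket_auth, "token")):
        if source and source.get(key):
            return source[key]
    return None


from_header = extract_token({"authorization": "Bearer abc"})
from_cookie = extract_token({}, cookies={AUTH_COOKIE: "def"})
from_query = extract_token({}, query={"token": "ghi"})
from_socket = extract_token({}, socket_auth={"token": "jkl"})
anonymous = extract_token({})
```

Query-string tokens tend to end up in access logs, which is one reason to prefer the header or cookie where the client allows it and keep the query parameter for compatibility only.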
Bundle API
Register custom agent workflows as bundles:
from kdcube_ai_app.infra.plugin import Bundle

class MyAgentBundle(Bundle):
    def __init__(self):
        super().__init__(
            bundle_id="my-agent",
            name="My Custom Agent",
            version="1.0.0"
        )

    async def process(self, message, context):
        # Your agent logic here
        response = await self.llm.generate(message)
        # Emit streaming events
        await self.emit('chat_delta', {'content': response})
        return {'response': response}

# Register the bundle (`registry` is the platform's bundle registry; obtaining it is not shown here)
registry.register(MyAgentBundle())
Security & Governance
Multi-Tenant Isolation
KDCube provides enterprise-grade isolation:
- Schema-level isolation: Each tenant/project gets a dedicated Postgres schema
- Storage segmentation: S3 buckets or prefixes per tenant
- Namespace separation: Redis keys namespaced by tenant/project
- Policy enforcement: Control plane manages cross-tenant policies
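The three isolation mechanisms above all derive a name from the tenant and project. A minimal sketch of such a derivation follows; the naming patterns (`t_..__p_..` schemas, `tenant/project/` prefixes, colon-separated Redis keys) and the `TenantScope` class are invented for illustration and are not KDCube's actual conventions.

```python
import re
from dataclasses import dataclass

_UNSAFE = re.compile(r"[^a-z0-9]+")


def _slug(value: str) -> str:
    """Lower-case and replace anything outside [a-z0-9] with underscores."""
    slug = _UNSAFE.sub("_", value.lower()).strip("_")
    if not slug:
        raise ValueError(f"identifier has no usable characters: {value!r}")
    return slug


@dataclass(frozen=True)
class TenantScope:
    tenant: str
    project: str

    @property
    def pg_schema(self) -> str:
        # One Postgres schema per tenant/project pair (hypothetical pattern).
        return f"t_{_slug(self.tenant)}__p_{_slug(self.project)}"

    @property
    def s3_prefix(self) -> str:
        # Key prefix inside a shared bucket (hypothetical pattern).
        return f"{_slug(self.tenant)}/{_slug(self.project)}/"

    def redis_key(self, *parts: str) -> str:
        # Every Redis key starts with the tenant/project namespace.
        return ":".join([_slug(self.tenant), _slug(self.project), *parts])


scope = TenantScope("Acme Corp", "support-bot")
schema = scope.pg_schema
prefix = scope.s3_prefix
key = scope.redis_key("ratelimit", "user42")
```

Sanitizing identifiers in one place matters here because tenant and project names end up in SQL identifiers and storage paths, where unescaped input would be a security risk.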
Authentication Model
Secure, delegated authentication with hosted 2FA:
- Delegated auth: ProxyLogin service handles authentication flow
- Hosted 2FA: Built-in two-factor authentication UI
- Cookie exchange: Secure token exchange with masked cookies
- Session management: Automatic session resolution and upgrade
- Provider support: Cognito (production), SimpleIDP (development)
Gateway & Rate Limiting
Protect your infrastructure with multiple layers of control:
Gateway Layer
- Request rate limiting: Configurable limits per endpoint
- Backpressure: Queue capacity monitoring and rejection
- Circuit breakers: Automatic failure detection and recovery
- Input validation: Message and attachment size limits
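Request rate limiting and backpressure can be combined in a single admission check. The sketch below uses a token bucket, a common rate-limiting technique that the source does not confirm KDCube uses, together with a queue-capacity test. `TokenBucket`, `admit`, and all thresholds are assumptions; time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float            # tokens refilled per second
    capacity: float        # maximum burst size
    tokens: float = 0.0    # tokens currently available
    updated: float = 0.0   # timestamp of the last refill

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admit(bucket: TokenBucket, queue_depth: int, queue_capacity: int, now: float) -> str:
    # Backpressure first: do not spend rate-limit tokens on work we cannot queue.
    if queue_depth >= queue_capacity:
        return "rejected: queue full"
    if not bucket.allow(now):
        return "rejected: rate limited"
    return "accepted"


bucket = TokenBucket(rate=1.0, capacity=2.0, tokens=2.0)
decisions = [admit(bucket, queue_depth=0, queue_capacity=100, now=t)
             for t in (0.0, 0.0, 0.0, 1.0)]
overloaded = admit(bucket, queue_depth=100, queue_capacity=100, now=2.0)
```

The first two requests use the burst allowance, the third is limited, and the fourth succeeds once a second has passed and one token has been refilled.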
Economics Layer
- Tier policies: Different limits per user tier
- Per-user quotas: Individual usage caps
- Concurrency locks: Prevent oversubscription
- Budget enforcement: Hard caps on spending
Code Execution Sandbox
Mission-critical isolation for untrusted code:
- Docker containers: Ephemeral, isolated execution environment
- Network isolation: No direct external network access
- Tool proxy: All tool calls routed through supervisor
- Resource limits: CPU, memory, and execution time constraints
- Privilege separation: Non-root execution
- Data exfiltration prevention: Controlled output channels
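Several of the controls listed above map onto standard `docker run` options. The sketch below only builds the argument list and does not run anything. The image name, limits, and `sandbox_command` helper are assumptions, and `--network none` is the bluntest form of network isolation; how the platform's supervisor proxy reaches the container is not shown.

```python
from typing import List


def sandbox_command(image: str, script: str,
                    cpus: float = 1.0, memory: str = "512m") -> List[str]:
    """Build a `docker run` argv for executing an untrusted Python snippet."""
    return [
        "docker", "run",
        "--rm",                                  # ephemeral: remove the container on exit
        "--network", "none",                     # no direct external network access
        "--cpus", str(cpus),                     # CPU limit
        "--memory", memory,                      # memory limit
        "--pids-limit", "128",                   # cap the number of processes
        "--read-only",                           # read-only root filesystem
        "--user", "65534:65534",                 # non-root (conventionally "nobody")
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        image, "python", "-c", script,
    ]


cmd = sandbox_command("python:3.12-slim", "print(2 + 2)")
```

Execution time is not covered by these flags. A wall-clock limit would have to be enforced separately, for example by a supervisor that stops the container after a deadline.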
Audit & Compliance
Comprehensive logging for regulatory compliance:
- Complete execution lineage: Track every step of agent execution
- Usage logging: All API calls and tool executions logged
- Cost tracking: Per-tenant, per-user spending records
- Access logs: Authentication and authorization events
- Data provenance: Source tracking for all generated content
Advanced Topics
Dynamic Bundles
Bundles are runtime-loadable workflows with custom logic:
- Hot reload: Update agents without restarting services
- Custom endpoints: Expose bundle-specific APIs
- Storage integration: Access knowledge base and context
- Event emission: Stream results via ChatCommunicator
- Framework support: LangGraph, LangChain, or custom Python
Context Management
Sophisticated context reconciliation:
- Turn-ordered memories: Sequential conversation context
- User preferences: Cross-conversation user settings
- Artifact tracking: Files and generated content
- Signal extraction: Automatic pattern detection
- Context reconciliation: Merge and deduplicate context
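Merging and deduplicating context can be illustrated with a simple policy: when two memories describe the same thing, keep the one recorded at the later turn. This "latest turn wins" rule, the `Memory` record, and the `reconcile` function are assumptions made for the sketch; the source does not specify the platform's reconciliation policy.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Memory:
    key: str    # what the memory is about, e.g. "language"
    value: str
    turn: int   # turn index at which it was recorded


def reconcile(*sources: Iterable[Memory]) -> List[Memory]:
    """Merge memory sources, keeping the most recent entry per key."""
    latest: Dict[str, Memory] = {}
    for source in sources:
        for item in source:
            current = latest.get(item.key)
            if current is None or item.turn > current.turn:
                latest[item.key] = item
    # Return in turn order so downstream prompts see a chronological context.
    return sorted(latest.values(), key=lambda memory: memory.turn)


conversation = [Memory("language", "Python", 1), Memory("topic", "billing", 2)]
user_level = [Memory("language", "Go", 5), Memory("timezone", "UTC", 0)]
merged = reconcile(conversation, user_level)
summary = [(memory.key, memory.value) for memory in merged]
```

Here the conflicting `language` entries collapse to the newer value while the non-conflicting entries from both sources survive.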
Horizontal Scaling
Scale to thousands of concurrent users:
- Stateless API: Scale web tier horizontally
- Worker pools: Independent scaling of processor workers
- Queue-based: Fair scheduling across user types
- Redis relay: Session-scoped pub/sub for efficient fan-out
- Load balancing: Distribute requests across instances
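Session-scoped fan-out means an event published for one session reaches every connection subscribed to that session and no others. The sketch below models this with an in-memory stand-in for Redis Pub/Sub so it runs without a server. `InMemoryRelay` and the event shape are hypothetical; a real deployment would use Redis channels.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Listener = Callable[[dict], None]


class InMemoryRelay:
    """Stand-in for a Redis Pub/Sub relay; channels are keyed by session ID."""

    def __init__(self) -> None:
        self._channels: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, session_id: str, listener: Listener) -> None:
        self._channels[session_id].append(listener)

    def publish(self, session_id: str, event: dict) -> int:
        """Deliver the event to every listener on the session; return the count."""
        listeners = self._channels.get(session_id, [])
        for listener in listeners:
            listener(event)
        return len(listeners)


relay = InMemoryRelay()
tab_a, tab_b, other_session = [], [], []

# Two connections (say, two browser tabs) share session "s1"; a third uses "s2".
relay.subscribe("s1", tab_a.append)
relay.subscribe("s1", tab_b.append)
relay.subscribe("s2", other_session.append)

# A processor worker publishes a delta for session "s1" only.
delivered = relay.publish("s1", {"type": "chat_delta", "content": "Hi"})
```

Because the worker only needs the session ID to publish, it does not have to know which web instance holds the client connection, which is what keeps the web tier stateless.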
Observability & Monitoring
Production-grade monitoring and debugging:
- Health checks: Service health and heartbeat monitoring
- Metrics collection: Request rates, latencies, error rates
- Usage dashboards: Real-time spending and usage visualization
- Conversation browser: Search and replay conversations
- Performance profiling: Identify bottlenecks and optimize
Documentation Resources
Comprehensive documentation for developers and operators:
Platform Documentation
- System Architecture: Architecture Overview
- Gateway & Policy: Gateway Documentation
- Economics System: Economics Guide
- Knowledge Base: KB Documentation
SDK & Development
- SDK Index: SDK Overview
- AI Bundle SDK: Bundle Development
- Tool Subsystem: Tools & Runtime
- Streaming System: Communication Guide
Deployment Guides
- Docker Compose: All-in-One Setup
- CLI Installer: CLI Guide
- Database Setup: Schema Deployment
- Monitoring: Observability Setup
Examples & Tutorials
- Example Bundles: Bundle Examples
- Streaming Examples: Code Examples
- Skills Registry: Skills Guide
- Load Testing: Testing Guide
Community Support: Join our GitHub discussions for questions, feature requests, and community contributions. For enterprise support including SLA guarantees and custom development, contact us at info@kdcube.tech.
Ready to Get Started?
Deploy your first agent in under 30 minutes