KDCube Agentic Prototyping Platform

Self-hosted platform + SDK for building and operating agentic chat applications and copilots. Production-ready runtime with streaming, tool execution, memory, economics, and enterprise governance—all out of the box.

Introduction

KDCube APP ships the full stack for agentic chat applications and copilots: a streaming runtime, tool execution, memory and context management, economics, and a hosting platform for multi-tenant production deployment.

What You Get Out of the Box

Streaming Chat

REST, SSE, and Socket.IO with steps/deltas/status events and role-based streaming filtering

Tool Execution

Local tools, MCP tools, and skills (built-in and custom) with easy wiring

Code Execution

Isolated Docker runtime for untrusted code generation and execution

ReAct Strategic Solver

Skills acquisition and exploitation, adaptive agent selection, planning, and tool-first/code-first flows

Web Search

Multi-backend search agent with Brave and DuckDuckGo integration

Knowledge Base

Ingestion, hybrid search, and citations with pgvector integration

Memory & Context

Retrieval, turn memories, conversation memories, user-level memories, signals, and reconciliation

Economics & Accounting

Budgets, rate limits, usage reporting, and cost tracking per tenant/project/user

Framework-Agnostic: Build your workflows in LangGraph, LangChain, CrewAI, AutoGen, or custom Python. KDCube provides the runtime infrastructure to host, scale, and govern them.

Architecture & Components

KDCube APP consists of two main layers: the SDK for building agent applications and the Platform for hosting and scaling them in production.

System Architecture

graph TD
  %% Entry / Auth
  UI[Web UI / Client] -->|HTTPS + masked cookie| NGINX[Web Proxy / Nginx]
  AUTH["ProxyLogin (Delegated Auth + 2FA)"] -->|token exchange| NGINX
  NGINX -->|real auth/id cookies| GATE[Chat API + Gateway]
  KB[Knowledge Base Service] --> GATE
  CP[Control Plane / Project Mgmt] --> GATE

  %% Transport + Gateway
  NGINX -->|SSE / Socket.IO| GATE
  NGINX -->|REST| GATE
  GATE -->|session mgmt| SESS[Session Manager]
  GATE -->|rate limit/backpressure| GW[Gateway + Throttling]

  %% Queue + Processing
  GATE -->|enqueue| Q[Redis Queues]
  Q --> PROC[Chat Processor Workers]

  %% Orchestration
  PROC --> BUNDLES[Dynamic Bundles / Workflows]
  BUNDLES -->|events| RELAY[ChatRelay + Redis Pub/Sub]
  RELAY -->|fan-out| GATE

  %% Context management
  BUNDLES --> CTX[Context Management]
  CTX -->|storage| PG["(Postgres RDS)"]
  CTX -->|artifacts| S3[(S3)]
  KB -->|storage| PG
  KB -->|artifacts| S3
  CP -->|policies + quotas| PG

  %% Runtime + providers
  BUNDLES --> RT["Runtime (LLM + Tools)"]
  RT --> DOCKER[Ephemeral Docker Exec]
  RT --> TOOLS[External Tools / APIs]
  subgraph EXTPROV[External Providers]
    OAI[OpenAI]
    ANTH[Anthropic]
    GEM[Gemini]
    BRAVE[Brave Search]
    DDG[DuckDuckGo]
  end
  RT --> OAI
  RT --> ANTH
  RT --> GEM
  RT --> BRAVE
  RT --> DDG

  %% Cache/Queues/PubSub
  BUNDLES -->|cache/queues/pubsub| REDIS["(Redis / ElastiCache)"]

  classDef aws fill:#e8f4ff,stroke:#7aa7d6,color:#0b2b4f;
  classDef ext fill:#f2f7ee,stroke:#8fbf7a,color:#1f3b1c;
  classDef infra fill:#f7f2ff,stroke:#b69ad6,color:#2b1b4f;
  class PG,REDIS,S3 aws;
  class OAI,ANTH,GEM,BRAVE,DDG,TOOLS ext;
  class AUTH infra;

SDK Components (Build)

Agent Runtime

  • ReAct agent patterns
  • Planning & orchestration
  • Tool-first and code-first flows
  • Adaptive agent selection

Streaming Channels

  • REST + SSE + Socket.IO
  • Token/step events
  • Role-based filtering
  • Redis Pub/Sub relay
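Role-based filtering can be pictured as a per-role allow-list applied to events before they reach a client stream. The sketch below is only an illustration: the role names, the `VISIBLE_EVENTS` map, and the `filter_events` helper are hypothetical, and the event names are inferred from the `chat_start/step/delta/complete` events mentioned in this document.

```python
# Hypothetical visibility map: which event types each role may receive.
# Role names and the exact event names are assumptions for illustration.
VISIBLE_EVENTS = {
    "admin": {"chat_start", "chat_step", "chat_delta", "chat_complete"},
    "user": {"chat_start", "chat_delta", "chat_complete"},
}


def filter_events(events, role):
    """Return only the events that the given role is allowed to see."""
    allowed = VISIBLE_EVENTS.get(role, set())  # unknown roles see nothing
    return [event for event in events if event["type"] in allowed]
```

In this sketch an unknown role receives no events, which is the conservative default.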

Tools & Skills

  • Local tools integration
  • MCP tools support
  • Built-in and custom skills
  • Easy custom wiring

Memory & Context

  • Turn memories
  • Conversation memories
  • User-level memories
  • Signals and reconciliation

Code Execution

  • Isolated Docker runtime
  • Untrusted code support
  • Resource limits
  • Security sandboxing

Attachments & Artifacts

  • File upload handling
  • Generated files
  • Security checks
  • Storage integration (S3)

Economics & Accounting

  • Usage tracking
  • Budget enforcement
  • Rate limits
  • Cost reporting

Bundle API

  • LangGraph workflows
  • LangChain integration
  • Custom Python apps
  • Dynamic registration

Platform Components (Host & Scale)

Multi-Tenant Isolation

  • Per-tenant/project schemas
  • Storage segregation
  • Namespace separation
  • Policy enforcement

Gateway

  • Authentication (Cognito/SimpleIDP)
  • Rate limiting
  • Backpressure control
  • Circuit breakers

Knowledge Base

  • Document ingestion
  • pgvector embeddings
  • Hybrid search
  • Citation tracking

Dynamic UI Widgets

  • Monitoring dashboards
  • Control plane UI
  • Conversation browser
  • Spending reports

Horizontal Scaling

  • Stateless web service
  • Queue/processor model
  • Redis relay fan-out
  • Load balancing

Storage Layer

  • Postgres (RDS)
  • Redis (ElastiCache)
  • S3 (artifacts)
  • Neo4j (optional)

Bundles: Deployable agent apps. Multiple bundles can be registered and selected per message. One bundle executes per request; different requests can target different bundles.
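The selection rule above (many bundles registered, one chosen per message) can be sketched with a small in-memory registry. This is not the KDCube registry: `BundleRegistry`, `dispatch`, and the `Handler` type are hypothetical names, and real bundles are richer than a single function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical handler type: takes the message text, returns a reply.
Handler = Callable[[str], str]


@dataclass
class BundleRegistry:
    """Illustrative in-memory registry; not the KDCube implementation."""
    default_bundle_id: str
    _bundles: Dict[str, Handler] = field(default_factory=dict)

    def register(self, bundle_id: str, handler: Handler) -> None:
        self._bundles[bundle_id] = handler

    def dispatch(self, message: dict) -> str:
        # Each message may name a bundle; exactly one bundle runs per request.
        bundle_id = message.get("bundle_id") or self.default_bundle_id
        if bundle_id not in self._bundles:
            raise KeyError(f"unknown bundle: {bundle_id}")
        return self._bundles[bundle_id](message["content"])


registry = BundleRegistry(default_bundle_id="echo")
registry.register("echo", lambda text: text)
registry.register("shout", lambda text: text.upper())
```

Two consecutive requests can then target different bundles simply by carrying different `bundle_id` values.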

Key Features

Streaming Runtime

KDCube supports three client transports: REST, SSE (Server-Sent Events), and Socket.IO.

Streaming Flow

sequenceDiagram
  participant UI as Client UI
  participant API as Chat API
  participant RL as Redis Relay
  participant Q as Redis Queue
  participant W as Worker / Bundle
  UI->>API: open stream (SSE / Socket.IO connect)
  UI->>API: send message (SSE / Socket.IO)
  API->>Q: enqueue task (per user_type queue)
  W->>Q: dequeue + lock
  W->>RL: publish chat_* events to session channel
  RL-->>API: fan-out to connected stream
  API-->>UI: chat_start/step/delta/complete
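The flow in the diagram can be imitated in-process to see the order of events. The sketch below replaces Redis with `asyncio` queues, so `InMemoryRelay`, `worker`, and `run_turn` are hypothetical stand-ins; the task locking step is not modeled, and the "agent" just echoes the message word by word.

```python
import asyncio
from collections import defaultdict


class InMemoryRelay:
    """Stand-in for Redis Pub/Sub: a list of subscriber queues per session."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, session_id):
        queue = asyncio.Queue()
        self._subscribers[session_id].append(queue)
        return queue

    async def publish(self, session_id, event, payload):
        # Fan out to every stream connected to this session.
        for queue in self._subscribers[session_id]:
            await queue.put((event, payload))


async def worker(task_queue, relay):
    # Dequeue one task and publish chat_* events to its session channel.
    task = await task_queue.get()
    sid = task["session_id"]
    await relay.publish(sid, "chat_start", {})
    for token in task["content"].split():
        await relay.publish(sid, "chat_delta", {"content": token})
    await relay.publish(sid, "chat_complete", {})


async def run_turn(session_id, content):
    relay = InMemoryRelay()
    task_queue = asyncio.Queue()          # stand-in for the Redis queue
    stream = relay.subscribe(session_id)  # the client opens its stream first
    await task_queue.put({"session_id": session_id, "content": content})
    await worker(task_queue, relay)
    events = []
    while not stream.empty():
        events.append(stream.get_nowait())
    return events


events = asyncio.run(run_turn("s1", "hello world"))
```

Because the stream is subscribed before the task is enqueued, no event is lost, which mirrors the "open stream, then send message" order in the diagram.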

Multi-Tenancy & Storage

Isolation for multi-tenant deployments covers per-tenant/project schemas, storage segregation, namespace separation, and policy enforcement.
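One way to picture per-tenant/project segregation is to derive the database schema name and the object-storage prefix from the tenant and project identifiers. The naming scheme below (`kdcube_<tenant>_<project>` and `tenants/<tenant>/projects/<project>/`) is an assumption made for illustration, not the platform's documented layout.

```python
import re

# Anything outside [a-z0-9_] is replaced so the result is a safe identifier.
_UNSAFE = re.compile(r"[^a-z0-9_]")


def _slug(value: str) -> str:
    return _UNSAFE.sub("_", value.lower())


def schema_name(tenant: str, project: str) -> str:
    """Hypothetical Postgres schema name for one tenant/project pair."""
    return f"kdcube_{_slug(tenant)}_{_slug(project)}"


def storage_prefix(tenant: str, project: str) -> str:
    """Hypothetical object-storage key prefix for the same pair."""
    return f"tenants/{_slug(tenant)}/projects/{_slug(project)}/"
```

A real scheme would also need to guard against collisions, since `demo-1` and `demo_1` map to the same slug here.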

Economics & Cost Management

Cost control and reporting cover budgets, rate limits, usage reporting, and cost tracking per tenant/project/user.
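Budget enforcement per tenant/project/user can be sketched as a ledger that refuses any charge that would exceed the configured limit. `UsageLedger` and its methods are hypothetical, and costs are plain USD floats for brevity; this is a minimal sketch rather than the platform's accounting model.

```python
from dataclasses import dataclass, field


@dataclass
class UsageLedger:
    """Illustrative cost tracker keyed by (tenant, project, user)."""
    budgets: dict                              # key -> spending limit in USD
    spent: dict = field(default_factory=dict)  # key -> accumulated cost in USD

    def charge(self, key, cost_usd):
        # Reject the charge if it would push the key over its budget.
        current = self.spent.get(key, 0.0)
        limit = self.budgets.get(key)
        if limit is not None and current + cost_usd > limit:
            return False
        self.spent[key] = current + cost_usd
        return True

    def report(self):
        # Snapshot of accumulated cost per key, for usage reporting.
        return dict(self.spent)
```

Keys without a configured budget are treated as unlimited in this sketch; a stricter policy could reject them instead.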

Security & Sandboxing

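Security features include an isolated Docker runtime for untrusted code with resource limits, security checks on uploaded attachments, and delegated authentication with 2FA.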

Knowledge Base & RAG

Retrieval-augmented generation is built on document ingestion, pgvector embeddings, hybrid search, and citation tracking.
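Hybrid search combines a keyword ranking with a vector-similarity ranking. This document does not say how KDCube merges the two, so the sketch below uses reciprocal rank fusion (RRF), a common choice, purely as an illustration; the document ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    ties are broken by document id so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from full-text search
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from a pgvector query
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```

Documents that appear in both lists (`doc_b`, `doc_a`) rise above those found by only one backend.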

Getting Started

Get up and running with KDCube APP in minutes using one of these quickstart options:

Option 1: CLI Installer (Recommended)

The fastest way to get started with guided installation:

# Install the CLI
pip install kdcube-ai-app

# Run the interactive installer
kdcube-install

# Follow the prompts to configure:
# - Database connection (Postgres)
# - Redis connection
# - S3 storage (optional)
# - LLM API keys
# - Authentication provider

The CLI installer handles:

Option 2: Docker Compose

Run the complete stack locally with Docker:

# Clone the repository
git clone https://github.com/elenaviter/kdcube-ai-app.git
cd kdcube-ai-app

# Copy environment template
cp .env.example .env

# Edit .env with your configuration
# - Set database credentials
# - Add LLM API keys (OpenAI, Anthropic, etc.)
# - Configure storage settings

# Start all services
docker-compose -f app/ai-app/deployment/docker/all_in_one/docker-compose.yml up -d

# View logs
docker-compose -f app/ai-app/deployment/docker/all_in_one/docker-compose.yml logs -f

# Access the UI at http://localhost:8080

Initial Configuration

After installation, configure the platform:

  1. Set up LLM providers: Add API keys for OpenAI, Anthropic, or Gemini
  2. Configure authentication: Set up Cognito or use SimpleIDP for development
  3. Create first project: Use the CLI or admin API to create a tenant/project
  4. Upload knowledge: Ingest documents into the knowledge base
  5. Register bundles: Deploy your first agent workflow

Development vs. Production: Use local Postgres and Redis for development. For production, use managed services (RDS, ElastiCache, S3) for better reliability and scaling.

Deployment Options

KDCube APP supports multiple deployment models to fit your infrastructure requirements:

Docker Compose (Development)

Perfect for local development and testing:

Kubernetes (Production)

Enterprise-grade production deployment:

AWS Deployment

Fully managed AWS infrastructure:

Storage Configuration

Component | Development | Production
Database | Local Postgres | RDS Postgres (Multi-AZ)
Cache | Local Redis | ElastiCache Redis (Cluster mode)
Storage | File system | S3 (versioning enabled)
Auth | SimpleIDP | Cognito with MFA
Graph DB | Optional (disabled) | Neo4j (optional)

Scaling Considerations

Design for horizontal scaling:

Agent Definitions

KDCube includes reference implementations demonstrating platform capabilities. These agents can be forked and customized for your use cases:

Conversation Mapping

Analyze and visualize conversation flows with turn-by-turn mapping and pattern detection.

Error Tracking

Monitor, analyze, and report system errors with automatic categorization and alerting.

Identifying Focus Areas

Extract key topics and focus areas from conversations using semantic analysis.

Benchmark Builder

Create and manage evaluation benchmarks for model performance testing.

Model Benchmarking

Compare LLM performance across metrics with automated testing and reporting.

LLM Distillation

Train smaller models from larger ones using knowledge distillation techniques.

Continuous Refinement

Iteratively improve agent responses using feedback loops and learning.

Knowledge Analyst

Enterprise RAG agent with hybrid search, citations, and audit trails.

Code Execution

Execute Python code in sandboxed Docker containers with resource limits.

Web Research

Multi-backend web search with Brave and DuckDuckGo integration.

Strategic Solver

ReAct agent with planning, tool selection, and adaptive reasoning.

Document Processing

Extract, analyze, and process documents with multi-format support.

Marketing Writer

Generate marketing content with brand compliance and output validation.

Customization: All agent implementations are open source. Fork them to create specialized agents for your domain, or use them as templates for building new agents from scratch.

API & Integration

Client Transports

Connect to KDCube using multiple transport protocols:

SSE (Server-Sent Events)

// Connect to SSE stream
const eventSource = new EventSource('/sse/stream?session_id=xxx');

eventSource.addEventListener('chat_start', (e) => {
    const data = JSON.parse(e.data);
    console.log('Chat started:', data);
});

eventSource.addEventListener('chat_delta', (e) => {
    const data = JSON.parse(e.data);
    console.log('Token:', data.content);
});

eventSource.addEventListener('chat_complete', (e) => {
    const data = JSON.parse(e.data);
    console.log('Chat completed:', data);
});
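For non-browser clients it can help to see what arrives on the wire. The sketch below parses raw SSE text into `(event, data)` pairs following the standard `event:` / `data:` / blank-line framing. It assumes, as the JavaScript sample does, that each `data` payload is JSON; `parse_sse` and the sample payload fields are hypothetical.

```python
import json


def parse_sse(text):
    """Parse SSE wire text into a list of (event_name, data) pairs."""
    events = []
    event_name, data_lines = "message", []
    for line in text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events
```

An event that is not followed by a blank line is treated as incomplete and is not returned, which matches how browsers dispatch SSE events.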

Socket.IO

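// Connect to Socket.IO.
// With a bundler, import the client from the socket.io-client package;
// when the client is loaded via a script tag, `io` is already global.
import { io } from 'socket.io-client';

const socket = io('http://localhost:8080', {
    auth: { token: 'your-auth-token' }
});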

socket.on('connect', () => {
    console.log('Connected');
});

socket.emit('chat_message', {
    content: 'Hello!',
    bundle_id: 'my-agent'
});

socket.on('chat_start', (data) => {
    console.log('Chat started:', data);
});

socket.on('chat_delta', (data) => {
    console.log('Token:', data.content);
});

socket.on('chat_complete', (data) => {
    console.log('Chat completed:', data);
});

REST API

// Send chat message via REST
fetch('/api/chat/send', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer your-token'
    },
    body: JSON.stringify({
        content: 'Hello!',
        bundle_id: 'my-agent',
        project_id: 'my-project'
    })
})
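.then(response => {
    // Surface HTTP errors instead of parsing an error body as a result
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
})
.then(data => console.log(data))
.catch(err => console.error('Request failed:', err));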

Authentication

KDCube supports multiple authentication methods, including Cognito, SimpleIDP (for development), and delegated login through ProxyLogin with 2FA.

Bundle API

Register custom agent workflows as bundles:

from kdcube_ai_app.infra.plugin import Bundle

class MyAgentBundle(Bundle):
    def __init__(self):
        super().__init__(
            bundle_id="my-agent",
            name="My Custom Agent",
            version="1.0.0"
        )
    
    async def process(self, message, context):
        # Your agent logic here
        response = await self.llm.generate(message)
        
        # Emit streaming events
        await self.emit('chat_delta', {'content': response})
        
        return {'response': response}

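# Register the bundle.
# `registry` is assumed to be the platform's bundle registry instance;
# how it is obtained is not shown in this snippet.
registry.register(MyAgentBundle())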

Security & Governance

Multi-Tenant Isolation

KDCube provides enterprise-grade isolation:

Authentication Model

Secure, delegated authentication with hosted 2FA:

Gateway & Rate Limiting

Protect your infrastructure with multiple layers of control:

Gateway Layer
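Rate limiting and backpressure at the gateway can be sketched as two checks on each incoming request: is the processing queue already too deep, and does the caller still have tokens? The token bucket below is a standard technique chosen for illustration; `TokenBucket`, `admit`, and the returned labels are hypothetical and not taken from the KDCube gateway.

```python
class TokenBucket:
    """Classic token bucket: refills at rate_per_sec up to capacity."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Refill according to the time elapsed since the last call.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admit(bucket, queue_depth, max_depth, now):
    """Decide what to do with one incoming request."""
    if queue_depth >= max_depth:
        return "backpressure"  # queue is full; e.g. answer with HTTP 503
    if not bucket.allow(now):
        return "rate_limited"  # caller is over its rate; e.g. HTTP 429
    return "accepted"
```

Passing `now` explicitly keeps the sketch deterministic; a real gateway would read a monotonic clock such as `time.monotonic()`.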

Economics Layer

Code Execution Sandbox

Mission-critical isolation for untrusted code:
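To make the idea of an isolated runtime with resource limits concrete, the sketch below builds a `docker run` command line that disables networking, caps memory, CPU, and process count, and mounts the script read-only. The flags are standard Docker CLI options, but the specific limits, the image name, the mount path, and the `sandbox_command` helper are assumptions for illustration, not KDCube's actual sandbox configuration.

```python
def sandbox_command(image, script_path, memory="256m", cpus="0.5", pids=64):
    """Build (but do not run) a locked-down `docker run` argument list."""
    return [
        "docker", "run", "--rm",       # remove the container when it exits
        "--network", "none",           # no network access
        "--memory", memory,            # hard memory cap
        "--cpus", cpus,                # CPU share cap
        "--pids-limit", str(pids),     # limit the number of processes
        "--read-only",                 # read-only root filesystem
        "--cap-drop", "ALL",           # drop all Linux capabilities
        "--volume", f"{script_path}:/sandbox/main.py:ro",
        image, "python", "/sandbox/main.py",
    ]


cmd = sandbox_command("python:3.12-slim", "/tmp/job.py")
```

The list can be passed to `subprocess.run(cmd, capture_output=True, timeout=30)` so that a wall-clock timeout is enforced on top of the container limits.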

Audit & Compliance

Comprehensive logging for regulatory compliance:

Advanced Topics

Dynamic Bundles

Bundles are runtime-loadable workflows with custom logic:

Context Management

Sophisticated context reconciliation:

Horizontal Scaling

Scale to thousands of concurrent users:

Observability & Monitoring

Production-grade monitoring and debugging:

Documentation Resources

Comprehensive documentation for developers and operators:

Platform Documentation

SDK & Development

Deployment Guides

Examples & Tutorials

Community Support: Join our GitHub discussions for questions, feature requests, and community contributions. For enterprise support including SLA guarantees and custom development, contact us at info@kdcube.tech.

Ready to Get Started?

Deploy your first agent in under 30 minutes