The rule is simple: generated code gets a small place to work and a narrow way to request approved tools. It does not get the whole runtime. Network access, descriptor-backed configuration, bundle storage, runtime storage, and most logs stay on the trusted supervisor side.
The current shape
Docker split mode runs two containers for an execution: one supervisor container and one executor container. They share only the surfaces needed to complete the request.
What the executor can see
The executor side is not meant to be comfortable as a platform shell. It is meant to be comfortable as a generated-code work area. In split mode it receives:
- a work directory for scratch data;
- an artifact output directory where the requested files are produced;
- executor-local logs for stdout/stderr and generated-code diagnostics;
- a supervisor socket and short-lived execution token for approved tool calls;
- a minimal environment for Python, fonts, plotting, and file generation.
It does not receive platform descriptor payloads, bundle storage mounts, managed-bundle roots, runtime storage roots, cloud credentials, or supervisor logs. The important point is not only "can it write outside the output directory?" but also "can it read outer runtime data?" The split design addresses both.
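This surface split can be sketched at the Docker level. The paths, image name, socket location, and environment variable below are illustrative assumptions, not KDCube's actual layout; the point is that only the executor-visible surfaces are mounted, and the container itself has no network path:

```python
from pathlib import PurePosixPath

def executor_run_args(ctx_root: str) -> list[str]:
    """Build a hypothetical `docker run` argv for the executor container.

    Supervisor logs, descriptor payloads, and storage roots are simply
    absent from the mount list, and `--network none` removes the direct
    outbound path, so every external call must cross the supervisor
    socket. All names here are illustrative.
    """
    root = PurePosixPath(ctx_root)
    return [
        "docker", "run", "--rm",
        "--network", "none",                        # no direct outbound path
        "-v", f"{root / 'work'}:/work",             # scratch space
        "-v", f"{root / 'out' / 'workdir'}:/out",   # artifact output
        "-v", f"{root / 'logs' / 'executor'}:/logs",
        "-v", f"{root / 'supervisor.sock'}:/run/supervisor.sock",
        "-e", "EXEC_TOKEN_FILE=/run/exec_token",    # short-lived token
        "executor-image:latest",
    ]

args = executor_run_args("/tmp/ctx_v2_demo")
# The supervisor log directory never appears in the executor's mounts.
assert not any("logs/supervisor" in a for a in args)
```

Because the boundary lives in the argv, it is visible to anyone reading the Docker invocation, which is part of the auditability argument made below.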
Execution tree on the proc host:

```
ctx_v2_.../
  work/                   runtime scratch
  out/
    workdir/              executor-visible artifact output
  logs/
    executor/             visible from executor
    supervisor/           not mounted into executor
  timeline/, sources/     runtime metadata, not executor workspace
ctx_v2_....zip            persisted package for later inspection
```
Why not put every tool inside the executor?
Some tools need network, secrets, browser binaries, bundle configuration, or storage access. If those tools were imported and executed directly by generated code, the isolation boundary would become mostly cosmetic. The executor would need the same authority as the platform.
KDCube keeps the tool implementation on the supervisor side. Generated code can ask for a tool through a stub, but the call crosses an explicit boundary where the runtime can authenticate the request, resolve the allowed alias, execute the tool, log the operation, and return a result.
| Surface | Executor | Supervisor |
|---|---|---|
| Outbound network | No direct path | Available for configured tools |
| Platform descriptors and secrets | Not mounted / not inherited | Available where needed for tool bootstrap |
| Generated files | Writes to artifact output workspace | Validates, packages, and exposes contracted outputs |
| Logs | Sees executor-local logs only | Merges executor, supervisor, Docker, and infra diagnostics for the tool result |
| Rendering and web search | Calls through stubs | Runs the actual network/browser/tool work |
The rejected simpler path
A tempting design is to keep one Docker container and add more in-process sandboxing around the generated-code subprocess. That can work for some environments, but it tends to become fragile: kernel capabilities, mount behavior, browser dependencies, and host-specific runtime settings all leak into the correctness story.
The split container model is less clever and more explicit. The executor container is the untrusted side. The supervisor container is the trusted side. The boundary is visible in Docker arguments, logs, mounts, and tests. That is easier to explain to operators and easier to audit.
What this does not claim
This is not a claim that generated-code execution is risk-free. Container isolation still depends on the container runtime, kernel, host configuration, and deployment policy. The design reduces what generated code can normally read, write, and contact, while preserving the ability to run approved platform tools.
That distinction matters. A useful runtime should not hide failures or silently block diagnostics. The execution result still includes runtime failure summaries, missing artifact reports, program log tails, and infra diagnostics so the agent and operator can understand whether the issue came from user code, a tool, or infrastructure.
Why this fits React
React already separates generation channels, runtime validation, tool dispatch, workspace state, and user-visible progress. The split isolated runtime follows the same philosophy: keep the channel for generated code useful, but do not confuse it with platform authority.
That is the shape we want for production agents: enough flexibility to generate real files and use real tools, enough containment that generated code cannot browse the platform around it, and enough diagnostics that failures remain understandable.