For the full interactive reference with code examples and comparison tables, see ReAct v2 — Full Reference.
ReAct Agent (V2)
The ReAct V2 agent is a single autonomous loop — no separate planner, no gate. It starts in runtime.py as a ReactStateV2 with up to max_iterations=15 rounds by default. In each round the agent makes a decision (LLM call) then executes one or more tools. Planning is available as a tool the agent can call on itself — not a separate orchestration layer.
Each plan is stored as a PlanSnapshot that is tracked as a react.plan block in the timeline. On subsequent rounds the agent can update step statuses (✓ done, ✗ failed, … in-progress). This all happens within the same loop: no coordinator is needed, and there is no cache miss from a different system prompt.
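The decide-then-execute loop can be pictured with a minimal sketch. The names Decision, decide, and run_tool are hypothetical stand-ins and not the runtime's actual API; only the default cap of 15 rounds comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Decision:
    """Hypothetical result of one LLM decision call."""
    tool_calls: list = field(default_factory=list)  # [(tool_name, args), ...]
    final_answer: Optional[str] = None


def run_react_loop(decide: Callable[[list], Decision],
                   run_tool: Callable[[str, dict], object],
                   max_iterations: int = 15) -> tuple:
    """Alternate one decision and its tool calls until an answer or the cap."""
    timeline: list = []
    for round_no in range(1, max_iterations + 1):
        decision = decide(timeline)
        if decision.final_answer is not None:
            return decision.final_answer, round_no
        # A round may execute one or more tools before the next decision.
        for name, args in decision.tool_calls:
            timeline.append((name, run_tool(name, args)))
    # Cap reached without a final answer.
    return None, max_iterations
```

Because planning is just another tool in this picture, a plan call would travel through the same run_tool path as any other tool.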
ReAct Agent V2 — Loop & Tool Integration
Creating and Running the ReAct Agent
# In your workflow (BaseWorkflow subclass)
react = await self.build_react(
    scratchpad=scratchpad,
    tools_module="my_bundle.tools_descriptor",
    skills_module="my_bundle.skills_descriptor",
    knowledge_space_fn=self._search_knowledge,  # optional
    knowledge_read_fn=self._read_knowledge,  # optional
)
result = await react.run(payload)
Timeline
The timeline (timeline.py) is the single source of truth for turn context. It is persisted as artifact:conv.timeline.v1 and loaded at the start of each turn. A separate artifact:conv:sources_pool tracks all sources referenced in the conversation.
Cache checkpoints are computed by rounds (tool call rounds + final completion). They allow LLM context caching to skip retokenizing earlier parts of long conversations. See timeline-README.md, source-pool-README.md, and react-announce-README.md.
Multi-Channel Streaming
ReAct Agent Documentation
Full docs live in docs/sdk/agents/react/. Key files:
Architecture & Flow
Timeline & Artifacts
- timeline-README.md — blocks, cache points, compaction
- react-announce-README.md — announce channel
- source-pool-README.md
- artifact-storage-README.md
- artifact-discovery-README.md
- conversation-artifacts-README.md
Plan Tracking
Plans are a first-class timeline concept, not a separate orchestration layer. The agent creates and manages plans through the react.plan tool, and every plan is persisted as an append-only sequence of react.plan snapshot blocks in the timeline.
PlanSnapshot Structure
Each plan snapshot is stored as a timeline block of type react.plan with a stable plan_id and ordered steps. Key fields:
| Field | Description |
|---|---|
| plan_id | Stable identifier for the plan lineage (opaque string) |
| steps | Ordered list of step descriptions |
| status | Current plan status |
| origin_turn_id | Turn where the plan was first created |
| last_turn_id | Turn of the most recent update |
| closed_ts / superseded_ts | Terminal timestamps (set when plan is closed or replaced) |
The react.plan Tool
The agent manages plans through four lifecycle modes:
mode="new"
Creates a fresh plan lineage with a new plan_id and ordered steps. Becomes the current plan immediately and appears in ANNOUNCE.
mode="replace"
Retires an existing plan (marks it superseded) and creates a new lineage as its replacement. The old plan disappears from the open-plans view.
mode="activate"
Re-activates an older open plan as the current plan. Does not create a new plan_id. Progress acknowledgements apply only to the current plan.
mode="close"
Terminates a plan without replacement. The lineage stays in history but disappears from ANNOUNCE.
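The four modes can be read as transitions on a small in-memory model. The sketch below is illustrative only: PlanBook, the status labels, the generated plan_1-style identifiers, and the choice to clear the current plan when it is closed are assumptions, not the runtime's implementation.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Plan:
    plan_id: str
    steps: list
    status: str = "open"  # "open" | "superseded" | "closed" (labels assumed)


class PlanBook:
    """Toy model of the new / replace / activate / close modes."""

    def __init__(self):
        self.plans = {}
        self.current = None
        self._ids = itertools.count(1)

    def _create(self, steps):
        # A new lineage always gets a fresh plan_id and becomes current.
        plan_id = f"plan_{next(self._ids)}"
        self.plans[plan_id] = Plan(plan_id, list(steps))
        self.current = plan_id
        return plan_id

    def apply(self, mode, steps=None, plan_id=None):
        if mode == "new":
            return self._create(steps)
        if mode == "replace":
            # Retire the old lineage, then start its replacement.
            self.plans[plan_id].status = "superseded"
            return self._create(steps)
        if mode == "activate":
            if self.plans[plan_id].status != "open":
                raise ValueError("only open plans can be activated")
            self.current = plan_id  # no new plan_id is created
            return plan_id
        if mode == "close":
            self.plans[plan_id].status = "closed"
            if self.current == plan_id:
                self.current = None  # assumption: no automatic successor
            return plan_id
        raise ValueError(f"unknown mode: {mode}")

    def open_plans(self):
        return [p.plan_id for p in self.plans.values() if p.status == "open"]
```

Closed and superseded plans stay in the dictionary, which mirrors the point that a lineage remains in history even after it leaves the open-plans view.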
Plan Block in Timeline
Plans appear in the timeline as react.plan blocks with a stable reread handle:
# Stable latest-snapshot alias for any plan lineage
ar:plan.latest:<plan_id>
# Model creates a plan
react.plan(mode="new", steps=["collect metrics", "compare trends", "draft answer"])
# ANNOUNCE shows open plans with step markers
# [OPEN PLANS]
# plan_id=plan_alpha (current)
# □ [1] collect metrics
# □ [2] compare trends
# □ [3] draft answer
Step Statuses
The agent reports step progress via notes using status markers. The runtime parses these markers and updates the plan snapshot automatically.
| Marker | Status | Meaning |
|---|---|---|
| ✓ [n] | Done | Step completed successfully |
| ✗ [n] | Failed | Step failed or was abandoned |
| … [n] | In-progress | Step is currently being worked on |
| □ [n] | Pending | Step not yet started (default) |
When the agent calls react.plan(mode="activate"|"replace"|"close"), it should acknowledge progress in a later round, not the same one.
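Marker parsing of this kind can be sketched with a regular expression. The function names and the status labels are hypothetical; the runtime's real parser may accept a different note syntax.

```python
import re

# Marker characters taken from the table above; status labels are assumed.
MARKER_STATUS = {"✓": "done", "✗": "failed", "…": "in-progress", "□": "pending"}
MARKER_RE = re.compile(r"([✓✗…□])\s*\[(\d+)\]")


def parse_step_markers(note: str) -> dict:
    """Return {step_number: status} for every marker found in a note."""
    return {int(num): MARKER_STATUS[mark] for mark, num in MARKER_RE.findall(note)}


def apply_markers(statuses: dict, note: str) -> dict:
    """Return a copy of statuses updated from the note; unknown steps are ignored."""
    updated = dict(statuses)
    for step, status in parse_step_markers(note).items():
        if step in updated:
            updated[step] = status
    return updated
```

For example, a note such as "✓ [1] collected; … [2] comparing" would mark step 1 as done and step 2 as in-progress while leaving step 3 pending.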
Multi-Round Plan Tracking
Plans survive across rounds and turns through the following mechanisms:
- ANNOUNCE lists the last 4 open plans each round, marking the current one explicitly with (current).
- The stable alias ar:plan.latest:<plan_id> always resolves to the newest snapshot for a lineage, regardless of which turn last updated it.
- On a new turn, the runtime rehydrates only the current open plan automatically. Older plans must be inspected explicitly via react.read if they become relevant again.
- When history is compacted, older plans appear in a react.plan.history block with step skeletons, statuses, and stable snapshot_refs for recovery.
A plan lineage is considered open only if its latest snapshot is not closed, superseded, or complete. Only the plan tagged (current) in ANNOUNCE may receive step acknowledgements.
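The open-plan rule and the ANNOUNCE listing can be sketched as follows. The snapshot dictionaries, the step_statuses field, and the reading of "complete" as "every step done" are assumptions made for illustration.

```python
def is_open(snapshot: dict) -> bool:
    """Open unless the latest snapshot is closed, superseded, or complete."""
    if snapshot.get("closed_ts") is not None:
        return False
    if snapshot.get("superseded_ts") is not None:
        return False
    statuses = snapshot.get("step_statuses", [])  # hypothetical field
    # Assumption: a plan counts as complete when all of its steps are done.
    return not (statuses and all(s == "done" for s in statuses))


def announce_lines(latest_snapshots: list, current_id, limit: int = 4) -> list:
    """Render the last `limit` open plans, tagging the current one."""
    open_plans = [s for s in latest_snapshots if is_open(s)]
    lines = ["[OPEN PLANS]"]
    for snap in open_plans[-limit:]:
        suffix = " (current)" if snap["plan_id"] == current_id else ""
        lines.append(f"plan_id={snap['plan_id']}{suffix}")
    return lines
```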
See plan-README.md
Isolated Execution Runtime
The platform provides a sandboxed code execution runtime — your agent can generate and run arbitrary Python code in complete isolation. The runtime has a clear two-zone model:
- Supervisor: networked, has env secrets and full runtime context. All bundle tools from tools_descriptor.py execute here, including MCP tools, bundle-local tools, and custom SDK tools. The ChatCommunicator is also available to tool code, streaming events via Redis Pub/Sub to the client SSE.
- Executor: completely isolated, with no network, no env secrets, and a separate Linux namespace (UID 1001). Runs LLM-generated code. All tool calls are proxied to the Supervisor over a Unix socket. Can only write to /workspace/work and /workspace/out.
Two execution backends are practical:
🐳 Docker Default
Runs code in an isolated Docker container on the same EC2 host as the Processor. Low latency, ideal for interactive agentic loops. The container shares the host's Docker daemon — fast spin-up, full isolation.
execution:
  runtime:
    mode: "docker"
    enabled: true # default
☁️ AWS Fargate Async only
Serverless container on a separate compute plane. Recommended for long-running, non-live workloads — batch data processing, heavy computation, report generation — where startup latency (10–30s) is acceptable. Not suitable for fast interactive agentic loops.
execution:
  runtime:
    mode: "fargate"
    cluster: "arn:aws:ecs:..."
    task_definition: "exec-task"
Exec Environment Variables (Inside Executed Code)
| Variable | Description |
|---|---|
| WORKDIR | Working directory (source, helpers) |
| OUTPUT_DIR | Output directory (write files here) |
| EXECUTION_ID | Unique execution identifier |
| RUNTIME_GLOBALS_JSON | Serialized runtime context (tools, state) |
| RUNTIME_TOOL_MODULES | Tool module names available |
| BUNDLE_ROOT | Bundle root path (access your bundle files) |
| BUNDLE_ID | Current bundle ID |
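Executed code could gather these variables in one place, as in the sketch below. The helper name load_exec_context is hypothetical, and it assumes RUNTIME_GLOBALS_JSON holds a JSON object; the encoding of RUNTIME_TOOL_MODULES is not documented here, so the value is left as a raw string.

```python
import json
import os
from pathlib import Path


def load_exec_context(env=None) -> dict:
    """Collect the documented exec variables into one dictionary."""
    env = os.environ if env is None else env
    return {
        "workdir": Path(env.get("WORKDIR", ".")),
        "output_dir": Path(env.get("OUTPUT_DIR", ".")),
        "execution_id": env.get("EXECUTION_ID", ""),
        # Assumed to be a JSON object; an unset variable becomes {}.
        "runtime_globals": json.loads(env.get("RUNTIME_GLOBALS_JSON") or "{}"),
        # Kept raw because its format is an unknown in this sketch.
        "tool_modules_raw": env.get("RUNTIME_TOOL_MODULES", ""),
        "bundle_root": Path(env.get("BUNDLE_ROOT", ".")),
        "bundle_id": env.get("BUNDLE_ID", ""),
    }
```

Passing a plain dictionary instead of os.environ makes the helper easy to exercise outside the sandbox.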
Supervisor vs Executor Architecture
The execution runtime uses a strict two-process model within a single container (Docker or Fargate):
- The Supervisor (PID 1) bootstraps the full runtime: loads dynamic tool modules, initializes ModelService, KB client, Redis communication, and starts a PrivilegedSupervisor listening on /tmp/supervisor.sock.
- The Executor subprocess drops privileges to UID 1001, optionally calls unshare(CLONE_NEWNET) for network isolation, and runs the LLM-generated user_code.py.
- Every tool call from executor code (io_tools, web_tools, react_tools, etc.) is proxied over the Unix socket to the supervisor. The executor never has direct access to network, secrets, or databases.
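The proxying pattern in the list above can be illustrated with a plain Unix domain socket and a newline-delimited JSON message. This is a sketch of the general technique, not the PrivilegedSupervisor wire protocol: the message shape, the function names, and the single-request server are all assumptions.

```python
import json
import os
import socket
import tempfile
import threading


def serve_once(sock_path: str, tools: dict, ready: threading.Event) -> None:
    """Supervisor side: accept one connection and answer one tool call."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(1)
        ready.set()
        conn, _ = server.accept()
        with conn, conn.makefile("rw", encoding="utf-8") as stream:
            request = json.loads(stream.readline())
            result = tools[request["tool"]](**request["args"])
            stream.write(json.dumps({"ok": True, "result": result}) + "\n")
            stream.flush()


def call_tool(sock_path: str, tool: str, **args):
    """Executor side: send one request over the socket and wait for the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        with client.makefile("rw", encoding="utf-8") as stream:
            stream.write(json.dumps({"tool": tool, "args": args}) + "\n")
            stream.flush()
            return json.loads(stream.readline())["result"]


def demo() -> int:
    """Run both sides in one process, using a thread as the supervisor."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "supervisor.sock")
        ready = threading.Event()
        tools = {"add": lambda a, b: a + b}
        worker = threading.Thread(target=serve_once, args=(path, tools, ready))
        worker.start()
        ready.wait(timeout=5)
        try:
            return call_tool(path, "add", a=2, b=3)
        finally:
            worker.join(timeout=5)
```

The caller only ever touches a filesystem socket, which is why this style of proxy keeps working when the calling process has no network access.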
Docker Execution Mode
In Docker mode, the executor's network isolation comes from unshare(CLONE_NEWNET). Docker mode supports custom images, CPU/memory limits, and PID limits via bundle configuration.
# Docker profile in bundle props
execution:
  runtime:
    profiles:
      docker:
        mode: "docker"
        image: "py-code-exec:latest"
        network_mode: "host"
        cpus: "1.5"
        memory: "2g"
        extra_args: ["--pids-limit", "256"]
Fargate Execution Mode
Fargate exec runs the same supervisor/executor architecture as Docker, but on a dedicated ECS Fargate task instead of a local container. This is the replacement for Docker-on-node in environments where Fargate containers cannot access the Docker daemon.
| Aspect | Docker Mode | Fargate Mode |
|---|---|---|
| Startup latency | Sub-second | 10-30 seconds |
| Workdir sharing | Host bind mount | S3 snapshot + restore |
| Network isolation | unshare(CLONE_NEWNET) | Task-level VPC security group |
| Task lifetime | Container exits, docker rm | ECS task STOPPED |
| Caller waits via | proc.communicate() | Poll describe_tasks until STOPPED |
| Best for | Interactive agentic loops | Batch workloads, heavy computation |
The caller (chat-proc) snapshots the workdir and outdir to S3, launches the Fargate task via ecs.run_task, polls until completion, then restores output zips back to the local workspace. From the agent's perspective, the result contract is identical to Docker mode.
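The polling step can be sketched against any client object that exposes describe_tasks in the shape of the boto3 ECS client. The function name, the poll interval, and the timeout are assumptions for illustration; the S3 snapshot and restore steps are omitted.

```python
import time


def wait_for_task(ecs, cluster: str, task_arn: str,
                  poll_seconds: float = 5.0,
                  timeout_seconds: float = 900.0,
                  sleep=time.sleep, clock=time.monotonic) -> dict:
    """Poll describe_tasks until the task reports STOPPED, then return it."""
    deadline = clock() + timeout_seconds
    while True:
        response = ecs.describe_tasks(cluster=cluster, tasks=[task_arn])
        tasks = response.get("tasks", [])
        if tasks and tasks[0].get("lastStatus") == "STOPPED":
            return tasks[0]
        if clock() >= deadline:
            raise TimeoutError(f"task {task_arn} did not stop in time")
        sleep(poll_seconds)
```

With boto3, the ecs argument would be boto3.client("ecs"); the sleep and clock parameters exist only so the loop can be exercised without real waiting.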
Environment Variable Injection
The Fargate task receives its full runtime context via containerOverrides.environment at run_task time. Key variables include RUNTIME_GLOBALS_JSON (serialized runtime context), RUNTIME_TOOL_MODULES, EXEC_SNAPSHOT URIs for workspace restoration, and all proc-level secrets (Postgres, Redis, API keys). Bundle tool module paths are rewritten from host paths to container paths (/workspace/bundles/{bundle_dir}/...).
Network Isolation & Unix Socket Communication
In both Docker and Fargate modes, the executor subprocess is network-isolated. All tool calls from generated code are routed over a Unix domain socket (/tmp/supervisor.sock) to the supervisor process. The supervisor has full access to Redis, Postgres, ModelService, S3, and external APIs. In Fargate, the supervisor connects to backing services via VPC DNS (Cloud Map private DNS or direct ElastiCache/RDS endpoints).
Error Propagation
Runtime-specific failures (ECS startup failure, Fargate timeout, snapshot restore failure) are surfaced through the same report_text / error envelope as local Docker execution. The agent sees a unified result contract regardless of backend:
# Unified result fields (both Docker and Fargate)
ok: bool # execution succeeded
artifacts: list # produced files
error: str # error message if failed
report_text: str # human-readable summary
user_out_tail: str # last lines of user.log
runtime_err_tail: str # last lines of runtime errors
See distributed-exec-README.md and exec-logging-error-propagation-README.md
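On the consuming side, the unified fields map naturally onto a small record type. The ExecResult class and the describe helper below are hypothetical conveniences built from the field list above, not types exported by the platform.

```python
from dataclasses import dataclass, field


@dataclass
class ExecResult:
    """Mirror of the unified result fields shared by Docker and Fargate."""
    ok: bool
    artifacts: list = field(default_factory=list)
    error: str = ""
    report_text: str = ""
    user_out_tail: str = ""
    runtime_err_tail: str = ""


def describe(result: ExecResult) -> str:
    """Produce a one-line status without caring which backend ran the code."""
    if result.ok:
        return f"ok: {len(result.artifacts)} artifact(s)"
    detail = result.error or result.runtime_err_tail or "unknown error"
    return f"failed: {detail}"
```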
Knowledge Space
Bundles can expose a searchable knowledge space built from a Git repository's docs, source code, deployment configs, and tests.
return {
    "knowledge": {
        "repo": "https://github.com/org/repo.git",  # "" = local repo
        "ref": "main",
        "docs_root": "app/docs",
        "src_root": "app/src",
        "deploy_root": "app/deploy",
        "tests_root": "app/tests",
        "validate_refs": True,
    }
}
- on_bundle_load(): builds the index once per process (file-locked, signature-cached)
- pre_run_hook(): reconciles if config changed
The agent accesses the knowledge space through ks: paths, using react.search_knowledge(query=..., limit=5) and react.read(["ks:docs/architecture.md"]).
Context, RAG & Conversations
Context RAG Client
# self.ctx_client is ContextRAGClient
results = await self.ctx_client.search(
    query="previous analysis of sales data",
    kind="assistant",  # or "user" | "attachment"
    limit=5,
)
artifact = await self.ctx_client.fetch_ctx(["ar:turn_abc.artifacts.summary"])
Conversations API Endpoints
GET /conversations/{tenant}/{project}
POST /conversations/{tenant}/{project}/fetch
POST /conversations/{tenant}/{project}/{conv_id}/turns-with-feedbacks
POST /conversations/{tenant}/{project}/feedback/conversations-in-period
The react.memsearch tool provides vector search in past turns directly inside the agent loop. The ConversationStore (accessible via BaseWorkflow.store) manages turn payloads, timelines, and artifacts.
Timeline & Context Layout
Each conversation maintains a rolling timeline of turn artifacts stored as artifact:conv.timeline.v1. The timeline is the canonical cross-turn context passed to the LLM. It is structured as an ordered sequence of turn records, each containing user input, assistant output, tool calls, and any attached artifacts.
Cache Points
The platform inserts up to three LLM-level cache checkpoints per turn: prev-turn (the end of the prior turn), pre-tail (just before the current turn's tail), and tail (after the current turn). These cache points allow the LLM inference layer to reuse context prefix KV-cache across turns, reducing both latency and token cost for multi-turn conversations.
Compaction
When the accumulated timeline approaches the configured context budget ceiling, the platform triggers compaction: older turn ranges are summarized into a compact conv.range.summary artifact and replaced in the timeline. This is a hard-ceiling guard — it ensures context never silently overflows the model's context window. Compaction is transparent to bundle code.
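The compaction idea can be sketched in a few lines. Everything here is an assumption made for illustration: character counts stand in for tokens, the record shape is invented, and the summary text is a placeholder where the platform would produce a real summary.

```python
def total_chars(records: list) -> int:
    """Crude size estimate; a real budget would count tokens."""
    return sum(len(r["text"]) for r in records)


def compact(timeline: list, budget_chars: int, keep_recent: int = 2) -> list:
    """Fold older turns into one summary record once the budget is exceeded."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if total_chars(timeline) <= budget_chars or len(timeline) <= keep_recent:
        return timeline
    old, recent = timeline[:-keep_recent], timeline[-keep_recent:]
    summary = {
        "type": "conv.range.summary",
        # Placeholder text; the real artifact would summarize the old range.
        "text": f"summary of {len(old)} earlier turn(s)",
    }
    return [summary] + recent
```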
Hosting & File Resources
Your bundle can produce files (PDFs, PNGs, data exports) and make them available via hosted URLs. The platform handles upload, serving, and access control automatically.
# ApplicationHostingService (via BaseWorkflow.hosting_service)
hosting = self.hosting_service
url = hosting.get_artifact_url("fi:turn_123.outputs/export/report.pdf")
# Resource Name format
# ef:{tenant}:{project}:chatbot:{stage}:{user_id}:{conv_id}:{turn_id}:{role}:{path}
# Resolved by POST /by-rn with authentication enforced by platform
Files written to OUTPUT_DIR/turn_{id}/files/ remain part of the durable workspace tree, while files written to OUTPUT_DIR/turn_{id}/outputs/ are tracked as non-workspace produced artifacts. User-facing downloads should typically come from outputs/ with external visibility. User attachments appear as fi:{turn_id}.user.attachments/{filename}.
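The Resource Name layout shown above can be split and rebuilt with a fixed number of colon-separated fields. The ResourceName type and both helpers are hypothetical, and the sketch assumes that only the trailing path may itself contain colons.

```python
from typing import NamedTuple


class ResourceName(NamedTuple):
    tenant: str
    project: str
    stage: str
    user_id: str
    conv_id: str
    turn_id: str
    role: str
    path: str


def parse_rn(rn: str) -> ResourceName:
    """Split ef:{tenant}:{project}:chatbot:{stage}:...:{role}:{path}."""
    parts = rn.split(":", 9)  # at most 10 fields; the path keeps any colons
    if len(parts) != 10 or parts[0] != "ef" or parts[3] != "chatbot":
        raise ValueError(f"not a valid resource name: {rn!r}")
    _, tenant, project, _, stage, user_id, conv_id, turn_id, role, path = parts
    return ResourceName(tenant, project, stage, user_id, conv_id, turn_id, role, path)


def format_rn(name: ResourceName) -> str:
    """Inverse of parse_rn."""
    return ":".join([
        "ef", name.tenant, name.project, "chatbot", name.stage,
        name.user_id, name.conv_id, name.turn_id, name.role, name.path,
    ])
```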
Attachments & Limits
User-uploaded files enter the system via the chat API (SSE or Socket.IO), pass through security scanning, are stored in the ConversationStore, and then flow to two downstream paths: multimodal LLM inference (as base64 blocks) and code execution (as rehosted files in the workspace).
User Upload Flow
When a user submits attachments, the ingress layer enforces size caps and runs security preflight before storage:
- Collect raw bytes + metadata (filename, MIME type)
- Enforce per-file and total-message size caps
- Run ClamAV antivirus scan (when APP_AV_SCAN=1, always enabled in production)
- Run preflight validation: MIME-type allowlist via magic sniffing, PDF heuristic checks, ZIP/OOXML structural checks, macro blocking
- If allowed, store via ConversationStore.put_attachment()
Macro-enabled content (.docm, .pptm, VBA projects) is rejected at ingress. Generic ZIP archives are also disallowed by default.
Supported File Types
| Category | Accepted Types |
|---|---|
| Documents | application/pdf, .docx, .pptx, .xlsx |
| Images | image/jpeg, image/png, image/gif, image/webp |
| Text | text/* (subject to size limit) |
File Rehosting for Execution
For code-generated programs, attachments are rehosted into the execution workspace at workdir/<turn_id>/attachments/<filename>, so generated code can read them as local files inside the sandboxed container.
Artifact Size & Count Limits
| Limit | Value |
|---|---|
| Per-image cap | 5 MB (MODALITY_MAX_IMAGE_BYTES) |
| Per-PDF cap | 10 MB (MODALITY_MAX_DOC_BYTES) |
| Total message cap (text + attachments) | 25 MB (MESSAGE_MAX_BYTES) |
| PDF max pages | 500 |
| ZIP max entries | 2,000 |
| ZIP max uncompressed total | 120 MB |
| ZIP max compression ratio | 200x |
| Text file max size | 10 MB |
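A size check against the first three caps might look like the sketch below. The constant names come from the table, while the function, the tuple shape for attachments, and the reading of 1 MB as 1024 * 1024 bytes are assumptions.

```python
MB = 1024 * 1024  # assumption: binary megabytes
MODALITY_MAX_IMAGE_BYTES = 5 * MB
MODALITY_MAX_DOC_BYTES = 10 * MB
MESSAGE_MAX_BYTES = 25 * MB


def check_message(text: str, attachments: list) -> list:
    """Return a list of violations for (filename, mime, size_bytes) tuples."""
    problems = []
    total = len(text.encode("utf-8"))
    for filename, mime, size in attachments:
        total += size
        if mime.startswith("image/") and size > MODALITY_MAX_IMAGE_BYTES:
            problems.append(f"{filename}: image exceeds 5 MB")
        elif mime == "application/pdf" and size > MODALITY_MAX_DOC_BYTES:
            problems.append(f"{filename}: PDF exceeds 10 MB")
    # The total cap covers the message text plus every attachment.
    if total > MESSAGE_MAX_BYTES:
        problems.append("message exceeds 25 MB in total")
    return problems
```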
Timeline Truncation Limits
To prevent context blowup, the platform applies truncation policies to older timeline blocks:
| Limit | Default |
|---|---|
| User/assistant text truncation | 4,000 chars |
| Tool result text truncation | 400 chars |
| Tool result list items cap | 50 items |
| Tool result dict keys cap | 80 keys |
| Base64 in timeline blocks | 4,000 chars (oversized replaced with placeholder) |
| Sources pool base64 cap | 4,000 chars (dropped if exceeded) |
The agent can use react.read to rehydrate hidden or pruned artifacts when needed. Skills loaded by react.read are pruned in old turns with a placeholder containing the original sk: reference for re-reading.
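The tool-result limits from the table can be pictured as a recursive trim. The function names and the wording of the truncation marker are hypothetical; only the default numbers (400 chars, 50 items, 80 keys) come from the table.

```python
def truncate_text(text: str, limit: int) -> str:
    """Cut text to `limit` characters and note how much was dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} chars]"


def truncate_tool_result(value, text_limit: int = 400,
                         list_cap: int = 50, dict_cap: int = 80):
    """Apply the text, list, and dict caps to a nested tool result."""
    if isinstance(value, str):
        return truncate_text(value, text_limit)
    if isinstance(value, list):
        return [truncate_tool_result(v, text_limit, list_cap, dict_cap)
                for v in value[:list_cap]]
    if isinstance(value, dict):
        kept = list(value.items())[:dict_cap]
        return {k: truncate_tool_result(v, text_limit, list_cap, dict_cap)
                for k, v in kept}
    return value
```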
Citations & Sources
Citation Tokens
The company was founded in 2015 [[S:1]] and expanded by 2020 [[S:2,3]].
According to multiple sources [[S:1-4]], the trend is clear.
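Tokens in the three forms shown above (single ID, comma list, range) can be extracted with a regular expression. The helper names are hypothetical, and the sketch assumes ranges are inclusive and that IDs are reported in first-seen order.

```python
import re

# Matches [[S:1]], [[S:2,3]], [[S:1-4]] and mixes such as [[S:1-3,7]].
CITATION_RE = re.compile(r"\[\[S:(\d+(?:-\d+)?(?:\s*,\s*\d+(?:-\d+)?)*)\]\]")


def expand_sids(spec: str) -> list:
    """Expand '1-4' or '2,3' into a list of integer source IDs."""
    sids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            sids.extend(range(start, end + 1))  # ranges assumed inclusive
        else:
            sids.append(int(part))
    return sids


def cited_sids(text: str) -> list:
    """Return every cited source ID once, in order of first appearance."""
    seen = []
    for match in CITATION_RE.finditer(text):
        for sid in expand_sids(match.group(1)):
            if sid not in seen:
                seen.append(sid)
    return seen
```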
Sources Pool Fields
| Field | Description |
|---|---|
| sid | Source ID (integer, per-conversation, deduplicated) |
| title | Page or file title |
| url | URL or file path |
| source_type | web, file, attachment, or manual |
| objective_relevance | Semantic relevance score (0–1) |
| published_time_iso | Publication timestamp |
| favicon_url | Source favicon for UI display |
Feedback System
POST /conversations/{tenant}/{project}/{conv_id}/turns/{turn_id}/feedback
{ "reaction": "ok", "text": "Very helpful!", "ts": "2026-03-21T10:00:00Z" }
# reaction: ok | not_ok | neutral | null
Your bundle can also emit machine feedback (origin: "machine") for confidence scores or quality checks — additive, not replacing user feedback. Satisfaction rate: ok / (ok + not_ok + neutral).
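The satisfaction formula is simple enough to state as code. The function name is hypothetical, and the sketch assumes that null reactions are left out of the denominator, as the formula suggests, and that an empty denominator yields no rate at all.

```python
from collections import Counter
from typing import Optional


def satisfaction_rate(reactions: list) -> Optional[float]:
    """Compute ok / (ok + not_ok + neutral); None when nothing is counted."""
    counts = Counter(r for r in reactions if r in ("ok", "not_ok", "neutral"))
    denominator = counts["ok"] + counts["not_ok"] + counts["neutral"]
    if denominator == 0:
        return None
    return counts["ok"] / denominator
```

For the reactions ok, ok, not_ok, neutral, and one null, the rate is 2 / 4 = 0.5.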