ReAct V3 Agent
The featured ReAct agent is v3: a single autonomous loop with no separate planner and no gate. It preserves the timeline-first runtime model while adding safe multi-action rounds when AI_REACT_AGENT_MULTI_ACTION=safe_fanout is enabled. In normal single-action operation, each round still makes a decision and executes one action: either one tool call or a final-answer action. The base round cap resolves from bundle props config.react.max_iterations / react.max_iterations, then assembly/env ai.react.max_iterations / AI_REACT_MAX_ITERATIONS, then fallback 15. Planning is available as a tool the agent can call on itself, not a separate orchestration layer.
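The resolution order for the base round cap can be sketched as below. This is a minimal illustration rather than the runtime's own code: the helper names `resolve_max_iterations` and `_lookup`, and the use of plain nested dicts for bundle props and assembly config, are assumptions. Only the key names, the environment variable, and the fallback of 15 come from the text above.

```python
import os
from collections.abc import Mapping
from typing import Any, Optional

FALLBACK_MAX_ITERATIONS = 15  # fallback value named in the text


def _lookup(data: Mapping, dotted: str) -> Optional[Any]:
    """Walk a nested mapping by a dotted key; return None if any part is missing."""
    node: Any = data
    for part in dotted.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


def resolve_max_iterations(
    bundle_props: Mapping,
    assembly: Mapping,
    env: Optional[Mapping] = None,
) -> int:
    """Hypothetical helper: return the first configured value in priority order."""
    env = os.environ if env is None else env
    candidates = [
        _lookup(bundle_props, "config.react.max_iterations"),
        _lookup(bundle_props, "react.max_iterations"),
        _lookup(assembly, "ai.react.max_iterations"),
        env.get("AI_REACT_MAX_ITERATIONS"),
    ]
    for value in candidates:
        if value is not None:
            return int(value)
    return FALLBACK_MAX_ITERATIONS
```

With no configuration at all the sketch returns 15; a bundle-level value wins over both the assembly value and the environment variable.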
ReAct is also not built around provider-native tool calling as the model contract. The runtime asks for its own generation shape directly, so thinking, decision JSON, raw code, and other channels do not have to be squeezed into assistant/tool/result framing.
AI_REACT_AGENT_VERSION=v3. In AI_REACT_AGENT_MULTI_ACTION=safe_fanout mode it can accept multiple action requests in the same round by repeating only the <channel:ReactDecisionOutV2>...</channel:ReactDecisionOutV2> block once per action. The intended model contract still emits one thinking block and one code block per response, but the streamer tolerates repeated declared channels and routes them separately. Accepted multi-action bundles are executed sequentially, not in parallel, so an action scheduled later in the same round must not depend on results from an earlier action in that bundle.
v3 uses the same timeline, working-summary, react.read, react.memsearch, and artifact path model.
PlanSnapshot that is tracked as a react.plan block in the timeline. On subsequent rounds the agent can update step statuses (✓ done, ✗ failed, … in-progress). This is all within the same loop: no coordinator needed, and no cache miss from a different system prompt.
ReAct V3 Agent: Detailed Loop & Tool Integration
What the Agent Sees Each Round
Every decision round is rendered from durable runtime state into one model-facing context. The shape is intentional: old conversation memory appears first, the active turn stays readable, and ANNOUNCE stays in the uncached tail where operational facts cannot be hidden behind stale cache.
[COMPACTED PRIOR CONVERSATION MEMORY] # if older raw turns were compacted
PRUNED PRIOR TURNS # working summaries or retrieval rows
RECENT INTACT TURNS # newest turns rendered normally
CURRENT TURN # current user input, rounds, tools, files
SOURCES POOL # citation/source inventory
ANNOUNCE # uncached budget, plans, live events, workspace state
The model replies through runtime-owned channels. The primary structured channel is ReactDecisionOutV2, which carries exactly one action per channel instance: call a tool, complete, or exit. In safe fanout mode, the model may emit repeated ReactDecisionOutV2 channel instances in the same response. It must not put multiple JSON decisions into one channel block. Repeated declared channels are tolerated by the streamer: for example, a second thinking block is emitted as another thinking instance. The code channel is parsed as raw executable text, so backticks inside generated HTML, JavaScript, or Python do not hide the </channel:code> boundary.
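The channel framing described above can be illustrated with a small offline parser. This is a sketch under stated assumptions, not the platform's streamer (which works incrementally on a token stream): the function names, the `thinking` channel tag, and the JSON fields inside each decision are hypothetical. Only the `<channel:NAME>...</channel:NAME>` framing and the one-decision-per-instance rule come from the text.

```python
import json
import re
from typing import Dict, List

# Matches <channel:NAME>...</channel:NAME>. The backreference requires the
# closing tag to name the same channel, so backticks or other markup inside
# the body do not end the block early.
CHANNEL_RE = re.compile(r"<channel:([A-Za-z0-9_]+)>(.*?)</channel:\1>", re.DOTALL)


def split_channels(response: str) -> Dict[str, List[str]]:
    """Group channel bodies by name, keeping repeated instances in order."""
    instances: Dict[str, List[str]] = {}
    for name, body in CHANNEL_RE.findall(response):
        instances.setdefault(name, []).append(body.strip())
    return instances


def parse_decisions(response: str) -> List[dict]:
    """Parse exactly one JSON decision per ReactDecisionOutV2 instance."""
    bodies = split_channels(response).get("ReactDecisionOutV2", [])
    return [json.loads(body) for body in bodies]
```

A response with two `ReactDecisionOutV2` blocks yields two decisions in emission order, while the `code` body is returned as raw text without any fence handling.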
Creating and Running the ReAct Agent
# In your workflow (BaseWorkflow subclass)
react = await self.build_react(
    scratchpad=scratchpad,
    tools_module="my_bundle.tools_descriptor",
    skills_module="my_bundle.skills_descriptor",
    knowledge_space_fn=self._search_knowledge,  # optional
    knowledge_read_fn=self._read_knowledge,  # optional
)
result = await react.run(payload)
Timeline
The timeline (timeline.py) is the single source of truth for turn context. It is persisted as artifact:conv.timeline.v1 and loaded at the start of each turn. A separate artifact:conv:sources_pool tracks all sources referenced in the conversation; the timeline payload also carries the current full source rows so react.read and exec fetch_ctx can recover fetched web content.
It is also a live event surface while a turn is running. Busy-conversation external events such as followup, steer, and the same general family of reactive inputs used by forms, wizards, or alert acknowledgements enter a shared event source for the conversation. Every such event lands on the append-only timeline first. If the event is reactive and the active React turn owns the live listener, the runtime folds it into the current turn and re-enters the loop at the next decision boundary; steer-like controls may also trigger an engineering-layer interrupt that tries to cancel the active generation or cancellable tool phase immediately. The timeline also records round start explicitly, so a reactive event that arrives while the agent is already thinking is shown inside that open round instead of looking as if it happened before the round existed. If a new reactive event lands after a visible completion attempt, the same turn can later append another assistant.completion; the latest completion keeps the stable path ar:<turn_id>.assistant.completion and earlier visible completions use numbered paths such as ar:<turn_id>.assistant.completion.1. External-event message blocks follow the same path pattern, for example ar:<turn_id>.external.followup.<message_id> or ar:<turn_id>.external.alert.<message_id>. The raw timeline remains append-only; restored conversation order is reconstructed later from truthful event start timestamps rather than from append position alone. React then re-enters with the new reactive material already on the timeline and gets a short bounded finalize phase to close the turn cleanly while preserving the progress made so far.
When context compaction actually starts or completes, the runtime emits chat.compaction on the dedicated chat_compaction transport route. Browser clients and adapters such as Telegram can append this as a short progress item while the same ReAct turn continues running.
Cache checkpoints are computed by rounds (tool call rounds + final completion). They allow LLM context caching to skip retokenizing earlier parts of long conversations. See timeline-README.md, source-pool-README.md, and react-announce-README.md.
Pruning, Compaction, and Recovery
TTL pruning and hard compaction do different jobs. TTL pruning keeps the cache useful by replacing older visible blocks with compact recovery rows. Hard compaction is the context-window safety valve: it summarizes an older range into su:<turn_id>.conv.range.summary and removes the compacted raw blocks from the visible stream. Neither process deletes artifacts, tool logs, turn logs, or source rows. The visible text becomes a map; logical paths remain the handles for exact recovery. Compaction lifecycle is visible to clients through chat_compaction stream events.
| Layer | Visible Shape | Recovery Route |
|---|---|---|
| Working summary | ws:<turn_id>.conv.working.summary with goal, outcome, facts, refs | react.read([ws_path]), then read exact refs |
| TTL-pruned turn | compact turn data rows or summary cards, not full old chatter | react.read([ar/tc/fi/so path]) |
| Compacted range | [COMPACTED PRIOR CONVERSATION MEMORY] checkpoint | react.memsearch or paths carried by the summary |
| Exact file | fi: logical file path | react.pull([fi_path]) when code needs a local file |
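The table's routing can be read as a prefix lookup. The sketch below only formats the call the agent would make; `recovery_call` is a hypothetical helper, and the set of readable prefixes is taken from the path families named in this document.

```python
# Prefixes that this document says react.read can reopen.
READABLE_PREFIXES = {"ar", "tc", "fi", "so", "ws", "su", "sk", "ks"}


def recovery_call(path: str, needs_local_file: bool = False) -> str:
    """Return the tool call (as text) that would reopen a logical path."""
    prefix = path.split(":", 1)[0]
    if prefix == "fi" and needs_local_file:
        # Code execution needs the file materialized on the worker.
        return f'react.pull(["{path}"])'
    if prefix in READABLE_PREFIXES:
        return f'react.read(["{path}"])'
    raise ValueError(f"no recovery route sketched for {path!r}")
```

When only a topic is known and no path is visible, this lookup does not apply and the route starts from react.memsearch instead.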
Multi-Channel Streaming
react.memsearch, and is not shown as user-facing assistant text.
These are runtime-defined channels, not tool-call arguments. ReAct can stream raw code, thinking, decision JSON, and widget/subsystem payloads independently instead of forcing generation into a provider-native tool-calling format. Repeated declared channels are handled as repeated channel instances, and raw code is kept isolated even when generated HTML/JS contains backticks.
This is what enables live-updating widget dashboards while the agent is still running.
When those dashboards are part of a bundle, they remain normal bundle UI surfaces served by KDCube. A client shell may embed them, but iframe embedding is outside the ReAct protocol and outside the bundle surface model.
See channeled-streamer-README.md and streaming-widget-README.md.
Built-In React Tool Surface
The built-in react.* tools are control-plane tools for the loop itself. Bundle tools, MCP tools, web/email tools, and isolated exec tools sit beside them, but these are the primitives the agent uses to manage memory, files, plans, and context size.
Large initial tool results are prompt-capped before the next decision round: the full tc: result remains stored, while the model sees a bounded preview with size metadata, a depth-limited shape, and recovery instructions for react.read or exec ctx_tools.fetch_ctx.
| Tool | Purpose | Typical Use |
|---|---|---|
| react.read | Reopen logical paths and exact ranges | Read ar:, tc:, fi:, so:, ws:, su:, sk:, or ks: refs. Large text returns a configured bounded preview by default. For large text files, pass items=[{path,line_start,line_count}] from react.rg to materialize line-numbered ranges. Text previews report fully visible lines as [start-end]/total; a mid-line cut is marked separately. max_text_symbols requests a smaller explicit preview, and stats_only returns metadata without content. PDF/image payloads are attached whole only when under the raw byte cap. |
| react.rg | Find local files and text regions | Search materialized artifact files by filename regex and text-like files by content regex. Roots may be files/..., outputs/..., attachments/..., turn_.../..., or fi:.... It does not search unpulled refs or the endless conversation timeline. Returns size_bytes, text_symbols, line_count, logical_path, and ready-to-read read_item ranges for react.read. |
| react.memsearch | Search prior conversation memory | Find summaries by topic, ordinal turn, or time window |
| react.pull | Materialize historical files | Bring old fi: refs onto the current worker as readonly local reference material. Pull before checkout when a prior turn path must become editable. |
| react.checkout | Rebuild an editable workspace | Copy prior files/ paths into the current editable workspace. It is for editing and testing current-turn copies, not for simply reading old outputs or attachments. |
| react.write / react.patch | Create or edit current-turn text artifacts | Write Markdown, HTML, JSON, notes, or internal files, and patch current workspace files. Text previews may be line-numbered for reading; those prefixes are display-only and must never be generated in patch or replacement content. channel="internal" creates an internal file by default; add scratchpad=true only for short inline react.note anchors. |
| react.plan | Manage open plans | Create, replace, activate, close, and update step state |
| react.hide | Shrink visible tail blocks | Replace bulky but recoverable content with a short placeholder |
Visible read limits are unit-specific and apply per requested path: text previews use text-character and token caps; all payloads use a raw byte cap. Unsupported binaries remain metadata-only and should be inspected through exec or related text/source refs.
ReAct V3 Agent Documentation
Full docs live in docs/sdk/agents/react/. Key files:
Architecture & Flow
Timeline & Artifacts
- timeline-README.md: blocks, cache points, compaction
- context-caching-README.md: cache checkpoints, TTL pruning, hide
- compaction-README.md: hard ceiling and compacted memory
- react-announce-README.md: announce channel
- source-pool-README.md
- artifact-storage-README.md
- artifact-discovery-README.md
- conversation-artifacts-README.md
Tools & Execution
- react-tools-README.md: built-in react.* tools
- memory-recovery-path-README.md: memsearch, read, turn index
- external-exec-README.md
- tool-call-blocks-README.md
- event-blocks-README.md
- turn-log-README.md
- turn-data-README.md
Plan Tracking
Plans are a first-class timeline concept, not a separate orchestration layer. The agent creates and manages plans through the react.plan tool, and every plan is persisted as an append-only sequence of react.plan snapshot blocks in the timeline.
PlanSnapshot Structure
Each plan snapshot is stored as a timeline block of type react.plan with a stable plan_id and ordered steps. Key fields:
| Field | Description |
|---|---|
| plan_id | Stable identifier for the plan lineage (opaque string) |
| steps | Ordered list of step descriptions |
| status | Current plan status |
| origin_turn_id | Turn where the plan was first created |
| last_turn_id | Turn of the most recent update |
| closed_ts / superseded_ts | Terminal timestamps (set when plan is closed or replaced) |
The react.plan Tool
The agent manages plans through four lifecycle modes:
mode="new"
Creates a fresh plan lineage with a new plan_id and ordered steps. Becomes the current plan immediately and appears in ANNOUNCE.
mode="replace"
Retires an existing plan (marks it superseded) and creates a new lineage as its replacement. The old plan disappears from the open-plans view.
mode="activate"
Re-activates an older open plan as the current plan. Does not create a new plan_id. Progress acknowledgements apply only to the current plan.
mode="close"
Terminates a plan without replacement. The lineage stays in history but disappears from ANNOUNCE.
Plan Block in Timeline
Plans appear in the timeline as react.plan blocks with a stable reread handle:
# Stable latest-snapshot alias for any plan lineage
ar:plan.latest:<plan_id>
# Model creates a plan
react.plan(mode="new", steps=["collect metrics", "compare trends", "draft answer"])
# ANNOUNCE shows open plans with step markers
# [OPEN PLANS]
# plan_id=plan_alpha (current)
# □ [1] collect metrics
# □ [2] compare trends
# □ [3] draft answer
Step Statuses
The agent reports step progress via notes using status markers. The runtime parses these markers and updates the plan snapshot automatically.
| Marker | Status | Meaning |
|---|---|---|
| ✓ [n] | Done | Step completed successfully |
| ✗ [n] | Failed | Step failed or was abandoned |
| … [n] | In-progress | Step is currently being worked on |
| □ [n] | Pending | Step not yet started (default) |
When the agent calls react.plan(mode="activate"|"replace"|"close"), it should acknowledge progress in a later round, not the same one.
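Marker parsing of this kind can be sketched with one regular expression. The function name `parse_step_updates` and the status strings are illustrative assumptions; the four markers and the `[n]` step index come from the table above. How the runtime actually applies the parsed updates to a snapshot is not shown here.

```python
import re
from typing import Dict

# Marker -> status, as listed in the Step Statuses table.
MARKER_STATUS = {"✓": "done", "✗": "failed", "…": "in-progress", "□": "pending"}
MARKER_RE = re.compile(r"([✓✗…□])\s*\[(\d+)\]")


def parse_step_updates(note: str) -> Dict[int, str]:
    """Map step number -> status for every marker found in a note."""
    return {int(n): MARKER_STATUS[m] for m, n in MARKER_RE.findall(note)}
```

If a note mentions the same step twice, the later marker wins in this sketch because dict keys are overwritten in order.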
Multi-Round Plan Tracking
Plans survive across rounds and turns through the following mechanisms:
- ANNOUNCE lists the last 4 open plans each round, marking the current one explicitly with (current).
- The stable alias ar:plan.latest:<plan_id> always resolves to the newest snapshot for a lineage, regardless of which turn last updated it.
- On a new turn, the runtime rehydrates only the current open plan automatically. Older plans must be inspected explicitly via react.read if they become relevant again.
- When history is compacted, older plans appear in a react.plan.history block with step skeletons, statuses, and stable snapshot_refs for recovery.
A plan lineage is considered open only if its latest snapshot is not closed, superseded, or complete. Only the plan tagged (current) in ANNOUNCE may receive step acknowledgements.
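The open-plan rule and the four-plan ANNOUNCE window can be expressed as a small filter. This is a sketch with assumed details: the `PlanSnapshot` dataclass here is a simplified stand-in, the `"complete"` status value and the `last_update_ts` field are hypothetical, and "last 4" is interpreted as the four most recently updated open lineages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanSnapshot:
    plan_id: str
    status: str                      # assumed values: "open", "complete"
    last_update_ts: float            # hypothetical ordering field
    closed_ts: Optional[float] = None
    superseded_ts: Optional[float] = None


def is_open(snapshot: PlanSnapshot) -> bool:
    """Open means the latest snapshot is not closed, superseded, or complete."""
    return (
        snapshot.closed_ts is None
        and snapshot.superseded_ts is None
        and snapshot.status != "complete"
    )


def plans_for_announce(latest: List[PlanSnapshot], limit: int = 4) -> List[PlanSnapshot]:
    """Return up to `limit` open plans, oldest first, newest last."""
    open_plans = sorted((s for s in latest if is_open(s)), key=lambda s: s.last_update_ts)
    return open_plans[-limit:]
```

The input is expected to hold one latest snapshot per lineage; which of the returned plans is tagged (current) is decided elsewhere.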
See plan-README.md
Isolated Execution Runtime
The platform provides a sandboxed code execution runtime: the agent can generate Python programs, execute them under policy, and receive a normalized result envelope. The runtime has two logical zones. Docker can run those zones in the default split topology with sibling supervisor and executor containers, or in the legacy combined container strategy. Fargate runs the same logical contract in a remote ECS task.
- Supervisor: networked, has full runtime context, and resolves settings/secrets through the descriptor-backed provider. All bundle tools from tools_descriptor.py execute here, including MCP tools, bundle-local tools, and custom SDK tools. The ChatCommunicator is also available to tool code, streaming events via Redis Pub/Sub to the client SSE.
- Executor: completely isolated, with no network, no descriptor payloads, no provider secret material, and a separate Linux namespace (UID 1001). Runs LLM-generated code. All tool calls are proxied to the Supervisor over a Unix socket. Can only write to /workspace/work and /workspace/out.
Two execution backends are practical:
🐳 Docker (default)
Runs code on the same EC2 host as the Processor. Low latency, ideal for interactive agentic loops. Docker supports combined and split container strategies; split gives the executor a separate no-network container with only work, output, logs, and the supervisor socket mounted.
execution:
  runtime:
    mode: "docker"
    enabled: true # default
☁️ AWS Fargate (async only)
Serverless container on a separate compute plane. Recommended for long-running, non-live workloads (batch data processing, heavy computation, report generation) where startup latency (10-30s) is acceptable. Not suitable for fast interactive agentic loops.
execution:
  runtime:
    mode: "fargate"
    cluster: "arn:aws:ecs:..."
    task_definition: "exec-task"
Executor Environment Variables (Generated Code)
| Variable | Description |
|---|---|
| WORKDIR | Working directory (source, helpers) |
| OUTPUT_DIR | Output directory (write files here) |
| EXECUTION_ID | Unique execution identifier |
| AGENT_IO_CONTEXT | Limited tool-proxy context for Unix socket calls |
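Generated code would typically read these variables rather than hard-code paths. The sketch below shows one plausible shape; the helper name `write_result` and the output file name are hypothetical, and only the `OUTPUT_DIR` and `EXECUTION_ID` variable names come from the table.

```python
import os
from pathlib import Path


def write_result(text: str, filename: str = "result.txt") -> Path:
    """Write a text file into OUTPUT_DIR and return its path."""
    out_dir = Path(os.environ["OUTPUT_DIR"])
    out_dir.mkdir(parents=True, exist_ok=True)
    # EXECUTION_ID is used here only to tag the file content.
    execution_id = os.environ.get("EXECUTION_ID", "unknown")
    target = out_dir / filename
    target.write_text(f"[{execution_id}] {text}\n", encoding="utf-8")
    return target
```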
Supervisor launch env is different from generated-code env. Docker receives the exec launch payload inline as RUNTIME_GLOBALS_JSON. Fargate receives KDCUBE_EXEC_PAYLOAD_SECRET_ID, an AWS Secrets Manager secret name for temporary launch JSON; the entrypoint calls GetSecretValue, parses the JSON, and restores RUNTIME_GLOBALS_JSON, RUNTIME_TOOL_MODULES, and packaged supervisor env before bootstrap. The supervisor also receives descriptor payloads such as KDCUBE_RUNTIME_ASSEMBLY_YAML_B64, KDCUBE_RUNTIME_BUNDLES_YAML_B64, KDCUBE_RUNTIME_GATEWAY_YAML_B64, KDCUBE_RUNTIME_SECRETS_YAML_B64, and KDCUBE_RUNTIME_BUNDLES_SECRETS_YAML_B64. It materializes those descriptors before tool bootstrap so bundle tools can use normal get_settings(), get_plain(), get_secret(...), bundle props, and get_secret("b:...") bundle-secret lookups. By default descriptor payloads are full; setting execution.runtime.descriptor_payload_scope: active_bundle filters only bundles.yaml and bundles.secrets.yaml to the active caller bundle.
Supervisor vs Executor Architecture
The execution runtime uses a strict supervisor/executor boundary. In Docker combined, that boundary is inside one py-code-exec container. In Docker split, the supervisor and executor are sibling containers. In Fargate, the same logical contract runs inside the remote exec task.
- The Supervisor bootstraps the full runtime: loads dynamic tool modules, initializes ModelService, KB client, Redis communication, and starts a PrivilegedSupervisor listening on the supervisor socket.
- The Executor drops privileges to UID 1001, uses runtime-level network isolation, and runs the LLM-generated user_code.py.
- Every tool call from executor code (io_tools, web_tools, react_tools, etc.) is proxied over the Unix socket to the supervisor. The executor never has direct access to network, secrets, or databases.
Docker Execution Mode
unshare(CLONE_NEWNET). Docker mode supports custom images, CPU/memory limits, and PID limits via bundle configuration.
For stronger filesystem isolation, configure py_code_exec_container_strategy: "split" so supervisor-only bundle mounts and descriptor material are not mounted into the executor container.
# Docker profile in bundle props
execution:
  runtime:
    profiles:
      docker:
        mode: "docker"
        image: "py-code-exec:latest"
        container_strategy: "split" # default; combined | split
        network_mode: "host"
        cpus: "1.5"
        memory: "2g"
        extra_args: ["--pids-limit", "256"]
Fargate Execution Mode
Fargate exec runs the same supervisor/executor architecture as Docker, but on a dedicated ECS Fargate task instead of a local container. This is the replacement for Docker-on-node in environments where Fargate containers cannot access the Docker daemon.
| Aspect | Docker Mode | Fargate Mode |
|---|---|---|
| Startup latency | Sub-second | 10-30 seconds |
| Workdir sharing | Host bind mount | S3 snapshot + restore |
| Network isolation | unshare(CLONE_NEWNET) | Task-level VPC security group |
| Task lifetime | Container exits, docker rm | ECS task STOPPED |
| Caller waits via | proc.communicate() | Poll describe_tasks until STOPPED |
| Best for | Interactive agentic loops | Batch workloads, heavy computation |
The caller (chat-proc) snapshots the workdir and outdir to S3, launches the Fargate task via ecs.run_task, polls until completion, then restores output zips back to the local workspace. From the agent's perspective, the result contract is identical to Docker mode.
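The polling step can be sketched as below. It assumes an ECS client object with the boto3-style `describe_tasks(cluster=..., tasks=[...])` call; the function name, the poll interval, and the timeout are illustrative choices rather than values from the platform. The client is passed in, so the loop can be exercised with a stub.

```python
import time
from typing import Any, Callable


def wait_for_task_stopped(
    ecs: Any,
    cluster: str,
    task_arn: str,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll describe_tasks until the task reports lastStatus == "STOPPED"."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = ecs.describe_tasks(cluster=cluster, tasks=[task_arn])
        tasks = response.get("tasks", [])
        if tasks and tasks[0].get("lastStatus") == "STOPPED":
            return tasks[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_arn} did not stop within {timeout_seconds}s")
        sleep(poll_seconds)
```

A real caller would then inspect the stopped task's container exit codes before restoring outputs; that part is omitted here.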
Environment Variable Injection
The Fargate task receives supervisor launch state through containerOverrides.environment at run_task time. The current Fargate path stores the exec launch payload in AWS Secrets Manager under a name like kdcube/runtime/exec-payloads/<exec_id> and passes that name as KDCUBE_EXEC_PAYLOAD_SECRET_ID. The task entrypoint reads it with GetSecretValue, restores the runtime env, then proc deletes the temporary secret after the task finishes. Platform and bundle config are shipped separately as descriptor payloads (KDCUBE_RUNTIME_*_YAML_B64), not as raw provider API-key env promotion. Bundle tool module paths are rewritten from host paths to container paths (/workspace/bundles/{bundle_dir}/...).
Network Isolation & Unix Socket Communication
In both Docker and Fargate modes, the executor side is network-isolated. All tool calls from generated code are routed over a Unix domain socket to the supervisor side. In Docker split mode the socket is shared through a small socket volume. The supervisor has full access to Redis, Postgres, ModelService, S3, and external APIs. In Fargate, the supervisor connects to backing services via VPC DNS (Cloud Map private DNS or direct ElastiCache/RDS endpoints).
Error Propagation
Runtime-specific failures (ECS startup failure, Fargate timeout, snapshot restore failure) are surfaced through the same report_text / error envelope as local Docker execution. The agent sees a unified result contract regardless of backend:
# Unified result fields (both Docker and Fargate)
ok: bool # execution succeeded
artifacts: list # produced files
error: str # error message if failed
report_text: str # human-readable summary
user_out_tail: str # last lines of user.log
runtime_err_tail: str # last lines of runtime errors
See distributed-exec-README.md and exec-logging-error-propagation-README.md
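One way a caller might consume the unified envelope is sketched below. The `ExecResult` type mirrors the six fields listed above; the element type of `artifacts` and the `summarize_result` helper are assumptions for illustration.

```python
from typing import List, TypedDict


class ExecResult(TypedDict):
    ok: bool
    artifacts: List[str]   # assumed: a list of produced file paths
    error: str
    report_text: str
    user_out_tail: str
    runtime_err_tail: str


def summarize_result(result: ExecResult) -> str:
    """Render the same short summary regardless of backend."""
    if result["ok"]:
        return f"ok: {len(result['artifacts'])} artifact(s). {result['report_text']}"
    # Prefer runtime errors; fall back to the tail of the user log.
    tail = result["runtime_err_tail"] or result["user_out_tail"]
    return f"failed: {result['error']}\n{tail}".rstrip()
```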
Knowledge Space
Bundles can expose a searchable knowledge space built from a Git repository's docs, source code, deployment configs, and tests.
return {
    "knowledge": {
        "repo": "https://github.com/org/repo.git",  # "" = local repo
        "ref": "main",
        "docs_root": "app/docs",
        "src_root": "app/src",
        "deploy_root": "app/deploy",
        "tests_root": "app/tests",
        "validate_refs": True,
    }
}
- on_bundle_load(): builds the index once per process (file-locked, signature-cached)
- pre_run_hook(): reconciles if config changed
Agent access via ks: paths: react.search_knowledge(query=..., limit=5) and react.read(["ks:docs/architecture.md"])
Context, RAG & Conversations
Context RAG Client
# self.ctx_client is ContextRAGClient
results = await self.ctx_client.search(
    query="previous analysis of sales data",
    kind="assistant",  # or "user" | "attachment"
    limit=5,
)
artifact = await self.ctx_client.fetch_ctx(["ar:turn_abc.artifacts.summary"])
Conversations API Endpoints
GET /conversations/{tenant}/{project}
POST /conversations/{tenant}/{project}/fetch
POST /conversations/{tenant}/{project}/{conv_id}/turns-with-feedbacks
POST /conversations/{tenant}/{project}/feedback/conversations-in-period
The react.memsearch tool searches past conversation memory directly inside the agent loop. It has two families: semantic search over indexed snippets and catalog search over Postgres turn-log rows for timeline, ordinal, and temporal questions. That distinction matters: broad questions like "what have we discussed so far?" should use mode="timeline" over targets=["summary"], not a generic semantic query. Questions like "what was the second turn about?" use mode="ordinal". Questions like "what did we discuss in March?" use mode="temporal". The ConversationStore (accessible via BaseWorkflow.store) manages turn payloads, timelines, and artifacts.
Timeline & Context Layout
Each conversation maintains a rolling timeline of turn artifacts stored as artifact:conv.timeline.v1. The timeline is the canonical cross-turn context passed to the LLM. It is structured as an ordered sequence of turn records, each containing user input, assistant output, tool calls, internal notes, working summaries, external-event blocks such as live followup, steer, form events, wizard events, alert events, and any attached artifacts. A single turn may contain multiple prompt-like user entries and multiple visible assistant.completion blocks. The latest assistant completion keeps the stable alias ar:<turn_id>.assistant.completion; earlier visible completions use numbered paths such as ar:<turn_id>.assistant.completion.1. External-event message artifacts use the matching family ar:<turn_id>.external.<kind>.<message_id>. The runtime also surfaces a compact [LIVE TURN EVENTS] area inside ANNOUNCE so the model can orient to same-turn control input without rereading the whole tail.
Cache Points
The platform inserts up to three LLM-level cache checkpoints per turn: prev-turn (the end of the prior turn), pre-tail (just before the current turn's tail), and tail (after the current turn). These cache points allow the LLM inference layer to reuse context prefix KV-cache across turns, reducing both latency and token cost for multi-turn conversations.
Compaction
When the accumulated timeline approaches the configured context budget ceiling, the platform triggers compaction: older turn ranges are summarized into a compact conv.range.summary artifact and replaced in the visible timeline. This is a hard-ceiling guard: it ensures context never silently overflows the model's context window. Working summaries are injected into the compaction prompt, internal notes can be preserved as stable anchors, and consumed followup / steer controls remain visible through preserved event copies because they are treated as first-class user intent rather than disposable transport noise. Compaction is transparent to bundle code.
Hosting & File Resources
Your bundle can produce files (PDFs, PNGs, data exports) and make them available via hosted URLs. The platform handles upload, serving, and access control automatically.
# ApplicationHostingService (via BaseWorkflow.hosting_service)
url = hosting.get_artifact_url("fi:turn_123.outputs/export/report.pdf")
# Resource Name format
# ef:{tenant}:{project}:chatbot:{stage}:{user_id}:{conv_id}:{turn_id}:{role}:{path}
# Resolved by POST /by-rn with authentication enforced by platform
Files written to OUTPUT_DIR/turn_{id}/files/ remain part of the durable workspace tree, while files written to OUTPUT_DIR/turn_{id}/outputs/ are tracked as non-workspace produced artifacts. User-facing downloads should typically come from outputs/ with external visibility. Original user attachments appear as fi:{turn_id}.user.attachments/{filename}. Later external events can also carry attachment payloads. Those keep message-level identity in a separate logical path family such as fi:{turn_id}.external.followup.attachments/{message_id}/{filename}, fi:{turn_id}.external.form.attachments/{message_id}/{filename}, or more generally fi:{turn_id}.external.<kind>.attachments/{message_id}/{filename}, so repeated filenames from different user messages do not collide while still materializing into the live turn timeline.
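The two attachment path families can be captured in a small builder. The function name and its parameters are hypothetical; the path templates are the ones given in the paragraph above.

```python
from typing import Optional


def attachment_logical_path(
    turn_id: str,
    filename: str,
    kind: Optional[str] = None,
    message_id: Optional[str] = None,
) -> str:
    """Build the fi: logical path for an original or external-event attachment."""
    if kind is None:
        # Original user attachment on the turn.
        return f"fi:{turn_id}.user.attachments/{filename}"
    if not message_id:
        raise ValueError("external-event attachments need a message_id")
    # The message_id keeps repeated filenames from different messages apart.
    return f"fi:{turn_id}.external.{kind}.attachments/{message_id}/{filename}"
```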
Attachments & Limits
User-uploaded files enter the system via the chat API (SSE or Socket.IO), pass through security scanning, are stored in the ConversationStore, and then flow to two downstream paths: multimodal LLM inference and code execution. Original turn attachments follow the normal attachment path. Later busy-turn external events may also carry attachment payloads; those belong to the continuation-event contract, but reactive kinds can still fold them into the active turn timeline under the corresponding external.<kind> path family.
hosted_uri, rn, key, filename, MIME type, message id). If the active React turn owns the live listener, it hydrates readable attachment content from hosting and folds the attachment into the same turn under paths such as fi:<turn_id>.external.followup.attachments/<message_id>/<filename> or, more generally, fi:<turn_id>.external.<kind>.attachments/<message_id>/<filename>. If the event is instead promoted into a later turn, the same hosted attachment payload remains available there too.
User Upload Flow
When a user submits attachments, the ingress layer enforces size caps and runs security preflight before storage:
- Collect raw bytes + metadata (filename, MIME type)
- Enforce per-file and total-message size caps
- Run ClamAV antivirus scan (when APP_AV_SCAN=1, always enabled in production)
- Run preflight validation: MIME-type allowlist via magic sniffing, PDF heuristic checks, ZIP/OOXML structural checks, macro blocking
- If allowed, store via ConversationStore.put_attachment()
.docm, .pptm, VBA projects) is rejected at ingress. Generic ZIP archives are also disallowed by default.
Supported File Types
| Category | Accepted Types |
|---|---|
| Documents | application/pdf, .docx, .pptx, .xlsx |
| Images | image/jpeg, image/png, image/gif, image/webp |
| Text | text/* (subject to size limit) |
File Rehosting for Execution
For code-generated programs, attachments are materialized into the execution workspace as local files inside the sandboxed container. Original prompt attachments resolve under turn_<id>/attachments/<filename>; busy-turn continuation attachments keep their event-scoped identity under paths such as turn_<id>/external/followup/attachments/<message_id>/<filename> and, more generally, turn_<id>/external/<kind>/attachments/<message_id>/<filename>.
Artifact Size & Count Limits
| Limit | Value |
|---|---|
| Per-image cap | 5 MB (MODALITY_MAX_IMAGE_BYTES) |
| Per-PDF cap | 10 MB (MODALITY_MAX_DOC_BYTES) |
| Total message cap (text + attachments) | 25 MB (MESSAGE_MAX_BYTES) |
| PDF max pages | 500 |
| ZIP max entries | 2,000 |
| ZIP max uncompressed total | 120 MB |
| ZIP max compression ratio | 200x |
| Text file max size | 10 MB |
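The size caps in the table can be checked with a few comparisons. This sketch assumes binary megabytes (1 MB = 1024 * 1024 bytes), which the table does not specify, and it reuses the setting names only as local constants; `check_message` is a hypothetical helper, not the ingress implementation.

```python
from typing import List, Tuple

MB = 1024 * 1024  # assumption: the caps are binary megabytes
MODALITY_MAX_IMAGE_BYTES = 5 * MB
MODALITY_MAX_DOC_BYTES = 10 * MB
MESSAGE_MAX_BYTES = 25 * MB


def check_message(text: str, attachments: List[Tuple[str, int]]) -> List[str]:
    """Return cap violations; each attachment is (mime_type, size_bytes)."""
    problems: List[str] = []
    total = len(text.encode("utf-8"))
    for mime, size in attachments:
        total += size
        if mime.startswith("image/") and size > MODALITY_MAX_IMAGE_BYTES:
            problems.append(f"{mime}: exceeds per-image cap")
        elif mime == "application/pdf" and size > MODALITY_MAX_DOC_BYTES:
            problems.append(f"{mime}: exceeds per-PDF cap")
    if total > MESSAGE_MAX_BYTES:
        problems.append("message: exceeds total cap")
    return problems
```

An empty list means the message passes these three caps; page-count and ZIP structure limits need content inspection and are not covered.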
Timeline Truncation Limits
To prevent context blowup, the platform applies truncation policies to older timeline blocks:
| Limit | Default |
|---|---|
| User/assistant text truncation | 4,000 chars |
| Tool result text truncation | 400 chars |
| Tool result list items cap | 50 items |
| Tool result dict keys cap | 80 keys |
| Base64 in timeline blocks | 4,000 chars (oversized replaced with placeholder) |
| Sources pool base64 cap | 4,000 chars (dropped if exceeded) |
react.read to rehydrate hidden or pruned artifacts when needed. Ranged reads are normal timeline result blocks; after TTL pruning their placeholders preserve the path and line/text-symbol range so the same range can be read again. Skills loaded by react.read are pruned in old turns with a placeholder containing the original sk: reference for re-reading.
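Applied to a tool result, the default caps from the table behave roughly as in the sketch below. The recursion strategy, the function name, and the truncation marker are assumptions; only the numeric defaults (400 characters, 50 list items, 80 dict keys) come from the table.

```python
from typing import Any

TEXT_CAP = 400   # tool result text truncation (chars)
LIST_CAP = 50    # tool result list items cap
DICT_CAP = 80    # tool result dict keys cap
MARKER = "...[truncated]"  # hypothetical marker, not the platform's placeholder


def truncate_tool_result(value: Any) -> Any:
    """Recursively apply the text, list, and dict caps to a tool result."""
    if isinstance(value, str):
        return value if len(value) <= TEXT_CAP else value[:TEXT_CAP] + MARKER
    if isinstance(value, list):
        return [truncate_tool_result(v) for v in value[:LIST_CAP]]
    if isinstance(value, dict):
        kept = list(value.items())[:DICT_CAP]
        return {k: truncate_tool_result(v) for k, v in kept}
    return value
```

The full result stays stored under its logical path, so a truncated view like this loses nothing that react.read cannot bring back.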
Memory Recovery Path
Pruning and compaction are allowed to remove old raw blocks from the visible prompt because the runtime preserves recovery handles. The agent follows a short route instead of rereading everything:
visible exact path
-> react.read([path])
-> react.pull([fi_path]) if execution needs a local file
visible summary path (ws:/su:)
-> react.read([summary_path])
-> react.read(["ar:<turn_id>.react.turn.index"]) if refs are incomplete
-> react.read([ar_or_tc_or_so_path, ...]) or react.pull([fi_path, ...])
topic only
-> react.memsearch(query, targets=["summary", "user", "assistant", "attachment"])
-> read the returned refs or the returned turn_index_path
broad conversation overview
-> react.memsearch(mode="timeline", targets=["summary"], order="asc", top_k=N)
-> summarize returned working summaries in turn order
ordinal clue
-> react.memsearch(mode="ordinal", ordinal=2, targets=["summary", "user", "assistant"])
temporal clue
-> react.memsearch(mode="temporal", from="2026-03-01T00:00:00Z", to="2026-04-01T00:00:00Z", targets=["summary", "user", "assistant"])
The turn index path ar:<turn_id>.react.turn.index is not stored as another timeline block. It is reconstructed on demand from the persisted turn log and artifact metadata, and it lists the turn's summaries, messages, events, tools, artifacts, and sources with short semantic hints.
Citations & Sources
Citation Tokens
The company was founded in 2015 [[S:1]] and expanded by 2020 [[S:2,3]].
According to multiple sources [[S:1-4]], the trend is clear.
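The three token forms shown above (single ID, comma list, and range) can be expanded into source IDs with a short parser. This is an illustrative sketch; `cited_sids` is a hypothetical name and the platform's own citation handling may differ, for example in how it treats malformed tokens.

```python
import re
from typing import List

CITATION_RE = re.compile(r"\[\[S:([0-9,\-\s]+)\]\]")


def cited_sids(text: str) -> List[int]:
    """Return cited source IDs in first-seen order, without duplicates."""
    seen: List[int] = []
    for group in CITATION_RE.findall(text):
        for part in group.split(","):
            part = part.strip()
            if not part:
                continue
            if "-" in part:
                # A range such as 1-4 is inclusive on both ends.
                start, end = (int(x) for x in part.split("-", 1))
                sids = list(range(start, end + 1))
            else:
                sids = [int(part)]
            for sid in sids:
                if sid not in seen:
                    seen.append(sid)
    return seen
```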
Sources Pool Fields
| Field | Description |
|---|---|
| sid | Source ID (integer, per-conversation, deduplicated) |
| title | Page or file title |
| url | URL or file path |
| source_type | web, file, attachment, or manual |
| objective_relevance | Semantic relevance score (0–1) |
| published_time_iso | Publication timestamp |
| favicon_url | Source favicon for UI display |
Feedback System
POST /conversations/{tenant}/{project}/{conv_id}/turns/{turn_id}/feedback
{ "reaction": "ok", "text": "Very helpful!", "ts": "2026-03-21T10:00:00Z" }
# reaction: ok | not_ok | neutral | null
Your bundle can also emit machine feedback (origin: "machine") for confidence scores or quality checks. It is additive and does not replace user feedback. Satisfaction rate: ok / (ok + not_ok + neutral).
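The satisfaction-rate formula can be computed directly from a list of reactions. In this sketch, null reactions are skipped and an empty denominator returns `None`; both choices are assumptions, as is the function name.

```python
from typing import Iterable, Optional


def satisfaction_rate(reactions: Iterable[Optional[str]]) -> Optional[float]:
    """Compute ok / (ok + not_ok + neutral); None when nothing was rated."""
    counts = {"ok": 0, "not_ok": 0, "neutral": 0}
    for reaction in reactions:
        if reaction in counts:
            counts[reaction] += 1
    total = sum(counts.values())
    return counts["ok"] / total if total else None
```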