SDK · Single-Agent Loop · Timeline-Native · Multi-Channel Streaming

KDCube ReAct v2 Agent

A timeline-native, single-agent decision loop designed for precision reasoning, structured multi-channel output, and long-lived conversational memory — built for the demands of production AI applications.

Key concepts:

  • Timeline as single source of truth
  • Contribute vs announce model
  • Multi-channel streamer
  • Cross-conversation source pool
  • react.* in-loop tools

What is KDCube ReAct v2?

ℹ️

Naming note: The runtime is referred to as "ReAct v2", "React v2", and "ReAct Agent" across different parts of the codebase and documentation. These names refer to the same system; this page uses ReAct v2 as the canonical form.

ReAct v2 is KDCube's production decision-loop agent. It implements a single-agent Reason + Act loop running over a shared conversation timeline, where every tool call, artifact, plan update, and final answer is recorded as an ordered block in that timeline. The timeline acts as the single source of truth for all in-turn context, history, and compaction.

Unlike multi-agent pipelines that pass context through separate message buses, ReAct v2 keeps all state in one timeline artifact (conv.timeline.v1) and a companion source pool (conv:sources_pool). This design makes every turn reproducible, debuggable, and efficiently cacheable.

Core philosophy

🔁

Single-agent loop

One ReAct agent drives the full decision loop per turn. No separate coordinator or final-answer generator is required in the reference implementation. A lightweight optional Gate agent runs only for new conversations to extract a title.

📋

Timeline-first design

The timeline is built from contribute (persistent) and announce (ephemeral) blocks. Agents read a rendered snapshot at every round — not a raw message array — giving precise control over what is cached and what is fresh.

🔒

Evidence over inference

Tool results, artifact paths, and source citations are recorded as typed blocks with stable logical paths. The agent always has a verifiable audit trail for every action taken.

📡

Multi-channel output

A tag-based streaming protocol lets the model emit multiple logical outputs — thinking, answer, JSON sidecars, follow-up suggestions — in a single LLM call, each routed to the correct consumer.

Why it matters for production AI

Production AI apps face three recurring problems: context drift (what did we discuss three turns ago?), output fragmentation (how do I show reasoning and a structured result at the same time?), and cost at scale (how do I avoid re-feeding the whole conversation on every round?). ReAct v2 addresses all three directly through its timeline design, multi-channel streamer, and dual-checkpoint caching system.

Timeline & Side Effects Model

The timeline is the authoritative, ordered log of everything that happened in a conversation. It is loaded at turn start, updated as the turn progresses, and persisted at turn end as a single JSON artifact.

Contribute vs Announce

Contribute (persistent)

Blocks written via ctx_browser.contribute() are saved into the timeline and appear in all future renders. Used for:

  • User prompts and attachments
  • Gate and ReAct stage outputs
  • Tool call / tool result blocks (react.tool.call, react.tool.result)
  • Plan snapshots (react.plan) and acknowledgements (react.plan.ack)
  • Final assistant completion (assistant.completion)

Announce (ephemeral)

Appended to the rendered tail when include_announce=True — never persisted as part of the main timeline. Used for:

  • Current iteration count and remaining budget
  • Authoritative temporal context (UTC, user timezone)
  • Active plan status with step markers (✓ / ✗ / □)
  • System notices (e.g., cache TTL pruning alerts)
  • Feedback updates (shown until incorporated)
💡

The announce model keeps high-frequency state signals out of the cached timeline. This preserves cache hits across rounds while still giving the agent fresh situational awareness on every decision call.
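The contribute/announce split can be sketched in a few lines. This is a minimal illustration, not the SDK's actual class: the `Timeline` type and its method names here are assumptions, and only the behavior (persistent blocks vs an ephemeral tail appended at render time) mirrors the model described above.

```python
from dataclasses import dataclass, field

@dataclass
class Timeline:
    """Minimal sketch of the contribute/announce split (names are illustrative)."""
    blocks: list = field(default_factory=list)    # contribute: persisted, cacheable
    announce: list = field(default_factory=list)  # announce: ephemeral, rebuilt each round

    def contribute(self, block: dict) -> None:
        self.blocks.append(block)                 # survives into all future renders

    def set_announce(self, lines: list) -> None:
        self.announce = lines                     # appended only at render time

    def render(self, include_announce: bool = True) -> str:
        body = "\n".join(b["text"] for b in self.blocks)
        if include_announce and self.announce:
            body += "\n[ANNOUNCE]\n" + "\n".join(self.announce)
        return body

tl = Timeline()
tl.contribute({"type": "user.prompt", "text": "user: summarize the report"})
tl.set_announce(["iteration 2/8", "active plan: step 1 ✓, step 2 □"])
print(tl.render())                            # announce appears in the render...
print(tl.render(include_announce=False))      # ...but is never part of persisted state
```

Because `announce` never enters `blocks`, persisting the timeline is simply serializing `blocks` — the high-frequency signals stay out of the cached prefix.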

Timeline block stream with side effects
[Diagram: a turn's persisted contribute blocks (user.prompt → tool.call → tool.result → react.plan → assistant.completion) flow into a sources_pool update and a turn.log persist, while the ephemeral ANNOUNCE section (iteration, budget, active plan, temporal context) is appended at the tail and never persisted.]

Turn lifecycle

  1. Load timeline

    ctx_browser.load_timeline() fetches artifact:conv.timeline.v1 and artifact:conv:sources_pool, hydrating in-memory state for the turn.

  2. Contribute user input

    The user prompt and any attachments are contributed as persistent blocks, joining the timeline block stream.

  3. Gate agent (optional)

    Only for new conversations. The gate agent renders the timeline without sources or announce, emits a title block, and contributes it. All subsequent turns skip the gate entirely.

  4. ReAct decision loop

    Each round: render the timeline (with sources + announce), call the LLM, execute the chosen tool, and contribute results back into the timeline. Repeat until the agent emits a final answer or the iteration budget is exhausted.

  5. Persist

    Both conv.timeline.v1 and conv:sources_pool are written back to storage. The turn log (artifact:turn.log) records the current-turn blocks for fast next-turn reconstruction.

Compaction and TTL pruning

When the rendered timeline would exceed the model's context budget, the runtime compacts earlier blocks into a single conv.range.summary block. The compacted blocks are removed from the persisted payload; future renders start from the summary forward.

Separately, when a session cache_ttl_seconds is configured, blocks older than the TTL are replaced with truncated placeholders on render. A system.message block is appended advising the agent to use react.read(path) to restore any specific logical path.
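The TTL pruning described above can be sketched as a single pass over the block list. The block schema (`ts`, `path`, `text`) and the notice wording are assumptions for illustration; only the behavior — expired blocks become truncated placeholders and a system.message explains how to restore them — follows the documented mechanism.

```python
import time

def prune_expired(blocks, cache_ttl_seconds, now=None):
    """Sketch of TTL pruning; block schema and notice text are illustrative."""
    now = time.time() if now is None else now
    out, pruned = [], False
    for b in blocks:
        if now - b["ts"] > cache_ttl_seconds:
            # replace expired content with a compact placeholder
            out.append({**b, "text": f"[pruned — react.read('{b['path']}') to restore]"})
            pruned = True
        else:
            out.append(b)
    if pruned:
        # advise the agent how to recover specific logical paths
        out.append({"type": "system.message", "ts": now, "path": "system",
                    "text": "Older blocks were truncated by TTL; use react.read(path) to restore any logical path."})
    return out
```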

Block ordering (schematic)

# Typical timeline block sequence (condensed)
[TURN turn_A]
  user.prompt
  user.attachment.meta
  stage.gate                  # optional, new conv only
  stage.react
  react.tool.call             # tool params
  react.tool.result           # rendered artifact metadata
  react.plan                  # JSON snapshot of plan
  react.plan.ack              # human-readable ack
  assistant.completion        # [sources_used: [1,2,3]]

[TURN turn_B]
  user.prompt
  ...
[SOURCES POOL]                # appended at tail (uncached)
[ANNOUNCE]                    # appended at tail (uncached)

Custom Tool-Calling Architecture (react.*)

ReAct Agent V2 — Loop & Tool Integration (detailed)
[Diagram: the autonomous loop runs up to max_iterations rounds. User input (user.prompt + attachments) feeds Context Assembly (render_timeline() + sources + announce + 3 cache checkpoints), which feeds the LLM Decision (react_decision_stream_v2, structured JSON: action + tool + params + notes). Output channels (thinking / answer / followup / canvas / usage sidecar) stream to the client. Tool calls pass a validation gate (protocol order check, path namespace validation; react.notice on error, visible next round), then route either to in-loop react.* tools (react.read, react.write, react.pull, react.patch, react.hide, react.memsearch, operating on the timeline and data spaces) or to external tool execution (web_tools, code_exec, mcp_tools, skill tools in subprocess/network). Results are contributed to the timeline via ctx_browser.contribute() as tool.call + tool.result blocks. A budget/iteration check (rounds < max_iterations? token budget remaining?) decides next round vs final answer, after which the runtime persists conv.timeline.v1 + conv:sources_pool + turn.log + artifacts.]

Standard LLM tool calling routes tool invocations to external executors. ReAct v2 supplements this with a set of in-loop react.* tools that operate directly on the timeline and the agent's data spaces — without leaving the decision loop.

ℹ️

Current write contract: The authoritative in-loop file authoring tool is react.write. It writes current-turn text artifacts into files/<scope>/... for durable workspace state or outputs/<scope>/... for non-workspace produced artifacts. Historical refs can be materialized locally with react.pull(paths=[...]) as readonly reference material. The active current workspace under turn_<current_turn>/files/... is defined explicitly with react.checkout(...).

Data spaces

ks: Knowledge Space

Read-only. Reference files prepared by the system — docs, indexes, cloned repos. Accessed via react.read("ks:path/to/doc.md").

fi: Versioned turn artifacts

Logical snapshot namespace. Used for historical workspace files, non-workspace outputs, and attachments such as fi:<turn_id>.files/<scope>/..., fi:<turn_id>.outputs/<scope>/..., and fi:<turn_id>.user.attachments/<name>.

Current turn workspace

Active writable surface. The agent writes durable project state to turn_<current_turn>/files/... and produced artifacts to turn_<current_turn>/outputs/.... In git mode this turn root is also a sparse local git repo.

💡

Workspace activation is explicit. ReAct should inspect ANNOUNCE first, then use react.checkout(mode="replace", paths=["fi:<turn>.files/<scope>"]) to seed or continue the active current-turn workspace. Use react.checkout(mode="overlay", paths=[...]) to import selected historical files into the existing workspace. Use react.pull(paths=["fi:..."]) when older versions should be available locally as readonly side views only. In git mode, ANNOUNCE also exposes ls workspace so the agent can choose which existing scope to continue.

Tool catalog

| Tool | Purpose | Key behavior | Path family |
|------|---------|--------------|-------------|
| react.read | Load an existing artifact into timeline context | Emits a status block first (dedup check), then artifact blocks. Re-exposes hidden artifacts and clears the hidden flag. | fi: ar: so: su: tc: ks: |
| react.write | Write current-turn text content and optionally emit a user-visible file | Writes to files/<scope>/... for durable workspace state or outputs/<scope>/... for non-workspace artifacts. Supports channel=canvas\|timeline_text\|internal and kind=display\|file. | Current-turn paths under <turn_id>/files/ or <turn_id>/outputs/ |
| react.pull | Materialize historical workspace files or outputs locally as readonly reference material | Accepts fi: refs only. Supports subtree pulls for fi:<turn_id>.files/.... Outputs and attachments must be exact file refs. Pulled refs land under their historical turn roots and do not modify the active current-turn workspace. Historical files are not auto-hydrated for exec or patching. | fi:<turn_id>.files/..., fi:<turn_id>.outputs/..., exact attachment refs |
| react.patch | Patch an existing current-turn file; supports unified diff | If the patch starts with ---/+++/@@ → unified diff; otherwise full replacement. Historical cross-turn patching requires the file to be pulled first. | Current-turn paths under <turn_id>/files/ or <turn_id>/outputs/ |
| react.checkout | Construct or update the active current-turn workspace from historical refs | Accepts ordered paths=[fi:<turn>.files/...] with mode="replace"\|"overlay". replace seeds the current workspace from scratch; overlay imports selected historical files into the existing workspace, overwriting overlaps without deleting unspecified files. Legacy version="turn_..." remains as compatibility for whole-tree checkout. | paths=[fi:<turn>.files/...], optional mode |
| react.memsearch | Semantic search over past turns | Returns compact snippets with turn_id, timestamps, and relevance scores. Targets: assistant / user / attachment. | Conversation index (no path argument) |
| react.hide | Replace a large in-context snippet with a placeholder | Original content remains retrievable via react.read. Only blocks within the editable tail window can be hidden. | ar: fi: tc: so: ks: |
| react.search_files | Safely enumerate files under OUT_DIR or workdir | Returns paths + sizes + logical paths. Does not load content; use react.read on results to load. | outdir / workdir prefixes only |
| react.search_knowledge | Search knowledge space (bundle-provided) | Available only when the active bundle registers the tool (e.g., the react.doc bundle). Returns ranked doc hits. | ks: namespaces |

Protocol validation

Every tool call follows a strict ordering contract: parameters must be supplied in the documented field order (first field, second field, etc.). If the agent emits notes with a decision, a react.notes block is contributed before the tool call. Protocol errors and validation notices are emitted as react.notice blocks, visible to the agent on the next round.

Artifact paths use stable logical namespaces (fi:, ar:, tc:, so:). Physical execution paths returned by bundle namespace resolvers are valid only inside that exec runtime and are explicitly not valid as inputs to react.* tools.

Multi-Channel Streaming

ReAct v2's output layer is built on the Channeled Streamer (versatile_streamer.py), a tag-based protocol that routes a single LLM stream into multiple named logical channels, each with independent format, citation replacement, and subscriber fanout.

How it works

The model wraps each logical output in XML-like channel tags:

# Model output (single LLM call, multiple channels)
<channel:thinking>
  Let me check the document structure first...
</channel:thinking>

<channel:answer>
  Based on the report [[S:1]], the key findings are...
</channel:answer>

<channel:followup>
  {"followups": ["Show me the full table", "Compare with last quarter"]}
</channel:followup>

The stream_with_channels() function parses these tags incrementally, routing each chunk to the correct channel handler as it streams. Citation tokens ([[S:n]]) are replaced per-channel without modifying the stored raw output.
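The incremental tag routing can be sketched as a small state machine. This is a toy illustration, not the versatile_streamer implementation: the real parser also performs citation replacement, subscriber fanout, and per-channel formatting, and its class and method names differ.

```python
import re

class ChannelRouter:
    """Toy incremental router for <channel:NAME>...</channel:NAME> streams."""

    TAG = re.compile(r"</?channel:(\w+)>")

    def __init__(self, handlers):
        self.handlers = handlers   # channel name -> callable(text)
        self.buf = ""              # carries partial tags across chunk boundaries
        self.current = None        # channel we are currently inside

    def feed(self, chunk: str) -> None:
        self.buf += chunk
        while True:
            m = self.TAG.search(self.buf)
            if not m:
                # hold back anything after the last "<" — it may be a split tag
                cut = self.buf.rfind("<")
                cut = len(self.buf) if cut == -1 else cut
                self._emit(self.buf[:cut])
                self.buf = self.buf[cut:]
                return
            self._emit(self.buf[:m.start()])               # text before the tag
            self.current = None if m.group(0).startswith("</") else m.group(1)
            self.buf = self.buf[m.end():]

    def _emit(self, text: str) -> None:
        if text and self.current in self.handlers:
            self.handlers[self.current](text)

out = {"thinking": [], "answer": []}
router = ChannelRouter({name: out[name].append for name in out})
for chunk in ["<channel:think", "ing>plan first...</channel:thinking>",
              "<channel:answer>Hi [[S:1]]</channel:answer>"]:
    router.feed(chunk)
print("".join(out["answer"]))  # → Hi [[S:1]]
```

Note the hold-back on a trailing `<`: a tag split across chunk boundaries (as in the example) is reassembled before routing, which is the same boundary problem the citation tokenizer solves for `[[S:n]]` tokens.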

Channel types in ReAct v2

thinking — Internal reasoning trace. Routed to marker="thinking". Typically markdown format. Can be shown in a dedicated reasoning UI panel or suppressed from the end user.

answer — User-visible main response. Routed to marker="answer". Supports markdown with live citation replacement ([[S:n]] → linked references).

followup — JSON list of suggested follow-up prompts. Routed to a chat.followups step. Parsed incrementally for live UI rendering.

canvas — Canvas / JSON artifact. When composite streaming is enabled, JSON content is parsed field-by-field and each attribute delta is streamed to the canvas marker for live structured UI updates.

usage — Usage sidecar. JSON with token counts, model metadata, and source IDs used in the answer. Enables client-side analytics without touching the main answer stream.

ReactDecisionOut / code — Structured decision output. Structured JSON channels carrying the agent's next action (tool call + params + reasoning), parsed against a Pydantic model for protocol validation.

Single-round multi-channel streaming flow
[Diagram: a single LLM stream enters the channel parser (versatile_streamer), which fans the thinking, answer, canvas, and usage channels out to the client — reasoning panel, answer stream, canvas/JSON view, and analytics sidecar respectively.]

Listener attachment and subscriber fanout

Each channel can carry one or more subscribers — side-effect handlers that run alongside the primary emit function. A subscriber on the usage channel, for example, can write source IDs to a timeline record streamer while the answer channel continues streaming to the user uninterrupted.

# Illustrative adaptation (from docs pattern)
from kdcube_ai_app.apps.chat.sdk.streaming.versatile_streamer import (
    ChannelSpec, ChannelSubscribers, stream_with_channels
)

channels = [
    ChannelSpec(name="thinking",  format="markdown", replace_citations=False, emit_marker="thinking"),
    ChannelSpec(name="answer",    format="markdown", replace_citations=True,  emit_marker="answer"),
    ChannelSpec(name="followup",  format="json",     replace_citations=False, emit_marker="followup"),
    ChannelSpec(name="usage",     format="json",     replace_citations=False, emit_marker="answer"),
]

results, meta = await stream_with_channels(
    svc=svc,
    messages=[system_msg, user_msg],
    role="answer.generator.regular",
    channels=channels,
    emit=_emit_wrapper,
    agent="my.react.agent",
    artifact_name="response",
    sources_list=sources_list,
    subscribers=ChannelSubscribers().subscribe("usage", _record_usage_sids),
    return_full_raw=True,
)

answer_text = results["answer"].raw
used_sids   = results["answer"].used_sources
service_err = (meta or {}).get("service_error")

Subscriber / listener attachment pattern

[Diagram: stream_with_channels() drives the primary emit (all channels → client). A subscriber on the "usage" channel records SIDs into a timeline record (sources_used + meta). The subscriber runs alongside emit and does not interrupt the answer stream.]

Citation replacement and token savings

Citation tokens ([[S:1]], [[S:1,3]], [[S:2-4]]) are replaced at stream time for markdown and text channels, and replaced with <sup class="cite"> tags for html channels. The stored raw output is never modified; replacement happens only in the bytes sent to the client.

A stateful per-channel citation tokenizer handles tokens split across chunk boundaries, ensuring [[S:n]] is never partially emitted to the client. This allows citation-heavy responses to stream cleanly without client-side parsing.
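The boundary-safe behavior can be sketched with a replacer that holds back any chunk suffix that could be the start of a `[[S:...]]` token. This is an illustrative sketch, not the SDK's tokenizer: the class name and the `resolve` callback (SID spec → replacement text) are assumptions.

```python
import re

class CitationReplacer:
    """Sketch of a chunk-boundary-safe [[S:n]] replacer for one channel."""

    TOKEN = re.compile(r"\[\[S:([0-9,\-]+)\]\]")
    # any chunk suffix that is a proper prefix of a full [[S:...]] token
    PARTIAL = re.compile(r"\[(\[(S(:([0-9,\-]+(\])?)?)?)?)?$")

    def __init__(self, resolve):
        self.resolve = resolve   # sid spec (e.g. "1" or "2-4") -> replacement text
        self.held = ""

    def feed(self, chunk: str) -> str:
        buf = self.held + chunk
        m = self.PARTIAL.search(buf)
        if m:
            # hold back the possible token prefix until the next chunk
            buf, self.held = buf[:m.start()], buf[m.start():]
        else:
            self.held = ""
        return self.TOKEN.sub(lambda t: self.resolve(t.group(1)), buf)

    def flush(self) -> str:
        """Release any trailing fragment once the stream ends."""
        out, self.held = self.held, ""
        return out

replacer = CitationReplacer(lambda spec: f"(source {spec})")
text = replacer.feed("Based on the report [[S:") + replacer.feed("1]], the findings hold.")
print(text)  # → Based on the report (source 1), the findings hold.
```

Because replacement happens only on the emitted bytes, the stored raw output (with literal `[[S:n]]` tokens) stays untouched, matching the contract described above.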

Composite JSON streaming

For managed JSON artifacts, the CompositeJsonArtifactStreamer can be attached to a JSON channel. As the model streams a JSON object, each top-level field is parsed incrementally and emitted as a per-attribute delta to a separate marker (typically canvas). This enables live structured UI rendering before the full JSON closes.

Source Management

The sources pool is a per-conversation registry of canonical source rows accumulated across the entire conversation lifecycle — not just the current turn.

What goes into the pool

ℹ️

Non-image file types (XLSX, PPTX, DOCX, PDFs, archives) are not added to the sources pool. Only images are eligible as attachment/file sources.

Stable SID numbering and deduplication

Sources are merged by normalized URL (or physical_path for local files). Once a source is assigned a sequential ID (sid), that ID is stable for the entire conversation. Duplicate URLs reuse the existing SID; only genuinely new sources receive the next integer.

This means [[S:3]] in turn 2 refers to the same source as [[S:3]] in turn 12. Clients can rehydrate citations by matching sources_used SIDs in any artifact against the current sources_pool in timeline.json.
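The stable-SID contract can be sketched as a small pool keyed by normalized URL. This is an illustrative sketch: the normalization rules and class shape are assumptions (the real pool also merges on physical_path for local files); only the dedup-and-stable-integer behavior follows the description above.

```python
from urllib.parse import urlsplit, urlunsplit

class SourcesPool:
    """Sketch of stable SID assignment with URL-based deduplication."""

    def __init__(self):
        self.by_key, self.rows = {}, []

    @staticmethod
    def _normalize(url: str) -> str:
        # illustrative normalization: lowercase scheme/host, strip trailing slash and fragment
        s = urlsplit(url.strip())
        return urlunsplit((s.scheme.lower(), s.netloc.lower(), s.path.rstrip("/"), s.query, ""))

    def add(self, url: str, title: str = "") -> int:
        key = self._normalize(url)
        if key in self.by_key:
            return self.by_key[key]      # duplicate URL → reuse the existing SID
        sid = len(self.rows) + 1         # genuinely new source → next integer
        self.by_key[key] = sid
        self.rows.append({"sid": sid, "url": url, "title": title})
        return sid

pool = SourcesPool()
print(pool.add("https://example.com/report"))   # → 1
print(pool.add("https://example.com/docs"))     # → 2
print(pool.add("https://EXAMPLE.com/report/"))  # → 1  (normalized duplicate, SID reused)
```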

Storage and rendering

The full pool is stored as artifact:conv:sources_pool. A compact snapshot (sid, title, url, short text, and limited metadata) is embedded in the timeline artifact for fast local access. The compact snapshot is rendered as a single SOURCES POOL tail block when the timeline is rendered with include_sources=True.

Accessing and citing sources

In the decision loop

# Load specific source rows (docs pattern)
react.read(["so:sources_pool[1-5]"])
react.read(["so:sources_pool[1,3,7]"])

In generated text

# Inline citation tokens (docs pattern)
[[S:1]]       # single source
[[S:1,3]]     # multiple sources
[[S:2-4]]     # range of sources
# Note: only web sources (http/https) should be cited as evidence.
# Image sources are for rendering only, not evidence citations.

Cross-conversation source pool with stable SIDs

[Diagram: turn 1 — web_search returns 3 URLs, SIDs 1–3 assigned. Turn 5 — web_fetch returns 2 new URLs, SIDs 4–5 assigned. Turn 12 — web_search returns URL 2 again, SID 2 reused (dedup). conv:sources_pool persists SIDs 1–5 for the entire conversation, so [[S:n]] citations resolve identically in turn 1 and turn 15.]

Cross-document citation continuity

Because SIDs are stable across turns, a source indexed as [[S:5]] from a web fetch in turn 1 remains [[S:5]] in the final answer in turn 15. The streamer resolves citations from the current pool state at stream time, so the client always receives correct hyperlinks regardless of when the source was first added.

Memory & Context Management

ReAct v2 approaches memory not as a separate subsystem but as a set of layered strategies on the timeline: caching, compaction, TTL pruning, semantic search, and selective hiding all operate on the same block stream.

Three-checkpoint caching

The context browser inserts up to three cache checkpoints per rendered timeline, enabling providers that support prompt caching (such as Anthropic) to serve most of the context from cache on repeated calls.

Sources and announce blocks are appended after all checkpoints and remain uncached every round, ensuring the agent always sees fresh source data and plan status without invalidating the stable prefix cache.
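A hedged sketch of the shape this takes: checkpoints are spread across the stable block prefix using Anthropic-style `cache_control` markers, and the sources/announce tails are appended after every checkpoint so they never invalidate the cached prefix. The placement policy below (evenly spaced markers) is an assumption, not the SDK's actual heuristic.

```python
def render_with_checkpoints(blocks, sources_tail, announce_tail, max_checkpoints=3):
    """Sketch: mark up to three cache checkpoints in the stable prefix,
    then append uncached sources/announce tails."""
    content = [{"type": "text", "text": t} for t in blocks]
    if content:
        step = max(1, len(content) // max_checkpoints)
        for marked, i in enumerate(range(step - 1, len(content), step)):
            if marked == max_checkpoints:
                break
            # Anthropic-style prompt-cache marker on this block
            content[i]["cache_control"] = {"type": "ephemeral"}
    # tails come after all checkpoints → always fresh, never cached
    content += [{"type": "text", "text": t} for t in (*sources_tail, *announce_tail)]
    return content

content = render_with_checkpoints(
    [f"block {i}" for i in range(10)], ["[SOURCES POOL] ..."], ["[ANNOUNCE] iteration 3/8"])
print(sum("cache_control" in c for c in content))  # → 3
```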

Compaction (edit-over-rewrite framing)

When the visible timeline exceeds the model budget, earlier blocks are compacted into a conv.range.summary block at the cut point. The original blocks are removed from the persisted payload; future renders start from the summary forward. This is an edit-over-rewrite approach — the timeline is surgically trimmed, not discarded and restarted.

TTL-based context expiry efficiency

When cache_ttl_seconds is set in the session configuration, blocks from previous turns are replaced with compact truncated placeholders after the TTL expires. A system.message block is appended to explain the pruning and guide the agent to restore specific paths via react.read(path). This mechanism keeps the active context window efficient for long-lived sessions without losing the ability to recall prior content.

react.memsearch — Persistent recall

Even after compaction or TTL pruning, prior turn content is indexed semantically. react.memsearch allows the agent to query that index for relevant snippets from any historical turn, surfacing them with scores and timestamps. The agent can then selectively reload full content via react.read.

# Docs pattern: semantic recall from past turns
react.memsearch(
    query="database schema discussed last week",
    targets=["assistant", "user"],
    top_k=5,
    days=30
)
# Returns: [{turn_id, text, score, ts}, ...]

react.hide — Selective context pruning

Within a single turn, the agent can replace large blocks in the editable tail window with short placeholders using react.hide. The original content remains accessible via react.read. This is useful for hiding large file artifacts that are no longer needed in the active context window, freeing space for new tool results.

⚠️

react.hide is constrained to the editable tail window — blocks before the pre-tail cache checkpoint cannot be hidden. This prevents cache invalidation from inadvertent edits to stable prefixes.
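The tail-window constraint can be sketched as an index check before the placeholder swap. This is illustrative only: the block schema, index-based window, and notice wording are assumptions, and the real runtime identifies the pre-tail checkpoint differently.

```python
def hide_block(blocks, path, pre_tail_checkpoint):
    """Sketch of react.hide semantics: only blocks at or after the pre-tail
    cache checkpoint index may be replaced with a placeholder."""
    for i, b in enumerate(blocks):
        if b["path"] == path:
            if i < pre_tail_checkpoint:
                # editing the stable prefix would invalidate the cache — refuse
                return {"type": "react.notice",
                        "text": f"{path} is before the editable tail window; cannot hide"}
            # keep the original retrievable; emit a short placeholder in-context
            b["hidden_text"], b["text"] = b["text"], f"[hidden — react.read('{path}') to restore]"
            return {"type": "react.ack", "text": f"hidden {path}"}
    return {"type": "react.notice", "text": f"unknown path {path}"}
```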

Plan snapshots

Plans are tracked as explicit react.plan JSON snapshot blocks in the timeline, updated each time the decision agent acknowledges progress in its reasoning notes. A human-readable react.plan.ack block is emitted alongside. The active plan is surfaced in the announce section each round; closed or completed plans remain in the timeline history but are no longer re-announced.

# Plan snapshot block (docs pattern, simplified)
{
  "type": "react.plan",
  "mime": "application/json",
  "path": "ar:<turn_id>.react.plan.<plan_id>",
  "text": {
    "plan_id": "plan_abc123",
    "origin_turn_id": "turn_001",
    "steps": [
      { "n": 1, "label": "Gather sources",  "status": "done" },
      { "n": 2, "label": "Draft report",    "status": "pending" }
    ]
  }
}
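The announce-side rendering of a plan snapshot like the one above can be sketched with the ✓ / ✗ / □ markers mentioned earlier. The function name and the status vocabulary beyond done/pending are assumptions.

```python
MARKERS = {"done": "✓", "failed": "✗", "pending": "□"}

def render_plan_announce(plan: dict) -> str:
    """Sketch: render an active plan's steps with announce-style markers."""
    lines = [f"plan {plan['plan_id']} (from {plan['origin_turn_id']}):"]
    for step in plan["steps"]:
        lines.append(f"  {MARKERS.get(step['status'], '□')} {step['n']}. {step['label']}")
    return "\n".join(lines)

plan = {"plan_id": "plan_abc123", "origin_turn_id": "turn_001",
        "steps": [{"n": 1, "label": "Gather sources", "status": "done"},
                  {"n": 2, "label": "Draft report", "status": "pending"}]}
print(render_plan_announce(plan))
```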

Code Examples

ℹ️

Examples below are labeled either FROM DOCS PATTERN (directly reproduced or lightly formatted from SDK documentation) or ILLUSTRATIVE ADAPTATION (derived structure; not a verbatim API contract).

1 — Channel definition and listener setup

FROM DOCS PATTERN
from kdcube_ai_app.apps.chat.sdk.streaming.versatile_streamer import (
    ChannelSpec, ChannelSubscribers, stream_with_channels
)

# Define channels emitted by the model
channels = [
    ChannelSpec(
        name="answer",
        format="markdown",
        replace_citations=True,
        emit_marker="answer",
    ),
    ChannelSpec(
        name="usage",
        format="json",
        replace_citations=False,
        emit_marker="answer",
    ),
]

# Attach a subscriber to the "usage" channel for side-effect fanout
subscribers = ChannelSubscribers().subscribe("usage", _usage_json_fanout)

# Invoke streaming (docs pattern; see versatile_streamer.py)
results, meta = await stream_with_channels(
    svc=svc,
    messages=[system_msg, user_msg],
    role="answer.generator.regular",
    channels=channels,
    emit=_emit_wrapper,
    agent="my.agent",
    artifact_name="report",
    sources_list=sources_list,
    subscribers=subscribers,
    return_full_raw=True,
)

answer_raw   = results["answer"].raw
used_sources = results["answer"].used_sources
service_err  = (meta or {}).get("service_error")

2 — Multi-channel streaming round with composite JSON

FROM DOCS PATTERN
from kdcube_ai_app.apps.chat.sdk.streaming.versatile_streamer import ChannelSpec, stream_with_channels
from kdcube_ai_app.apps.chat.sdk.streaming.artifacts_channeled_streaming import CompositeJsonArtifactStreamer

# Answer channel carries JSON; composite streamer fans it out to canvas
channels = [
    ChannelSpec(name="answer", format="json", replace_citations=False, emit_marker="answer"),
    ChannelSpec(name="usage",  format="json", replace_citations=False, emit_marker="answer"),
]

results, meta = await stream_with_channels(
    svc=svc,
    messages=[system_msg, user_msg],
    role="answer.generator.regular",
    channels=channels,
    emit=emit_delta,
    agent="my.agent",
    artifact_name="my.json.artifact",
    composite_cfg={"artifactA": "path.to.schema"},
    composite_channel="answer",   # route JSON channel into composite streamer
    composite_marker="canvas",    # emit per-attribute deltas to canvas
)

3 — Source referencing across documents

ILLUSTRATIVE ADAPTATION
# Step 1: Sources are accumulated across turns into conv:sources_pool.
# The pool is loaded at turn start via ctx_browser.load_timeline().

# Step 2: Decision agent loads specific sources by SID range (docs pattern)
# Inside the react decision loop:
react.read(["so:sources_pool[1-5]"])   # loads sources 1 through 5
react.read(["so:sources_pool[1,3,7]"]) # loads sources 1, 3, 7

# Step 3: Agent cites sources inline in generated text
# The streamer replaces [[S:n]] tokens during streaming (never in stored raw)
answer_text = """
According to the analysis [[S:1]], the platform supports
multi-channel streaming [[S:2,3]]. Cache efficiency results
in up to 60% cost reduction [[S:1]].
"""

# Step 4: Client rehydrates citations using sources_used SIDs from the artifact
# matched against current sources_pool in timeline.json
used_sids = results["answer"].used_sources  # e.g. [1, 2, 3]
pool      = timeline_json["sources_pool"]
refs      = [s for s in pool if s["sid"] in used_sids]

4 — react.memsearch for persistent recall

FROM DOCS PATTERN
# Semantic search over past turns — useful after compaction or TTL pruning
# Inside the react decision loop (react.memsearch is a react.* in-loop tool):

react.memsearch(
    query="authentication configuration from last session",
    targets=["assistant", "user"],
    top_k=5,
    days=90
)

# Returns blocks like:
# [{"turn_id": "turn_123", "text": "...", "score": 0.84, "ts": "2026-02-01T...Z"}]

# Then restore the full content if needed:
react.read(["fi:turn_123.files/config.yaml"])

What Makes ReAct v2 Architecturally Different

⚠️

Conceptual comparison only. KDCube does not publish direct benchmarks or first-party technical comparisons with third-party systems. The table below focuses on the design properties that make this runtime distinct: timeline-first context, the attention area, cache-aware context shaping, layered memory, and explicit workspace control. These are architectural contrasts, not benchmark claims.

| Architectural property | KDCube ReAct v2 | Common general-purpose agentic pattern |
|------------------------|-----------------|----------------------------------------|
| Primary context model | Timeline-first. The agent reads one ordered block stream that can contain user prompts, tool activity, plans, source updates, runtime notices, feedback, and other events on the same temporal surface. | Usually transcript-first. The dominant abstraction is a message array plus assistant/tool-call/tool-result turns, with other runtime facts injected ad hoc. |
| Shared event landscape | The runtime explicitly models that not all important events are caused by the agent. Steer, feedback, pruning notices, source-pool updates, workspace publish notices, and service alerts can all appear beside tool-caused events. | Usually optimized for assistant-caused tool use. External or runtime-originated events tend to be bolted on as extra prompt text or ignored by the core protocol. |
| Attention area / signal board | A fixed tail surface keeps SOURCES POOL and ANNOUNCE directly in front of the model. High-priority runtime signals, budget, time, plan state, and workspace status always appear in the same place. | No formal attention-area concept. Current state is usually scattered across system prompts, hidden middleware state, or freshly rebuilt messages each round. |
| Streaming contract | Channeled generation. One model call can drive thinking, structured decision output, and code as separate streams that widgets and runtime components consume live. | Usually one assistant stream, or multiple independent calls stitched together after the fact. Tool intent and code payloads often compete for the same output channel. |
| Cache-aware context shaping | Cached system message + three moving timeline checkpoints. The stable prefix stays reusable while the active tail remains editable. react.hide, TTL pruning, and compaction all work within that design. | Depends on provider and framework. Many systems rebuild the full context each round and treat hiding/pruning/summarization as external maintenance rather than part of the runtime contract. |
| Memory architecture | Memory is layered: timeline, attention area, turn log, workspace, hosted artifacts, indexed conversation memory, compaction summaries, hidden replacement text, plans, notes, and feedback all play distinct roles and are reopened through logical paths. | Often presented as one generic memory layer or one RAG step. Different memory roles usually require custom app logic rather than being first-class in the runtime. |
| Workspace control | Workspace state is explicit. react.pull(...) brings historical refs locally as readonly material; react.checkout(mode="replace"\|"overlay", ...) defines what is materialized into the active current-turn workspace. files/... and outputs/... are separate namespaces. Both custom and git backends are supported. | Often the local filesystem is treated as the implicit truth. Historical versions, active workspace state, and produced artifacts are commonly mixed together or left to app-specific conventions. |
| Working memory and decision trace | react.plan, react.plan.ack, react.notes, and internal note lines tagged with [P], [D], and [S] let the runtime persist preferences, rationale, and technical structure as part of the conversation memory model. | Usually left in transient prompt text or app-specific side stores. The runtime rarely gives these forms stable paths and a clear relationship to the main context artifact. |
| Runtime-owned enforcement and feedback | Tool protocol is owned by the SDK runtime. Validation failures become visible notices inside the loop, so the agent can correct itself against runtime rules rather than only provider schemas. | Validation is often pushed down to provider JSON schema or external middleware. Runtime feedback inside the next model round is commonly absent or inconsistent. |
| Execution model | Designed for distributed isolated execution. Generated code can run in controlled local or remote sandboxes while still seeing logical paths, workspace rules, and runtime-owned tool proxies. | Frequently tied to a single host process, a personal local agent model, or a thin wrapper around provider-native tool calling with less explicit isolation semantics. |
| Open source / self-hostable | Available via the KDCube open-source AI app repository. Deployable to Kubernetes, local, or cloud infrastructure. | Varies. Some frameworks are open source; many polished hosted agent experiences are not self-hostable. |
💡

ReAct v2 is designed as a production-first SDK. The distinctive part is not only that it can call tools. It is that the runtime owns the event landscape, the attention area, cache behavior, memory surfaces, and workspace semantics around those tools.