Agent Loops - Orion Architecture

This page covers what actually happens between “user sends a message” and “agent finishes.” The Two Engines page covered routing and parity at a high level. This page traces the iteration loop step-by-step in each engine and shows where every other system (hooks, subagents, skills, system prompts) plugs in.

The two engines run the iteration loop in fundamentally different process models. Claude SDK spawns cli.js as a subprocess and the sidecar observes it via an AsyncIterable<SDKMessage>. Pi runs AgentSession in-process and the sidecar observes via session.subscribe(). Same agent semantics, very different code.

Loop sequence — both engines side-by-side

The core loop in code

Claude SDK
Pi

Claude SDK: the loop body in handler.mjs:1188-1294:

for await (const message of q) {
  // CRITICAL: intercept a terminal-error result BEFORE processSdkMessage emits it
  if (message.type === "result" && message.subtype !== "success") {
    const err = new Error(/* ... */);
    err.sdkSubtype = message.subtype;
    throw err;  // outer catch can stale-session-retry
  }
  // Translate the SDK message into IPC events, spans, and token counters
  processSdkMessage(id, message, startTime, trace, traceId, {
    activeToolSpans, langfuseEnabled, startObservation,
    contextTokenTracker, responseTextAccumulator, apiTurnTracker,
    thinkingAccumulator, model, queryContext, sessionTracker,
    memoryInvoke, subagentWriter,
  });
  // Capture the terminal summary only on a successful result
  if (message.type === "result" && message.subtype === "success") {
    resultData = { sessionId, costUsd, inputTokens, /* ... */ stopReason };
  }
}

The SDK subprocess (cli.js) is the actual agent loop. It calls Anthropic’s /messages endpoint, receives tool_use blocks, executes tools (sandboxed Bash, MCP, etc.), submits tool_results, and repeats until stop_reason !== 'tool_use'. The sidecar’s iterateQuery is purely a translator/observer.

Per-message dispatch (message-processor.mjs:986-1086) handles:

assistant → handleAssistantMessage (tool_use blocks → emit toolStart)
stream_event → text/thinking deltas, usage tracking, api-turn spans
user → tool_result blocks → emit toolComplete
result → terminal complete IPC with stop_reason
system → init, compact_boundary, task_started, hook_started, etc.
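
The ordering in the Claude SDK loop (intercept a failed result first, dispatch second) can be sketched with a stand-in message stream. This is a minimal sketch under stated assumptions: `fakeQuery`, `dispatch`, and `iterate` are hypothetical names, the message shapes are flattened for brevity, and only the type/subtype checks and the IPC event names come from the text above.

```javascript
// Hypothetical stand-in for the SDK's AsyncIterable<SDKMessage>.
async function* fakeQuery(messages) {
  for (const message of messages) yield message;
}

// Simplified dispatcher: maps message types to the IPC event names listed
// above. Shapes are flattened; stream_event and system are omitted here.
function dispatch(message, emit) {
  const blocks = message.content ?? [];
  if (message.type === "assistant") {
    for (const b of blocks) {
      if (b.type === "tool_use") emit({ event: "toolStart", toolId: b.id });
    }
  } else if (message.type === "user") {
    for (const b of blocks) {
      if (b.type === "tool_result") emit({ event: "toolComplete", toolId: b.tool_use_id });
    }
  } else if (message.type === "result") {
    emit({ event: "complete", stopReason: message.stop_reason });
  }
}

async function iterate(q, emit) {
  let resultData = null;
  for await (const message of q) {
    // Throw before dispatching so no "complete" IPC goes out for a failed run.
    if (message.type === "result" && message.subtype !== "success") {
      const err = new Error(`query ended with subtype ${message.subtype}`);
      err.sdkSubtype = message.subtype;
      throw err;
    }
    dispatch(message, emit);
    if (message.type === "result") resultData = { stopReason: message.stop_reason };
  }
  return resultData;
}
```

Because the throw happens before `dispatch`, a caller that catches the error and retries never has to retract a terminal event that was already sent.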

Pi: the loop body in pi-session.mjs:1002-1005:

// Translate each Pi event to IPC: text, thinking, toolStart, toolComplete, modelChange
session.subscribe((event) => { /* translate to IPC */ });
await session.prompt(prompt);                  // hands the prompt to the Pi SDK
const endEvent = await completionPromise;      // waits for agent_end

The actual iteration is inside session (Pi SDK’s AgentSession). The sidecar observes via subscribe(). Pi event types and what they emit:

Pi event	Sidecar reaction
`message_update` w/ `text_delta`	accumulate + emit `text` IPC
`message_update` w/ `thinking_delta`	accumulate + emit `thinking` IPC
`tool_execution_start`	generate stable `toolId`, emit `toolStart`
`tool_execution_end`	emit `toolComplete`; persist to `task_run_messages`
`compaction_start`/`end`	synthetic `__pi_compaction__` tool card
`auto_retry_start`/`end`	synthetic `__pi_retry__` tool card
`agent_end`	resolve `completionPromise` — exit loop

session.getSessionStats() is polled once at the end for usage and cost. Its counters are cumulative, so for a reused (cached) session the pre-prompt snapshot is subtracted from the post-prompt stats.
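
The subscribe, prompt, await pattern for Pi can be sketched with a fake session. This is a minimal sketch under stated assumptions: `FakeStreamSession` and `runOnce` are hypothetical, and the event payload shape (`delta.type`, `delta.text`) is invented for illustration; only the event names `message_update`, `text_delta`, and `agent_end` come from the table above.

```javascript
// Hypothetical in-process session: subscribe() registers a listener and
// prompt() pushes a few events ending in agent_end.
class FakeStreamSession {
  constructor() {
    this.listeners = [];
  }
  subscribe(fn) {
    this.listeners.push(fn);
  }
  async prompt(text) {
    const events = [
      { type: "message_update", delta: { type: "text_delta", text: `echo: ${text}` } },
      { type: "agent_end" },
    ];
    for (const e of events) for (const fn of this.listeners) fn(e);
  }
}

async function runOnce(session, prompt, emit) {
  let resolveEnd;
  const completionPromise = new Promise((resolve) => {
    resolveEnd = resolve;
  });
  let text = "";
  // Subscribe before prompting so no early event is missed.
  session.subscribe((event) => {
    if (event.type === "message_update" && event.delta?.type === "text_delta") {
      text += event.delta.text; // accumulate
      emit({ event: "text", text: event.delta.text });
    } else if (event.type === "agent_end") {
      resolveEnd(event); // the only exit condition
    }
  });
  await session.prompt(prompt);
  const endEvent = await completionPromise;
  return { text, endEvent };
}
```

The loop itself never appears in sidecar code here: the session iterates internally, and the sidecar only learns it is done when `agent_end` resolves the promise.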

Hooks within the loop

Every hook event has an insertion point in the loop. Here’s the full table:

The Pi hook adapter

Pi can’t natively consume Claude’s hook config. The bridge is pi-hook-adapter.mjs::createOrionHookExtension, an Extension factory that gets wired in via DefaultResourceLoader.extensionFactories. It listens to Pi’s native events (pi.on('tool_call'), pi.on('before_agent_start'), etc.) and translates them onto the 7 Claude-style hook events. The SAME prompts/settings.json config feeds both engines.

Pi-only hooks (no Claude analog):

pi.on('context') → pruneOldToolResults(messages, keepTurns=5) — runs on every context refresh
Tool middleware via applyMiddleware(tools, [env-isolation, project-detect, lifecycle]) — wraps tool.execute (different from PreToolUse which fires at the SDK layer)
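
Both Pi-only mechanisms above can be illustrated in a few lines. This is a sketch under stated assumptions, not the project's implementation: the pruning rule (a turn starts at each user message, old tool results become a stub), the `role` values, and the middleware signature `(tool, next) => execute` are all hypothetical. Only the function names and the `keepTurns=5` default come from the text.

```javascript
// Hypothetical pruning rule: tool results older than the last `keepTurns`
// user turns are replaced by a short stub; everything else is untouched.
function pruneOldToolResults(messages, keepTurns = 5) {
  let turnsSeen = 0;
  let cutoff = 0; // index of the oldest message that is kept intact
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user" && ++turnsSeen === keepTurns) {
      cutoff = i;
      break;
    }
  }
  return messages.map((m, i) =>
    i < cutoff && m.role === "toolResult" ? { ...m, content: "[pruned]" } : m,
  );
}

// Hypothetical middleware shape: (tool, next) => async (args) => result.
// The first middleware in the array ends up outermost.
function applyMiddleware(tools, middlewares) {
  return tools.map((tool) => {
    const execute = middlewares.reduceRight(
      (next, mw) => mw(tool, next),
      (args) => tool.execute(args),
    );
    return { ...tool, execute };
  });
}
```

Wrapping `tool.execute` means the middleware runs inside the tool call itself, which is why the text distinguishes it from PreToolUse firing at the SDK layer.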

See Hooks for the full hook execution model.

Subagents within the loop

Subagent dispatch is parent-blocking on both engines — the parent agent cannot make its next LLM call until the child returns.

Claude SDK
Pi

Claude SDK: the native Task/Agent tool is registered by the SDK preset. The SDK subprocess handles the spawn entirely; the sidecar observes:

assistant.tool_use w/ name=Task → handleAssistantMessage (message-processor.mjs:191-206) detects via isTaskTool(block.name) and emits subagentStart IPC
system w/ subtype: 'task_started' → emits taskStarted IPC with subagent metadata
system w/ subtype: 'task_progress' (background tasks) → emits taskProgress IPC
parent_tool_use_id set natively on every assistant/user message during child run — sidecar forwards on toolStart/toolComplete
user.tool_result w/ tool_use_id === Task block.id → emits subagentComplete
subagentWriter (SubagentJsonlWriter at handler.mjs:2168-2170) persists every event with non-null parentToolUseId to .orion/subagent-events/{conversationId}/events.jsonl
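
The start/complete pairing on the Claude side comes down to matching a Task tool_use id against a later tool_result. A minimal sketch, assuming flattened message shapes and a hypothetical `createSubagentObserver` helper; the body of `isTaskTool` is a guess based on the "Task/Agent" naming above, not the real predicate.

```javascript
// Assumption: the subagent tool may appear under either of these names.
const isTaskTool = (name) => name === "Task" || name === "Agent";

function createSubagentObserver(emit) {
  const openTasks = new Set(); // tool_use ids of in-flight Task calls
  return function observe(message) {
    for (const block of message.content ?? []) {
      if (message.type === "assistant" && block.type === "tool_use" && isTaskTool(block.name)) {
        openTasks.add(block.id);
        emit({ event: "subagentStart", toolId: block.id });
      } else if (
        message.type === "user" &&
        block.type === "tool_result" &&
        openTasks.has(block.tool_use_id)
      ) {
        openTasks.delete(block.tool_use_id);
        emit({ event: "subagentComplete", toolId: block.tool_use_id });
      }
    }
  };
}
```

Tool results for ordinary tools pass through without a subagent event, because their ids were never added to the set.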

Pi: spawn_subagent is a custom Pi tool with 3 modes: single, chain, parallel. From pi-tools.mjs:370-505:

single: await runSubagent(agentName, prompt, {...sharedOpts, toolId: subagentToolId})
chain: sequentially await runSubagent() for each step, template-substituting {previous} and {chain_dir} between
parallel: bounded concurrency 4 via pi-parallel-executor.mjs
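
The chain and parallel modes can be sketched independently of the Pi SDK. This is an illustrative sketch, not the contents of pi-parallel-executor.mjs: `runBounded` and `runChain` are hypothetical names, the `run(agent, prompt)` callback stands in for runSubagent, and `{chain_dir}` substitution is left out.

```javascript
// Runs async task factories with at most `limit` in flight.
// Results keep the input order regardless of completion order.
async function runBounded(tasks, limit = 4) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim an index before awaiting
      results[i] = await tasks[i]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Chain mode: each step's prompt may reference the previous step's output.
async function runChain(steps, run) {
  let previous = "";
  for (const step of steps) {
    // A replacer function avoids "$" patterns in `previous` being interpreted.
    const prompt = step.prompt.replaceAll("{previous}", () => previous);
    previous = await run(step.agent, prompt);
  }
  return previous;
}
```

A worker pool of this kind rejects as soon as one task throws; whether the real executor collects per-task failures instead is not stated in the text.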

runSubagent() (pi-subagent-runner.mjs:922-1166):

Resolve agent via getAgentDefForSubagent(agentName, agentsDir) — D9 filter (AGENTS.md only, no SOUL/MEMORY)
Archive check via isAgentArchived()
Depth guard via checkSubagentDepth(callId, maxDepth); maxDepth defaults to DEFAULT_MAX_SUBAGENT_DEPTH
Model fallback loop: buildModelCandidates(primaryModel, agentDef.fallbackModels)
Emit subagentStart → incrementDepth(callId)
Try each candidate: await runSingleAttempt({modelStr, task, agentDef, options})
finally: decrementDepth(callId) + emit subagentComplete
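
The control flow of the steps above (guard, increment, try each candidate, always decrement) can be sketched as follows. This is a sketch under assumptions: `runWithFallback` and the `depthByCall` map are hypothetical, agent resolution and the archive check are omitted, and treating every thrown error as a reason to try the next model is a simplification that the source does not confirm.

```javascript
const depthByCall = new Map(); // hypothetical per-call depth counter

async function runWithFallback({ callId, maxDepth, candidates, attempt, emit }) {
  const depth = depthByCall.get(callId) ?? 0;
  if (depth >= maxDepth) {
    throw new Error(`subagent depth ${depth} reached limit ${maxDepth}`);
  }
  emit("subagentStart");
  depthByCall.set(callId, depth + 1); // incrementDepth
  try {
    let lastError;
    for (const model of candidates) {
      try {
        return await attempt(model); // first success wins
      } catch (err) {
        lastError = err; // fall through to the next candidate
      }
    }
    throw lastError ?? new Error("no model candidates");
  } finally {
    // Runs on success and on failure, so depth never leaks.
    depthByCall.set(callId, depth); // decrementDepth
    emit("subagentComplete");
  }
}
```

Putting the decrement and the subagentComplete emit in `finally` keeps start and complete events paired even when every candidate fails.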

runSingleAttempt() (L324-680):

Creates a fresh in-process AgentSession with SessionManager.inMemory(), SettingsManager.inMemory({compaction:{enabled:false}}), DefaultResourceLoader({noExtensions:true, noPromptTemplates:true})
applySystemPromptOverride(session, effectiveSystemPrompt) — triple-patch per OpenClaw pattern (agent.state.systemPrompt + _baseSystemPrompt + _rebuildSystemPrompt)
session.subscribe(event => emit with parentToolUseId) — parentToolUseId threaded as options.toolId
await session.prompt(task) + await completionPromise
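
Why the override patches three places can be shown with a toy session. Everything here is hypothetical: `FakePromptSession`, its `beforeAgentStart` method, and the placement of the three fields on the session object are assumptions made for illustration. Only the three field names come from the text.

```javascript
// Toy session whose per-turn rebuild restores the prompt from its base copy.
class FakePromptSession {
  constructor(defaultPrompt) {
    this.agent = { state: { systemPrompt: defaultPrompt } };
    this._baseSystemPrompt = defaultPrompt;
    this._rebuildSystemPrompt = () => this._baseSystemPrompt;
  }
  // Stand-in for a per-turn rebuild step.
  beforeAgentStart() {
    this.agent.state.systemPrompt = this._rebuildSystemPrompt();
  }
}

// Assumed shape of the triple patch: current value, base copy, rebuild hook.
function applySystemPromptOverride(session, prompt) {
  session.agent.state.systemPrompt = prompt;
  session._baseSystemPrompt = prompt;
  session._rebuildSystemPrompt = () => prompt;
}
```

In this toy model, patching only `agent.state.systemPrompt` would be undone by the next rebuild, which is the failure a three-way patch guards against.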

See Subagents for the data shape and SubagentJsonlWriter details.

System prompt composition at loop start

Both engines compose their system prompt per query. Each appends roughly the same content but uses very different mechanisms.

Claude SDK
Pi

Claude SDK: built in handler.mjs:2973-3180:

Base: getDefaultSystemPrompt() returns {type:'preset', preset:'claude_code', append: <orion personality from prompts/system/*.md via loader.mjs>}
+ Top-level agent body: resolveTopLevelAgentBody(agentSlug) appends \n\n${agent AGENTS+SOUL+MEMORY body} (D9 invariant for top-level chat sessions)
+ Stale-session retry context (if applicable): buildContextSummary(conversationId) appended
+ Plugin connector prompt: buildConnectorPrompt(allConnectors) from plugins/connector-resolver.mjs
+ Web search guidance: ~50 lines documenting mcp__exa__* and mcp__firecrawl__* (when enabled)
+ Vault path override: “PARA Workspace Root” note pinning paths when orionHome !== ~/Orion

Result: sdkOptions.systemPrompt = { type:'preset', preset:'claude_code', append: <accumulated string> }. The SDK prepends its built-in claude_code preset (CLI engine internal) and adds the append string after it.

Skills: loaded natively by the SDK from CLAUDE_CONFIG_DIR/skills/ (= prompts/skills/ via the session-config-dir symlink). See Skills.

Agents: sdkOptions.agents = {...folderLoaderAgents, ...options.agents} (handler.mjs:1615-1618). The AgentDefinition.prompt field is AGENTS.md-only per the D9 invariant.
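
The Claude-side accumulation of the append string can be sketched as a join over optional sections. A minimal sketch, assuming hypothetical names (`composeAppend`, `buildSystemPromptOption`, and the parameter names) and a blank-line separator between sections; only the resulting `{ type, preset, append }` shape is taken from the text.

```javascript
// Joins the sections that apply to this query, in order, skipping absent ones.
function composeAppend(parts) {
  return parts.filter((p) => typeof p === "string" && p.length > 0).join("\n\n");
}

function buildSystemPromptOption({
  personality,
  agentBody,
  retryContext,
  connectorPrompt,
  webSearchGuidance,
  vaultNote,
}) {
  return {
    type: "preset",
    preset: "claude_code",
    append: composeAppend([
      personality,
      agentBody,
      retryContext,
      connectorPrompt,
      webSearchGuidance,
      vaultNote,
    ]),
  };
}
```

Conditional sections such as the stale-session retry context simply arrive as `undefined` and drop out of the join.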

Pi: built by buildOrionSystemPrompt:

Identity: “You are Orion Butler…”
Tools list: hardcoded CODING_TOOLS + customToolDocs
Guidelines: “Prefer grep/find/ls over bash…”
Personality: from loadFullSystemPrompt(promptsDir, {engine:'pi'}) — loads prompts/system/*.md sorted alphabetically, skips 08-tools.md (Claude-SDK MCP-tool docs, Pi-incompatible)
Active Agent Identity: agentBody (D9-filtered AGENTS+SOUL+MEMORY when agentSlug resolved)
Agent catalog: compact list for spawn_subagent discovery
MCP proxy section: teaches the model the mcp({tool, args}) / mcp({search}) / mcp({describe}) parameter modes (Pi uses a single mcp proxy tool)
Vault path override
Cross-engine context: when switching from Claude → Pi mid-conversation

Result: a single string fed via resourceLoader.systemPromptOverride: () => finalPrompt, then await resourceLoader.reload() materializes it. The Pi SDK’s _baseSystemPrompt is set from this; the SDK’s before_agent_start rebuilds the per-turn system prompt from _baseSystemPrompt (so first-turn context persists across reused sessions).
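
The engine-aware skip list used when loading the personality files can be sketched over a list of file names. A minimal sketch: `selectPromptFiles` and `SKIP_BY_ENGINE` are hypothetical, and every file name other than 08-tools.md in the usage below is invented for illustration.

```javascript
// Assumption: one skip set per engine; only the 08-tools.md entry is from the text.
const SKIP_BY_ENGINE = { pi: new Set(["08-tools.md"]) };

function selectPromptFiles(fileNames, engine) {
  const skip = SKIP_BY_ENGINE[engine] ?? new Set();
  return fileNames
    .filter((name) => name.endsWith(".md") && !skip.has(name))
    .sort(); // default sort orders numeric prefixes like 01-, 02- as expected
}

// Usage with invented file names:
const files = ["08-tools.md", "01-identity.md", "03-style.md", "notes.txt"];
console.log(selectPromptFiles(files, "pi")); // [ '01-identity.md', '03-style.md' ]
```

Keeping the skip list keyed by engine lets both engines share one prompts directory while each omits the sections it cannot use.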

Per-turn token accounting

Claude SDK
Pi

Claude SDK:

Per-turn input tokens from stream_event.message_start.usage.{input_tokens, cache_creation_input_tokens, cache_read_input_tokens}. In multi-turn queries the LAST turn wins, since it represents the actual context window at the end.
Per-turn output tokens from stream_event.message_delta.usage.output_tokens → emits usageUpdate IPC
Aggregate from terminal result.usage
Real-time context display: emits contextTokens IPC on each message_start
Cache cues: cache creation/read tokens surfaced separately on every event
Cost: message.total_cost_usd from terminal result (billed worker-side via proxy-* prefix)
Langfuse: one api-turn-{N} generation span per message_start, closed on next message_start. One thinking-block span per content_block_stop.
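
The "last turn wins" rule on the Claude side can be sketched as a small tracker over stream events. This is a sketch under assumptions: `createTokenTracker` is hypothetical, the event objects are the inner streaming events (already unwrapped from stream_event), and summing the three input fields into one context figure is an interpretation of "actual context window", not something the text states.

```javascript
function createTokenTracker(emit) {
  const state = { contextTokens: 0, outputTokens: 0, apiTurns: 0 };
  return {
    onStreamEvent(event) {
      if (event.type === "message_start") {
        const u = event.message?.usage ?? {};
        // Overwrite rather than sum across turns: the last turn wins.
        state.contextTokens =
          (u.input_tokens ?? 0) +
          (u.cache_creation_input_tokens ?? 0) +
          (u.cache_read_input_tokens ?? 0);
        state.apiTurns += 1;
        emit({ event: "contextTokens", tokens: state.contextTokens });
      } else if (event.type === "message_delta") {
        state.outputTokens = event.usage?.output_tokens ?? state.outputTokens;
        emit({ event: "usageUpdate", outputTokens: state.outputTokens });
      }
    },
    snapshot: () => ({ ...state }),
  };
}
```

Summing input across turns would double-count, because each turn's input already contains the earlier conversation.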

Pi:

Per-turn breakdown: none. Only session.getSessionStats() is polled at the end (usage.input_tokens, usage.output_tokens).
Reused-session delta: prePromptStats captured before session.prompt(), subtracted from post-stats. Required because getSessionStats() is cumulative.
Cost: rawStats.cost from Pi SDK; billed via reportPiUsage({requestIdPrefix:'cron-pi'}) for cron path
Caching: implicit (Gemini implicit cache, Anthropic-via-Pi explicit) — Pi SDK handles internally
Langfuse: single pi-agent-{provider}/{model} span per query — no per-turn breakdown (asymmetry vs Claude)
No contextTokens IPC during streaming (asymmetry)
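
The reused-session delta on the Pi side is a plain field-wise subtraction. A minimal sketch, assuming a hypothetical `subtractStats` helper and flattened field names (`inputTokens`, `outputTokens`, `cost`) rather than the real stats shape.

```javascript
// `pre` is the snapshot taken before session.prompt(); it is absent
// (null or undefined) when the session is fresh.
function subtractStats(post, pre) {
  if (!pre) return { ...post };
  return {
    inputTokens: post.inputTokens - pre.inputTokens,
    outputTokens: post.outputTokens - pre.outputTokens,
    cost: post.cost - pre.cost,
  };
}

// Usage: a cached session that had already consumed 1000/200 tokens.
const pre = { inputTokens: 1000, outputTokens: 200, cost: 0.5 };
const post = { inputTokens: 1500, outputTokens: 300, cost: 0.75 };
console.log(subtractStats(post, pre)); // { inputTokens: 500, outputTokens: 100, cost: 0.25 }
```

Without the subtraction, every query on a cached session would be billed for all earlier queries on that session as well.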

Model swaps mid-loop

Scenario	Claude SDK	Pi
Mid-loop swap	No — model fixed for query duration	No at engine level, but `switch_model` tool emits `modelChange` IPC
Between-query swap	Yes — `routeModel(payload.model)` may return different engine	Same — `routeModel()`
Subagent model fallback	N/A (SDK handles internally)	`buildModelCandidates(primaryModel, agentDef.fallbackModels)` — between-attempt retry
Heartbeat fallback (cron)	Auto-Claude-Haiku fallback on Pi auth/safety/EMPTY_QUERY_RESULT — see Background	Same — `runHeartbeatWithFallback`

Open questions

Q1: pruneOldToolResults vs Pi compaction overlap?

pi-hook-adapter.mjs:534 calls pruneOldToolResults(messages, keepTurns=5) on every pi.on('context'). Pi SDK ALSO has its own compaction via SettingsManager.inMemory({compaction:{enabled:true, reserveTokens:12000, keepRecentTokens:30000}}). Looks complementary (per-call pruning of old tool_result blobs vs threshold-triggered compaction of message history) but not verified.

Q2: Does Task tool truly block parent's next LLM call?

Asserted blocking based on SDK semantics, but SDK’s internal scheduling not traced. If the SDK can interleave the parent’s next API call with a Task-spawned child’s work, the loop semantics change.

Q3: pi-live-session-cache TTL / eviction

pi-live-session-cache.mjs governs in-memory AgentSession reuse. Provider mismatch invalidates. TTL/eviction-after-N rules not read in this audit.

Q4: api-turn-N span tool attribution

Per-turn Langfuse spans created on each message_start. Tool spans currently created as top-level under the root chat-query agent span. Whether tools called within a turn correctly nest under that turn’s api-turn-N span needs verification.

Q5: Pi compaction_start trigger conditions

Pi SDK auto-compaction trigger conditions not verified. The sidecar handles compaction_start/end events; whether they fire on auto-compaction, manual /compact, or both is unclear.

Key files

src-tauri/sidecar/query/handler.mjs

handleQueryInternal (L1349-1530), iterateQuery (L1188-1294), sdkOptions build (L2685-2949), programmatic hooks (L2789-2939), system prompt resolution (L2973-3180), dual-engine dispatch (L3238-3550).

src-tauri/sidecar/query/message-processor.mjs

Per-message dispatch (L986-1086), assistant message handler (L97+), stream event handler (L303+), user message handler (L538+), result handler (L729+), system message handler (L834+).

src-tauri/sidecar/engine/pi-query-lifecycle.mjs

runPiAgentQuery (L56-505) — AbortController, task-run row, empty-result guard, Langfuse span, title gen.

src-tauri/sidecar/engine/pi-session.mjs

piAgentQuery (L232-1223) — session create/reuse, sandboxed tools, ResourceLoader/SettingsManager, session.subscribe → IPC translation (L596-994), prompt + agent_end await (L1002-1005), text-recovery fallback + stats (L1007-1149).

src-tauri/sidecar/engine/pi-subagent-runner.mjs

runSubagent (L922-1166), runSingleAttempt (L324-680). Depth guard, archive check, model fallback loop, fresh in-process AgentSession with triple-patch system prompt override.

src-tauri/sidecar/engine/pi-hook-adapter.mjs

createOrionHookExtension (L300-547) — the bridge that translates Pi events into Claude-style hook events. loadHooksConfig, runHookGroup, runHookScript.

src-tauri/sidecar/engine/pi-hooks.mjs

Pi tool middleware (L1-234) — env-isolation, project-detect, lifecycle (turn count). Wraps tool.execute.

src-tauri/sidecar/engine/pi-system-prompt.mjs

buildOrionSystemPrompt (L69-220), buildMcpProxySection — Pi’s parallel to the Claude prompt assembly.

src-tauri/sidecar/prompts/loader.mjs

loadSystemPromptParts, loadToolDescriptions, loadFullSystemPrompt (engine-aware skip-list), loadReminderTemplate.

Next

Hooks

The full hook system — 22 programmatic + declarative, IPC contract, security carve-outs.

Subagents

SubagentJsonlWriter, parent_tool_use_id propagation, D9 body filter, three Pi dispatch modes.