This page covers what actually happens between “user sends a message” and “agent finishes.” The Two Engines page covered routing and parity at a high level. This page traces the iteration loop step-by-step in each engine and shows where every other system (hooks, subagents, skills, system prompts) plugs in.
The two engines run the iteration loop in fundamentally different process models. Claude SDK spawns cli.js as a subprocess and the sidecar observes it via an AsyncIterable<SDKMessage>. Pi runs AgentSession in-process and the sidecar observes via session.subscribe(). Same agent semantics, very different code.
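To make the contrast concrete, here is a minimal sketch of the two observation styles: draining an async iterable (pull) versus registering a listener through subscribe() (push). Both fakeSdkQuery and createFakeSession are hypothetical stand-ins written for this illustration; they are not the real SDK or Pi APIs, and the event shapes are assumptions.

```javascript
// Hypothetical stand-ins: neither function is the real SDK or Pi API.

// Pull model: the engine exposes an async iterable and the observer drains it.
async function* fakeSdkQuery() {
  yield { type: "assistant", text: "hello" };
  yield { type: "result", subtype: "success" };
}

async function observeByIteration(query) {
  const seen = [];
  for await (const message of query) {
    seen.push(message.type);
  }
  return seen;
}

// Push model: the engine calls a listener registered through subscribe().
function createFakeSession() {
  const listeners = new Set();
  return {
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener); // unsubscribe handle
    },
    emit(event) {
      for (const listener of listeners) listener(event);
    },
  };
}

function observeBySubscription(session) {
  const seen = [];
  const unsubscribe = session.subscribe((event) => seen.push(event.type));
  return { seen, unsubscribe };
}
```

In the pull style the observer controls the pace and the loop ends when the iterable ends. In the push style the observer must decide for itself when the run is over, which is presumably why the Pi path waits on an explicit end event.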

Loop sequence — both engines side-by-side

The core loop in code

The loop body in handler.mjs:1188-1294:
for await (const message of q) {
  // CRITICAL: intercept terminal-error result BEFORE processSdkMessage emits it
  if (message.type === "result" && message.subtype !== "success") {
    const err = new Error(...);
    err.sdkSubtype = message.subtype;
    throw err;  // outer catch can stale-session-retry
  }
  processSdkMessage(id, message, startTime, trace, traceId, {
    activeToolSpans, langfuseEnabled, startObservation,
    contextTokenTracker, responseTextAccumulator, apiTurnTracker,
    thinkingAccumulator, model, queryContext, sessionTracker,
    memoryInvoke, subagentWriter,
  });
  if (message.type === "result" && message.subtype === "success") {
    resultData = { sessionId, costUsd, inputTokens, ..., stopReason };
  }
}
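The interception step above can be sketched in isolation. This is a simplified illustration, not the handler.mjs code: terminalErrorFor, iterate, runWithRetry, the resume option, and the shouldRetry predicate are all hypothetical names, and the real stale-session criteria are not shown in the source.

```javascript
// Convert a non-success terminal result into an Error carrying the subtype.
function terminalErrorFor(message) {
  if (message.type !== "result" || message.subtype === "success") return null;
  const err = new Error(`query ended with subtype ${message.subtype}`);
  err.sdkSubtype = message.subtype;
  return err;
}

// Drain the message stream, throwing before a failed result is dispatched.
async function iterate(messages, onMessage) {
  let result = null;
  for await (const message of messages) {
    const err = terminalErrorFor(message);
    if (err) throw err; // thrown before onMessage can emit the failed result
    onMessage(message);
    if (message.type === "result") result = message;
  }
  return result;
}

// Hypothetical outer wrapper: retry once with a fresh session when
// shouldRetry(err) decides the failure looks like a stale session.
async function runWithRetry(startQuery, onMessage, shouldRetry) {
  try {
    return await iterate(startQuery({ resume: true }), onMessage);
  } catch (err) {
    if (!shouldRetry(err)) throw err;
    return await iterate(startQuery({ resume: false }), onMessage);
  }
}
```

Throwing before dispatch matters because once a terminal result has been emitted downstream, a retry would look like a second completion of the same query.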
The SDK subprocess (cli.js) is the actual agent loop. It calls Anthropic’s /messages endpoint, receives tool_use blocks, executes tools (sandboxed Bash, MCP, etc.), submits tool_results, and repeats until stop_reason !== 'tool_use'. The sidecar’s iterateQuery is purely a translator/observer.
Per-message dispatch (message-processor.mjs:986-1086) handles:
  • assistant → handleAssistantMessage (tool_use blocks → emit toolStart)
  • stream_event → text/thinking deltas, usage tracking, api-turn spans
  • user → tool_result blocks → emit toolComplete
  • result → terminal complete IPC with stop_reason
  • system → init, compact_boundary, task_started, hook_started, etc.
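The dispatch list above amounts to a switch on message.type. The sketch below assumes a flattened message shape (content directly on the message) and invented IPC payload fields; only the event names toolStart, toolComplete, and complete come from the list, and dispatchMessage is a hypothetical name.

```javascript
// Hypothetical IPC payloads; only the event names come from the list above.
function dispatchMessage(message, emit) {
  switch (message.type) {
    case "assistant":
      // Each tool_use block announces a tool call that is about to run.
      for (const block of message.content ?? []) {
        if (block.type === "tool_use") {
          emit({ event: "toolStart", toolUseId: block.id, name: block.name });
        }
      }
      break;
    case "user":
      // Tool results come back as user-role messages.
      for (const block of message.content ?? []) {
        if (block.type === "tool_result") {
          emit({ event: "toolComplete", toolUseId: block.tool_use_id });
        }
      }
      break;
    case "result":
      emit({ event: "complete", stopReason: message.stop_reason ?? null });
      break;
    case "stream_event":
    case "system":
      // Deltas, usage tracking, and lifecycle subtypes are omitted here.
      break;
    default:
      break;
  }
}
```

Pairing toolStart and toolComplete on the same tool-use id is what lets a UI show each tool call as a single in-progress item.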

Hooks within the loop

Every hook event has an insertion point in the loop. Here’s the full table:

The Pi hook adapter

Pi can’t natively consume Claude’s hook config. The bridge is pi-hook-adapter.mjs::createOrionHookExtension, an Extension factory that gets wired in via DefaultResourceLoader.extensionFactories. It listens to Pi’s native events (pi.on('tool_call'), pi.on('before_agent_start'), etc.) and translates them onto the 7 Claude-style hook events. The same prompts/settings.json config feeds both engines.
Pi-only hooks (no Claude analog):
  • pi.on('context') → pruneOldToolResults(messages, keepTurns=5), which runs on every context refresh
  • Tool middleware via applyMiddleware(tools, [env-isolation, project-detect, lifecycle]) — wraps tool.execute (different from PreToolUse which fires at the SDK layer)
See Hooks for the full hook execution model.
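The tool-middleware idea (wrapping tool.execute rather than hooking at the SDK layer) can be sketched as function composition. This is an assumption-laden illustration: the (tool, next) middleware signature, the synchronous execute, and the envIsolation and lifecycle bodies are hypothetical, and the name applyMiddleware is reused only for readability.

```javascript
// Wrap each tool's execute with the given middleware chain.
function applyMiddleware(tools, middlewares) {
  return tools.map((tool) => {
    // reduceRight makes the first middleware in the list the outermost wrapper.
    const execute = middlewares.reduceRight(
      (next, middleware) => middleware(tool, next),
      tool.execute,
    );
    return { ...tool, execute };
  });
}

// Example middleware: drop env vars that are not on an allow-list.
function envIsolation(allowed) {
  return (tool, next) => (args) => {
    const env = Object.fromEntries(
      Object.entries(args.env ?? {}).filter(([key]) => allowed.includes(key)),
    );
    return next({ ...args, env });
  };
}

// Example middleware: count how many times any wrapped tool runs.
function lifecycle(counter) {
  return (tool, next) => (args) => {
    counter.calls += 1;
    return next(args);
  };
}
```

Because the wrapper returns new tool objects, the original tool definitions stay untouched and can still be called without the middleware.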

Subagents within the loop

Subagent dispatch is parent-blocking on both engines — the parent agent cannot make its next LLM call until the child returns.
On the Claude SDK engine, the Task/Agent tool is native and registered by the SDK preset. The SDK subprocess handles the spawn entirely; the sidecar observes:
  • assistant.tool_use w/ name=Task → handleAssistantMessage (message-processor.mjs:191-206) detects via isTaskTool(block.name) and emits subagentStart IPC
  • system w/ subtype: 'task_started' → emits taskStarted IPC with subagent metadata
  • system w/ subtype: 'task_progress' (background tasks) → emits taskProgress IPC
  • parent_tool_use_id set natively on every assistant/user message during child run — sidecar forwards on toolStart/toolComplete
  • user.tool_result w/ tool_use_id === Task block.id → emits subagentComplete
  • subagentWriter (SubagentJsonlWriter at handler.mjs:2168-2170) persists every event with non-null parentToolUseId to .orion/subagent-events/{conversationId}/events.jsonl
See Subagents for the data shape and SubagentJsonlWriter details.
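A rough sketch of the observation flow above: pair a Task tool_use with its tool_result to emit start and complete events, and keep only events with a non-null parentToolUseId for the JSONL file. The tool-name set, the event payloads, and the helper names other than isTaskTool are assumptions made for this example; the real writer also touches the filesystem, which is left out here.

```javascript
// Assumption: the subagent tool is named "Task" or "Agent".
const TASK_TOOL_NAMES = new Set(["Task", "Agent"]);

function isTaskTool(name) {
  return TASK_TOOL_NAMES.has(name);
}

// Track open Task calls so the matching tool_result can close them.
function createSubagentTracker(emit) {
  const openTasks = new Set();
  return {
    onToolUse(block) {
      if (!isTaskTool(block.name)) return;
      openTasks.add(block.id);
      emit({ event: "subagentStart", toolUseId: block.id });
    },
    onToolResult(block) {
      // Set.delete returns false when the id was never a Task call.
      if (!openTasks.delete(block.tool_use_id)) return;
      emit({ event: "subagentComplete", toolUseId: block.tool_use_id });
    },
  };
}

// Serialize only child-run events, one JSON object per line.
function toSubagentJsonl(events) {
  return events
    .filter((e) => e.parentToolUseId != null)
    .map((e) => JSON.stringify(e) + "\n")
    .join("");
}

// Path layout taken from the list above.
function subagentEventsPath(conversationId) {
  return `.orion/subagent-events/${conversationId}/events.jsonl`;
}
```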

System prompt composition at loop start

Both engines compose their system prompt per query. Each appends roughly the same content but uses very different mechanisms.
On Claude SDK, the prompt is built in handler.mjs:2973-3180:
  1. Base: getDefaultSystemPrompt() returns {type:'preset', preset:'claude_code', append: <orion personality from prompts/system/*.md via loader.mjs>}
  2. + Top-level agent body: resolveTopLevelAgentBody(agentSlug) appends \n\n${agent AGENTS+SOUL+MEMORY body} (D9 invariant for top-level chat sessions)
  3. + Stale-session retry context (if applicable): buildContextSummary(conversationId) appended
  4. + Plugin connector prompt: buildConnectorPrompt(allConnectors) from plugins/connector-resolver.mjs
  5. + Web search guidance: ~50 lines documenting mcp__exa__* and mcp__firecrawl__* (when enabled)
  6. + Vault path override: “PARA Workspace Root” note pinning paths when orionHome !== ~/Orion
Result: sdkOptions.systemPrompt = { type:'preset', preset:'claude_code', append: <accumulated string> }. The SDK prepends its built-in claude_code preset (CLI engine internal) and places the append string after it.
Skills: loaded natively by SDK from CLAUDE_CONFIG_DIR/skills/ (= prompts/skills/ via session-config-dir symlink). See Skills.
Agents: sdkOptions.agents = {...folderLoaderAgents, ...options.agents} (handler.mjs:1615-1618). AgentDefinition.prompt field is AGENTS.md-only per D9 invariant.
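The accumulation steps can be sketched as joining the optional parts that are present. composeSystemPrompt and its parameter names are hypothetical, and using "\n\n" between every part is an assumption (the source only states that separator for the agent body).

```javascript
// Hypothetical helper: build the preset-plus-append system prompt object.
function composeSystemPrompt({
  personality,
  agentBody,
  retryContext,
  connectorPrompt,
  webSearchGuidance,
  vaultNote,
}) {
  // Order mirrors the numbered steps; absent or empty parts are skipped.
  const parts = [
    personality,
    agentBody,
    retryContext,
    connectorPrompt,
    webSearchGuidance,
    vaultNote,
  ].filter((part) => typeof part === "string" && part.length > 0);

  return {
    type: "preset",
    preset: "claude_code",
    append: parts.join("\n\n"),
  };
}
```

For example, a query with no stale-session retry and web search disabled would simply omit retryContext and webSearchGuidance, and the remaining parts keep their relative order.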

Per-turn token accounting

  • Per-turn input tokens from stream_event.message_start.usage.{input_tokens, cache_creation_input_tokens, cache_read_input_tokens} → LAST turn wins in multi-turn queries (represents actual context window at the end)
  • Per-turn output tokens from stream_event.message_delta.usage.output_tokens → emits usageUpdate IPC
  • Aggregate from terminal result.usage
  • Real-time context display: emits contextTokens IPC on each message_start
  • Cache cues: cache creation/read tokens surfaced separately on every event
  • Cost: message.total_cost_usd from terminal result (billed worker-side via proxy-* prefix)
  • Langfuse: one api-turn-{N} generation span per message_start, closed on next message_start. One thinking-block span per content_block_stop.
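The "last turn wins" rule for input tokens and the per-turn output counts can be sketched with a small tracker. This is an illustration under stated assumptions: createTokenTracker is a hypothetical name, context size is taken as the sum of the three input fields, and message_delta output_tokens is treated as cumulative within a turn.

```javascript
// Hypothetical tracker fed with raw stream events.
function createTokenTracker() {
  let contextTokens = 0; // overwritten on every message_start (last turn wins)
  let closedOutput = 0;  // output tokens from finished turns
  let currentOutput = 0; // latest cumulative count for the open turn
  let turns = 0;

  return {
    onStreamEvent(event) {
      if (event.type === "message_start") {
        // A new turn begins: bank the previous turn's output tokens.
        closedOutput += currentOutput;
        currentOutput = 0;
        turns += 1;
        const usage = event.message?.usage ?? {};
        contextTokens =
          (usage.input_tokens ?? 0) +
          (usage.cache_creation_input_tokens ?? 0) +
          (usage.cache_read_input_tokens ?? 0);
      } else if (event.type === "message_delta") {
        // Assumed cumulative, so the latest value replaces the previous one.
        currentOutput = event.usage?.output_tokens ?? currentOutput;
      }
    },
    snapshot() {
      return { turns, contextTokens, outputTokens: closedOutput + currentOutput };
    },
  };
}
```

Overwriting rather than summing the input side avoids double counting: each turn's input already includes the earlier conversation, so the last turn's figure is the one that reflects the context window.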

Model swaps mid-loop

| Scenario | Claude SDK | Pi |
| --- | --- | --- |
| Mid-loop swap | No: model fixed for query duration | No at engine level, but switch_model tool emits modelChange IPC |
| Between-query swap | Yes: routeModel(payload.model) may return different engine | Same: routeModel() |
| Subagent model fallback | N/A (SDK handles internally) | buildModelCandidates(primaryModel, agentDef.fallbackModels), between-attempt retry |
| Heartbeat fallback (cron) | Auto-Claude-Haiku fallback on Pi auth/safety/EMPTY_QUERY_RESULT (see Background) | Same: runHeartbeatWithFallback |
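A between-attempt model fallback of the kind listed for Pi subagents might look roughly like the following. This is a sketch only: the deduplication rule, the synchronous attempt callback, and runWithModelFallback are assumptions, and buildModelCandidates reuses the name from the table without claiming to match its real behavior.

```javascript
// Primary model first, then fallbacks, with duplicates and blanks removed.
function buildModelCandidates(primaryModel, fallbackModels = []) {
  return [...new Set([primaryModel, ...fallbackModels].filter(Boolean))];
}

// Try each candidate in order; the model is fixed within a single attempt.
function runWithModelFallback(candidates, attempt) {
  const attempts = [];
  for (const model of candidates) {
    try {
      return { model, value: attempt(model) };
    } catch (err) {
      attempts.push({ model, message: err.message });
    }
  }
  const failure = new Error("all model candidates failed");
  failure.attempts = attempts; // keep per-model errors for diagnostics
  throw failure;
}
```

The swap happens only between attempts, which is consistent with the table's point that neither engine changes model in the middle of a running loop.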

Open questions

pi-hook-adapter.mjs:534 calls pruneOldToolResults(messages, keepTurns=5) on every pi.on('context'). Pi SDK ALSO has its own compaction via SettingsManager.inMemory({compaction:{enabled:true, reserveTokens:12000, keepRecentTokens:30000}}). Looks complementary (per-call pruning of old tool_result blobs vs threshold-triggered compaction of message history) but not verified.
Subagent dispatch is asserted to be parent-blocking based on SDK semantics, but the SDK’s internal scheduling was not traced. If the SDK can interleave the parent’s next API call with a Task-spawned child’s work, the loop semantics change.
pi-live-session-cache.mjs governs in-memory AgentSession reuse. A provider mismatch invalidates the cached session. TTL and eviction-after-N rules were not read in this audit.
Per-turn Langfuse spans created on each message_start. Tool spans currently created as top-level under the root chat-query agent span. Whether tools called within a turn correctly nest under that turn’s api-turn-N span needs verification.
Pi SDK auto-compaction trigger conditions not verified. The sidecar handles compaction_start/end events; whether they fire on auto-compaction, manual /compact, or both is unclear.

Key files

handler.mjs: handleQueryInternal (L1349-1530), iterateQuery (L1188-1294), sdkOptions build (L2685-2949), programmatic hooks (L2789-2939), system prompt resolution (L2973-3180), dual-engine dispatch (L3238-3550).
message-processor.mjs: per-message dispatch (L986-1086), assistant message handler (L97+), stream event handler (L303+), user message handler (L538+), result handler (L729+), system message handler (L834+).
runPiAgentQuery (L56-505) — AbortController, task-run row, empty-result guard, Langfuse span, title gen.
piAgentQuery (L232-1223) — session create/reuse, sandboxed tools, ResourceLoader/SettingsManager, session.subscribe → IPC translation (L596-994), prompt + agent_end await (L1002-1005), text-recovery fallback + stats (L1007-1149).
runSubagent (L922-1166), runSingleAttempt (L324-680). Depth guard, archive check, model fallback loop, fresh in-process AgentSession with triple-patch system prompt override.
pi-hook-adapter.mjs: createOrionHookExtension (L300-547), the bridge that translates Pi events into Claude-style hook events. Also loadHooksConfig, runHookGroup, runHookScript.
Pi tool middleware (L1-234) — env-isolation, project-detect, lifecycle (turn count). Wraps tool.execute.
buildOrionSystemPrompt (L69-220), buildMcpProxySection — Pi’s parallel to the Claude prompt assembly.
loadSystemPromptParts, loadToolDescriptions, loadFullSystemPrompt (engine-aware skip-list), loadReminderTemplate.

Next

Hooks

The full hook system — 22 programmatic + declarative, IPC contract, security carve-outs.

Subagents

SubagentJsonlWriter, parent_tool_use_id propagation, D9 body filter, three Pi dispatch modes.