Per-turn lifecycle
A single LLM turn produces a sequence of events that the sidecar streams to the frontend in order. Per-API-turn span construction lives at message-processor.mjs:733. Each turn is tracked separately for token accounting and Langfuse observability.
The mixed-stream parser
When an agent emits a widget spec, the spec doesn’t arrive as a separate event; it arrives interleaved with the assistant text, as JSONL patch lines. The mixed-stream parser separates them. The parser comes from createMixedStreamParser(), initialized at machine start by initSpecStreamParser. On every CHUNK, processSpecStream(text) pushes the chunk through the parser:
- Buffer until a complete line (newline-terminated) is available.
- Try JSON.parse(line); if it succeeds AND has op + path, it’s a patch.
- Patch lines → applied to widget spec via applySpecPatch, then pushSpec to both:
  - canvasStore.updateStreamingSpec(spec): live canvas update
  - inlineWidgetStreamStore.pushSpec(spec): inline widget in chat
- Non-patch lines → pushed to the text activity stream.
Source: spec-stream.ts:88-130.
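The buffering and classification steps above can be sketched as follows. This is a minimal illustration, not the code in spec-stream.ts: the names SketchPatch, asPatch, and createSketchParser are hypothetical, and the real createMixedStreamParser may expose a different interface. It shows only the three rules described: buffer until a newline, treat a line as a patch when it parses as JSON with op and path, and fall back to text otherwise.

```typescript
// Hypothetical shapes; the real ones live in spec-stream.ts and may differ.
type SketchPatch = { op: string; path: string; value?: unknown };

interface SketchParserCallbacks {
  onPatch: (patch: SketchPatch) => void;
  onText: (line: string) => void;
}

// A line counts as a patch only if it parses as a JSON object
// with string `op` and `path` fields.
function asPatch(line: string): SketchPatch | null {
  try {
    const parsed: unknown = JSON.parse(line);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = parsed as Record<string, unknown>;
      if (typeof candidate.op === "string" && typeof candidate.path === "string") {
        return { op: candidate.op, path: candidate.path, value: candidate.value };
      }
    }
  } catch {
    // Malformed JSON falls through and is treated as plain text.
  }
  return null;
}

function createSketchParser(callbacks: SketchParserCallbacks) {
  let buffer = "";

  const handleLine = (line: string): void => {
    const patch = asPatch(line);
    if (patch) callbacks.onPatch(patch);
    else callbacks.onText(line);
  };

  return {
    // Feed one streamed chunk; only newline-terminated lines are processed.
    push(chunk: string): void {
      buffer += chunk;
      let newlineIndex = buffer.indexOf("\n");
      while (newlineIndex !== -1) {
        handleLine(buffer.slice(0, newlineIndex));
        buffer = buffer.slice(newlineIndex + 1);
        newlineIndex = buffer.indexOf("\n");
      }
    },
    // Emit whatever is left in the buffer at end of stream.
    flush(): void {
      if (buffer.length > 0) {
        handleLine(buffer);
        buffer = "";
      }
    },
  };
}
```

In this sketch a patch line split across two chunks stays in the buffer until its newline arrives, so the callbacks never see a partial line.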
Why interleaving instead of separate events?
LLMs stream as a single stream of text tokens. Forcing them to emit “now I’m in widget mode” / “now I’m in text mode” structured events would require a tool-call-only widget API. The current design lets the LLM compose widgets inline with explanation: the agent decides when to interleave. Trade-off: the parser must be robust to partial lines and malformed JSON. If a chunk arrives mid-line, the parser buffers until the next chunk completes the line. If JSON.parse throws, the line is treated as text. ✓ VERIFIED at spec-stream.ts.
Text vs thinking vs tool
Three distinct event types feed three distinct render paths: text, thinking, and tool.
text
- SidecarEvent: text (camelCase)
- XState event: CHUNK
- Renderer: chat bubble streaming text. Markdown rendered as it arrives (via react-markdown).
- Goes through processSpecStream to strip widget patch lines first.
Token streaming throughput
The streaming bandwidth is dominated by text events. A single 1k-token response can emit hundreds of text events as the model streams tokens. Two protections against overload:
- rAF throttling in useStreamingSession (L303-321): at most one React render per animation frame.
- String concatenation in XState reducer: text accumulates in machine context and is not stored per-event. The activity stream contains one text entry per “text run” (a run breaks when a non-text event interrupts).
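Both protections can be sketched in a few lines. This is an illustration under assumptions, not the code in useStreamingSession or the machine: SketchEntry, appendText, and createFrameThrottle are hypothetical names, and the scheduler is passed in so the sketch runs outside a browser (in the UI it would wrap requestAnimationFrame).

```typescript
// Hypothetical activity-entry shape, for illustration only.
type SketchEntry =
  | { kind: "text"; text: string }
  | { kind: "other"; label: string };

// Append text to the current run, or start a new run when the
// last entry is not text (a non-text event broke the run).
function appendText(entries: SketchEntry[], chunk: string): SketchEntry[] {
  const last = entries[entries.length - 1];
  if (last !== undefined && last.kind === "text") {
    return [...entries.slice(0, -1), { kind: "text", text: last.text + chunk }];
  }
  return [...entries, { kind: "text", text: chunk }];
}

// Coalesce many notifications into at most one render per scheduled frame.
function createFrameThrottle(
  schedule: (cb: () => void) => void,
  render: () => void,
): () => void {
  let scheduled = false;
  return () => {
    if (scheduled) return; // a frame is already pending
    scheduled = true;
    schedule(() => {
      scheduled = false;
      render();
    });
  };
}
```

In a browser the throttle could be wired as createFrameThrottle((cb) => requestAnimationFrame(cb), render), so hundreds of text events between two frames trigger a single render.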
Per-API-turn token accounting
usage events carry token counts: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens. The sidecar emits a usageUpdate SidecarEvent on:
- Session init — initial context tokens (system + tools + history).
- Per-turn end — output tokens from the just-completed turn.
- Final result — total session token spend.
proxy-* for Claude, cron-pi-* for Pi).
On the frontend, usageUpdate updates a tokens field in XState context. The UI shows the current context % in the status bar, which is useful before triggering compaction.
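As a rough illustration of how a context percentage could be derived from these counts, consider the sketch below. The four field names come from the usage events described above; everything else is an assumption: SketchUsage, contextTokens, and contextPercent are hypothetical names, the window size is a caller-supplied parameter, and counting all three input fields toward context is a guess about how the status bar computes its value.

```typescript
// Field names follow the usage events; the interface name is hypothetical.
interface SketchUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

// Assumption: context size is everything sent in, cached or not.
function contextTokens(usage: SketchUsage): number {
  return (
    usage.input_tokens +
    usage.cache_creation_input_tokens +
    usage.cache_read_input_tokens
  );
}

// Percentage of the window in use, clamped to 100.
function contextPercent(usage: SketchUsage, windowSize: number): number {
  if (windowSize <= 0) return 0;
  return Math.min(100, Math.round((contextTokens(usage) / windowSize) * 100));
}
```

For example, 1,000 uncached input tokens plus 2,000 cache-creation and 17,000 cache-read tokens against a 200,000-token window gives 10%.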
Inline widget streaming → chat persistence
Two stores cooperate during streaming:
- canvasStore.streamingSpec: used by WidgetCanvas to render the live-updating widget on the canvas.
- inlineWidgetStreamStore.streamingSpec: used by InlineWidget (in the chat bubble) to render the live-updating inline preview.
When the streaming state exits:
- flushSpecStream finalizes the widget spec.
- finalizeInlineWidgetStream commits the spec to the inline widget for that message.
- On COMPLETE specifically, persistInlineWidget writes to backend storage so the widget survives session reload.
When elementCount > 10, the widget also auto-opens the canvas. Widgets with ≤10 elements stay inline only. (See Frontend → Canvas for the threshold and its silent-failure mode.)
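The exit-time decisions can be summarized as a small planning function. This is a sketch under assumptions: planStreamExit and the "openCanvas" label are hypothetical, the step names are plain strings rather than calls into the real actions, and whether the canvas auto-open also applies on cancellation is not stated in the text (the sketch applies it in both cases).

```typescript
// From the text: more than 10 elements also opens the canvas.
const CANVAS_AUTO_OPEN_THRESHOLD = 10;

type SketchExitReason = "COMPLETE" | "CANCEL";

// Returns the ordered steps the exit path would take, as labels.
function planStreamExit(reason: SketchExitReason, elementCount: number): string[] {
  const steps = ["flushSpecStream", "finalizeInlineWidgetStream"];
  // Persistence happens only on COMPLETE.
  if (reason === "COMPLETE") steps.push("persistInlineWidget");
  // "openCanvas" is a hypothetical label for the auto-open behavior.
  if (elementCount > CANVAS_AUTO_OPEN_THRESHOLD) steps.push("openCanvas");
  return steps;
}
```

Note that the comparison is strict, so a widget with exactly 10 elements stays inline.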
Cancellation mid-stream
When the user clicks the cancel button:
- Frontend dispatches CANCEL to the XState machine.
- The machine transitions to cancelled state. The streaming state’s exit runs flushSpecStream + finalizeInlineWidgetStream (any in-flight widget gets finalized at its current state).
- Tauri IPC sends cancel_query(requestId) to the sidecar.
- Sidecar calls gracefulKill() (Claude path) or abortController.abort() (Pi path); see Background → Cancellation contract.
- Sidecar emits a cancelled SidecarEvent confirming the abort.
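The sidecar side of the Pi path can be approximated with the standard AbortController API. The sketch below is hypothetical throughout: the registry, startQuery, cancelQuery, and the SketchSidecarEvent shape are illustrative names, not the sidecar's actual code, and the Claude path's gracefulKill() is not modeled.

```typescript
// Hypothetical registry of in-flight requests, keyed by requestId.
const inFlightControllers = new Map<string, AbortController>();

// Hypothetical event shape; the real contract is the Rust SidecarEvent enum.
type SketchSidecarEvent =
  | { type: "text"; text: string }
  | { type: "cancelled"; requestId: string };

// Register a request and hand its signal to the streaming work.
function startQuery(requestId: string): AbortSignal {
  const controller = new AbortController();
  inFlightControllers.set(requestId, controller);
  return controller.signal;
}

// Handler for cancel_query(requestId): abort, then confirm with a cancelled event.
function cancelQuery(
  requestId: string,
  emit: (event: SketchSidecarEvent) => void,
): boolean {
  const controller = inFlightControllers.get(requestId);
  if (!controller) return false; // unknown or already finished
  controller.abort();
  inFlightControllers.delete(requestId);
  emit({ type: "cancelled", requestId });
  return true;
}
```

Emitting the cancelled event only after abort() returns mirrors the ordering in the steps above, where the event confirms the abort rather than announcing it.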
Open questions
- Concurrent widget streams: if two tool calls each emit spec patches in the same turn (interleaved across text events), can the parser route patches to the right widget? Or do patches assume one widget per turn?
- Patch ordering across chunks: if a patch line is split across two chunks, the buffer ensures it reassembles. But if patches arrive faster than applySpecPatch runs (synchronous), can they get applied out of order? Verified to work but worth checking under load.
Key files
src-tauri/sidecar/query/message-processor.mjs
Per-API-turn span construction (L733). Builds text, thinking, toolStart, toolComplete, usageUpdate SidecarEvents from Claude SDK messages or Pi events.
src/machines/streaming/actions/spec-stream.ts
createMixedStreamParser, initSpecStreamParser (L31-50), processSpecStream (L88-130), flushSpecStream, applySpecPatch.
src/machines/streaming/machine.ts
The XState v5 streamingMachine. Spec-stream init at L379. CHUNK handler at L465. Exit hooks at L455. Complete handler at L498-509.
src/hooks/useStreamingSession.ts
rAF throttling at L303-321. SessionManager + actor subscription.
src/stores/canvasStore.ts and src/stores/inlineWidgetStreamStore.ts
The two destination stores for spec patches during streaming.
src-tauri/src/sidecar/events.rs
The Rust SidecarEvent enum, which is the IPC contract.
Next
Event Layer
For the full SidecarEvent variant list and the Rust serde contract.
Streaming Machine
XState v5 state graph, transitions, activity-stream construction.