This page is the contract between this guide and reality. Every gap below was checked against the Orion source on 2026-05-17. The gaps are filed as GitHub issues on the Orion repo.

How to read this page

Marker | Meaning
✗ KNOWN GAP | Confirmed divergence between behavior and what a reasonable engineer might expect. Filed as an issue.
✗ STRUCTURAL GAP | Architectural absence (e.g. a hook layer that doesn’t exist for some path). Filed.
✗ BY DESIGN | Intentional trade-off; documented so users understand the boundary.
✗ UNCERTAIN | Audit didn’t reach far enough to verify. Investigation issue filed.
✓ CLOSED | Concern that was investigated and resolved; kept here for transparency.

Severity | Means
High | Active risk to user data, credentials, or system integrity.
Medium-High | Reasonable misuse path under normal operation; protections exist but are incomplete.
Medium | User expectations diverge from behavior; misuse requires unusual usage.
Low-Medium | Edge cases, doc drift, or developer-only UX papercuts.
Low | Cosmetic / advisory / unlikely to ever trip.

GAP-B1 — Pi sessions ignore mcp_tool_permissions user configuration

Severity: Medium · Status: ✗ KNOWN GAP · Page: MCP & Permissions
A user opens Settings → MCP Permissions and blocks mcp__orion-tools__delegate_to_cli, then sends a chat using a Pi model (Gemini). The agent invokes the tool, and it executes without being blocked.
pi-permission-middleware.mjs::classifyTool() at L58-78 treats all mcp__* tools as {destructive:false, isWrite:false, dangerLevel:'low'} and does NOT consult the mcp_tool_permissions DB table.
The lookupToolPermission() function in canUseTool.mjs:81-100 IS the right check, but it is only called from the Claude SDK canUseTool closure at L238-257. The Pi middleware doesn’t delegate to that closure for MCP tools.
Two options:
  1. Add a lookupToolPermission(parsed.server, parsed.tool) call inside classifyTool() for mcp__-prefixed tools, and return {destructive:true} when permission === 'block'.
  2. Make wrapWithPermission() delegate to the boundCanUseTool closure for mcp__ tools so the full createCanUseTool decision tree runs (requires threading canUseTool into middleware opts).
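
A minimal sketch of option 1, under stated assumptions. The real classifyTool() and lookupToolPermission() signatures are not reproduced here: the lookup is injected as a parameter so the example is self-contained, parseMcpToolName() is a hypothetical helper, and the dangerLevel: 'high' value for blocked tools is an assumption (the text above only specifies destructive:true).

```javascript
// Hypothetical helper: split "mcp__<server>__<tool>" into its parts.
function parseMcpToolName(name) {
  if (!name.startsWith("mcp__")) return null;
  const [server, ...rest] = name.slice("mcp__".length).split("__");
  if (!server || rest.length === 0) return null;
  return { server, tool: rest.join("__") };
}

// Sketch only: `lookupToolPermission` is passed in, and its return
// values ("block" vs anything else) are assumed.
function classifyTool(name, lookupToolPermission) {
  const parsed = parseMcpToolName(name);
  if (parsed) {
    const permission = lookupToolPermission(parsed.server, parsed.tool);
    if (permission === "block") {
      // User blocked this tool in Settings: surface it as destructive.
      return { destructive: true, isWrite: false, dangerLevel: "high" };
    }
  }
  // Placeholder for the existing classification of everything else.
  return { destructive: false, isWrite: false, dangerLevel: "low" };
}
```

With a lookup that reports 'block' for delegate_to_cli, classifyTool('mcp__orion-tools__delegate_to_cli', lookup) comes back destructive, which gives the middleware something to act on.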

GAP-B2 / GAP-D1 — CCW external CLI spawns have no OS sandbox

Severity: Medium-High · Status: ✗ KNOWN GAP · Pages: MCP & Permissions, Background
An agent invokes mcp__orion-tools__execute_cli_inprocess with tool:'codex', mode:'write', dir:'/Users/sid'. The Codex CLI spawns with cwd=/Users/sid and full write access to the user’s home directory. There is no sandbox-exec, no bwrap, and no path policy.
ccw/cli/executor.mjs:execute() calls spawn(commandToSpawn, argsToSpawn, { cwd: workingDir, env: spawnEnv }) with no sandbox wrapper. The Pi sandbox’s @carderne/sandbox-runtime only applies to Pi’s own Bash tool — not to spawned subprocesses.
Protected today:
  • buildSpawnEnv() strips 28 Orion-internal env vars (credentials, OAuth, DB paths), see executor.mjs:128-175
  • 10-minute timeout
  • Caller-supplied cwd boundary
Not protected:
  • Filesystem write scope (spawned CLI can write to any path accessible to sidecar)
  • Network scope (CLI uses its own auth tokens for arbitrary calls)
  • mode: 'analysis' vs 'write' enforcement (Orion passes as flag, CLI may ignore)
Wrap the spawn() call with wrapCommand() from @carderne/sandbox-runtime (same primitive Pi’s Bash sandbox uses). Mode-aware policy:
  • mode: 'analysis' → all writes denied
  • mode: 'write' → vault root + caller-supplied dir only
Requires platform detection (sandbox-runtime unavailable on Windows; fallback would be caller-supplied dir restriction only).
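
The mode-aware policy above could be computed before the spawn, roughly as follows. This is a sketch under assumptions: buildSpawnPolicy() and the shape of the object it returns are hypothetical, and the result would still have to be translated into whatever options wrapCommand() from @carderne/sandbox-runtime actually accepts, which this example does not attempt.

```javascript
// Hypothetical policy builder; the returned shape is illustrative only.
function buildSpawnPolicy({ mode, vaultRoot, dir, platform = process.platform }) {
  if (mode !== "analysis" && mode !== "write") {
    throw new Error(`unknown mode: ${mode}`);
  }
  if (platform === "win32") {
    // No sandbox runtime on Windows: fall back to the cwd restriction only.
    return { sandboxed: false, cwdOnly: dir };
  }
  return {
    sandboxed: true,
    // analysis: all writes denied; write: vault root + caller-supplied dir.
    allowWrite: mode === "write" ? [vaultRoot, dir] : [],
  };
}
```

Keeping the decision in a pure function makes the analysis/write split testable without spawning a real CLI.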

GAP-B3 — MCP-bridged file tools bypass denyWrite floor

Severity: Medium · Status: ✗ STRUCTURAL GAP · Page: MCP & Permissions
A Pi agent invokes mcp__orion-tools__write_file with path targeting .env (which is in the Pi sandbox denyWrite floor). The path-traversal check in the handler passes (.env isn’t a traversal attempt). The denyWrite floor is never consulted because MCP-bridged tools bypass the Pi sandbox fs-wrappers.
MCP-bridged tools call Zod handlers directly via mcp-bridge.mjs:134-146, bypassing pi-sandbox/fs-wrappers.mjs::assertSandboxPath(). Only validatePath() from shell-utils.mjs runs — and that checks control characters and path traversal but NOT the denyWrite pattern list.
Medium. The mcp__orion-tools__* tool handlers are Orion-controlled trusted code — they don’t currently abuse this. But the absence of the denyWrite enforcement layer means a future tool addition (or a third-party MCP server registered into the same namespace) could inadvertently allow credential file writes without triggering the hard-block.
Either:
  1. Wrap mcp__orion-tools__edit_file / write_file handlers with explicit assertSandboxPath() calls.
  2. Extend bridgeMcpTools() in mcp-bridge.mjs to inject sandbox checks for any tool with file-path inputs (detect via Zod schema introspection).
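
Option 1 amounts to a thin wrapper around each file-writing handler. The sketch below is illustrative: assertNotDenied() is a deliberately simplified stand-in for assertSandboxPath() that only checks a few basename patterns from the floor, withDenyWriteCheck() is a hypothetical name, and the assumption that the handler receives its target in args.path is not confirmed by the source.

```javascript
// Simplified stand-in for assertSandboxPath(): basename checks only.
function assertNotDenied(filePath) {
  const base = filePath.split(/[\\/]/).pop();
  const denied =
    base === ".env" ||
    base.startsWith(".env.") ||
    base.endsWith(".pem") ||
    base.endsWith(".key");
  if (denied) throw new Error(`denyWrite: ${filePath}`);
}

// Hypothetical wrapper: run the deny check before the real handler.
function withDenyWriteCheck(handler, assertPath = assertNotDenied) {
  return async (args) => {
    assertPath(args.path); // assumes the tool input carries `path`
    return handler(args);
  };
}
```

A wrapped write_file handler rejects a write to project/.env before the original handler ever runs, while ordinary paths pass through unchanged.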

GAP-D2 — Plain cron jobs bypass AI-WITH-YOU gate

Severity: Medium · Status: ✗ KNOWN GAP (by design but should be documented) · Page: Background
A user creates a cron job with payload.message: "Send my weekly digest to Slack" and schedule 0 9 * * MON. The job fires Monday at 9am, the LLM executes the message, and the Slack MCP tool sends the digest. The user gets no acceptance step.
The 3-layer AI-WITH-YOU gate (DDL CHECK + runtime JS branch + Rust command guard) only fires for payload.type === 'autopilot' cron jobs, which are short-circuited into autopilot/runner.mjs at executor.mjs:260.
Plain payload.message cron jobs (the most common kind users create directly) skip this branch and run end-to-end on cadence.
  • Tools called by cron jobs still go through canUseTool (Claude path) or pi-permission-middleware (Pi path).
  • In bypassPermissions cron mode, the middleware skips wrapping — meaning tools run without prompts.
  • Cron jobs are visible in the UI; user can see what fired and disable.
Either:
  1. Introduce cron_jobs.acceptance_mode column mirroring autopilot’s pattern, defaulting to 'suggest' for user-created cron jobs (auto for system jobs like heartbeat).
  2. Document this prominently as expected behavior (cron = scheduled execution, not gated approval) — and ensure the cron creation UI surfaces this clearly.
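
For option 1, the runtime branch might look something like the sketch below. The acceptance_mode values 'suggest' and 'auto' come from the proposal above; the created_by field used to tell system jobs from user jobs is a hypothetical column, as are both function names.

```javascript
// Hypothetical: derive the effective acceptance mode for a cron job row.
function resolveAcceptanceMode(job) {
  if (job.acceptance_mode) return job.acceptance_mode;
  // Assumed default: system jobs (e.g. heartbeat) run unattended,
  // user-created jobs wait for acceptance.
  return job.created_by === "system" ? "auto" : "suggest";
}

// Only 'auto' jobs execute end-to-end on cadence.
function shouldExecuteNow(job) {
  return resolveAcceptanceMode(job) === "auto";
}
```

Under this sketch the weekly Slack digest job would be held for acceptance unless the user explicitly switched it to 'auto'.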

GAP-A1 — Frozen system prompt on reused Pi sessions

Severity: Low-Medium · Status: ✗ KNOWN GAP (perf vs UX trade-off) · Page: Sessions
User edits an agent’s config.yaml mid-conversation, or installs a new skill. They expect the change to take effect on the next message — but it doesn’t, until the session is restarted.
Pi AgentSession cache hit at handler.mjs:2191-2234 reuses the existing session — skipping the first-turn setup at pi-session.mjs:294-591. System prompt, MCP tool list, and skill discovery results are baked in at session creation time.
Caching gives large perf wins on follow-up turns (it skips roughly hundreds of ms of setup). Hot-reloading would re-run setup every turn or require explicit cache invalidation hooks. This is a deliberate trade-off, but because it is undocumented it surprises developers.
Documented under Sessions → Pi → Frozen system prompt warning. Restarting the session (close + reopen the chat) picks up changes.
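
The freezing behavior falls out of a plain get-or-create cache. The sketch below is a simplified model, not the code in handler.mjs: the Map, getSession(), and restartSession() are all hypothetical names, and the real setup does far more than the createSession callback suggests.

```javascript
// Simplified model of a session cache keyed by conversation id.
const sessions = new Map();

function getSession(conversationId, createSession) {
  const cached = sessions.get(conversationId);
  // Cache hit: first-turn setup (system prompt, MCP tool list, skill
  // discovery) is NOT re-run, so config edits are not picked up.
  if (cached) return cached;
  const session = createSession();
  sessions.set(conversationId, session);
  return session;
}

// Closing and reopening the chat corresponds to dropping the entry.
function restartSession(conversationId) {
  sessions.delete(conversationId);
}
```

Any hot-reload design would have to either call something like restartSession() on config change or compare a config fingerprint on each cache hit.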

GAP-C1 — ProvenanceCaption gate is advisory, not enforced

Severity: Low · Status: ✗ KNOWN GAP · Page: Frontend
An agent emits a SlideDeck, MetricCard, or Chart widget spec without a ProvenanceCaption child. The widget renders without error. The user sees AI-generated content with no source citation.
The catalog convention (widget-catalog.ts:173-186) describes the gate in the component’s description string, which feeds the LLM system prompt via catalog.prompt(). There is no runtime validator that rejects widget specs lacking provenance.
Enforcement is convention-based, and convention breaks under model variance, Pi tool calls (which may not see the same prompt), and YAML specs from external sources.
Add a validateProvenance() step to the 7-step widget fix pipeline in useCanvasToolInterceptor.ts. Reject (or warn-and-flag) widgets in presentationComponents and dataComponents groups that lack ProvenanceCaption in their tree.
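
A validateProvenance() step could be a short tree walk. The sketch below assumes a widget spec is a tree of { type, children } nodes, which is a guess at the shape rather than the actual spec format, and it hard-codes the three component names mentioned above instead of reading the presentationComponents and dataComponents groups from the catalog.

```javascript
// Assumed subset of components that need a ProvenanceCaption.
const REQUIRES_PROVENANCE = new Set(["SlideDeck", "MetricCard", "Chart"]);

// True if this node or any descendant is a ProvenanceCaption.
function hasProvenance(node) {
  if (!node || typeof node !== "object") return false;
  if (node.type === "ProvenanceCaption") return true;
  return (node.children || []).some(hasProvenance);
}

// Walk the spec and collect gated components lacking provenance.
function validateProvenance(spec) {
  const missing = [];
  const walk = (node) => {
    if (!node || typeof node !== "object") return;
    if (REQUIRES_PROVENANCE.has(node.type) && !hasProvenance(node)) {
      missing.push(node.type);
    }
    (node.children || []).forEach(walk);
  };
  walk(spec);
  return { ok: missing.length === 0, missing };
}
```

Returning the list of offending component types lets the pipeline choose between rejecting the spec and the warn-and-flag variant.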

GAP-D4 — Claude CLI deliberately inherits user’s ~/.claude/

Severity: Low-Medium · Status: ✗ BY DESIGN · Page: Background
ccw/cli/executor.mjs:645-649 explicitly strips CLAUDE_CONFIG_DIR from the spawned Claude CLI’s env. The CLI falls back to ~/.claude/ (user’s own Claude Code config), not Orion’s runtime config.
  • User’s personal skills, hooks, settings.json, MCP servers → loaded into spawned Claude CLI.
  • Spawned CLI uses user’s OAuth tokens, not Orion’s Cloudflare-Worker-proxied tokens.
  • Billing flows through user’s Anthropic account.
This is opposite to the rest of Orion’s isolation architecture — but intentional. Orion’s CLAUDE_CONFIG_DIR=prompts/ keeps user config out of Orion’s runtime, but when Orion delegates to Claude CLI, the user expects the CLI to behave like their Claude CLI. Cross-billing wouldn’t make sense.
“Delegate to Claude CLI” is fundamentally different from “Orion’s Claude SDK” in trust posture. Users should know which they’re invoking. This guide flags it explicitly.

GAP-A8 — SubagentJsonlWriter rotation policy unverified

Severity: Low · Status: ✗ UNCERTAIN (investigation issue filed) · Page: Subagents
JSONL files at <vault>/.orion/subagent-events/<rootConversationId>/events.jsonl could grow unbounded if rotation isn’t wired. Long-running root conversations with heavy subagent use would accumulate large logs.
Audit confirmed SubagentJsonlWriter exists and is wired into Pi subagent dispatch, but didn’t verify rotation policy (size-based, time-based, or absent). Investigation issue filed.
Add size-based rotation (e.g., 10 MB cap with .1, .2 suffixes) to avoid unbounded growth.
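
One way to express the suggested rotation is as a pure planning step that decides which renames to perform before an append. This is a sketch, not SubagentJsonlWriter's behavior: planRotation() is a hypothetical function, the 10 MB cap and two retained files mirror the example above, and the caller is assumed to skip renames whose source file does not exist.

```javascript
// Hypothetical planner: returns the renames to apply, oldest first.
function planRotation(file, currentBytes, incomingBytes,
                      maxBytes = 10 * 1024 * 1024, keep = 2) {
  // Still under the cap after this write: nothing to rotate.
  if (currentBytes + incomingBytes <= maxBytes) return [];
  const renames = [];
  // Shift existing suffixes up (.1 -> .2); the oldest is overwritten.
  for (let i = keep - 1; i >= 1; i--) {
    renames.push({ from: `${file}.${i}`, to: `${file}.${i + 1}` });
  }
  // The live file becomes .1 and a fresh file is started.
  renames.push({ from: file, to: `${file}.1` });
  return renames;
}
```

With keep = 2, total disk use per root conversation stays bounded at roughly three times the cap (live file plus two rotated files).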

GAP-A7 — Top-level Pi timeout for interactive sessions

Severity: Low-Medium · Status: ✗ UNCERTAIN (investigation issue filed) · Page: Sessions
Cron jobs have PI_TIMEOUT_BY_EFFORT (cron/executor.mjs:56-61). It’s unclear whether interactive Pi sessions have an equivalent top-level timeout. If absent, an LLM hang could keep the sidecar busy indefinitely.
Audit did not reach pi-session.mjs / pi-query-lifecycle.mjs timeout wiring in enough depth. Investigation issue filed.
Add timeout wrapper similar to PI_TIMEOUT_BY_EFFORT parameterized for interactive (longer ceiling than cron, e.g., 10 min).
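
A generic timeout wrapper for this purpose might look like the sketch below. It does not reflect how PI_TIMEOUT_BY_EFFORT is wired in cron/executor.mjs; withTimeout() and INTERACTIVE_TIMEOUT_MS are hypothetical names, and the work function is assumed to accept an AbortSignal so the underlying query can be cancelled.

```javascript
// Example ceiling from the suggestion above (10 minutes).
const INTERACTIVE_TIMEOUT_MS = 10 * 60 * 1000;

// Run `run(signal)` and reject if it does not settle within `ms`.
function withTimeout(run, ms = INTERACTIVE_TIMEOUT_MS) {
  const controller = new AbortController();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // let the work observe the cancellation
      reject(new Error(`Pi query timed out after ${ms} ms`));
    }, ms);
  });
  // Wrapping in a Promise also captures synchronous throws from run().
  const work = new Promise((resolve) => resolve(run(controller.signal)));
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Clearing the timer in finally matters in a long-lived sidecar: otherwise every completed turn would leave a pending timer behind until the ceiling elapsed.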

GAP-D3 — Boot-time orphan reconciliation for cron_runs.status='running'

Severity: Unknown (potentially Medium) · Status: ✗ UNCERTAIN (investigation issue filed) · Page: Background
When the sidecar crashes mid-cron-execution, cron_runs rows with status='running' are left in the DB. On next sidecar start, those rows should be reconciled (e.g., marked failed or interrupted). If reconciliation is absent, the UI shows ghost-running jobs.
Audit confirmed cancellation contract is correct but didn’t verify boot-time reconciliation in cron-service.mjs lines 200-1800. Investigation issue filed.
On startup: UPDATE cron_runs SET status='interrupted', error='sidecar restart' WHERE status='running' AND started_at < ?.
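
The effect of that statement can be modeled on in-memory rows, which makes the role of the started_at < ? guard explicit. This is a sketch of the semantics only: reconcileOrphans() is a hypothetical function, the bound parameter is assumed to be the sidecar boot time, and started_at is assumed to be a comparable timestamp (epoch milliseconds here).

```javascript
// Model of the UPDATE above, applied to plain row objects.
function reconcileOrphans(rows, bootTime) {
  return rows.map((row) =>
    row.status === "running" && row.started_at < bootTime
      ? { ...row, status: "interrupted", error: "sidecar restart" }
      : row
  );
}
```

Rows that started after the boot time are left alone, so a job launched during startup is not mislabeled as interrupted.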

Other gaps (compact)

Widgets with ≤10 elements render inline only. If InlineWidget isn’t mounted in current chat view, widget is invisible. No fallback notification. Severity: Low.
sheet_operation mutations set isDirty:true but don’t trigger auto-save (unlike TipTap’s 2s debounce). User must save manually. App close while dirty → changes lost. Severity: Low-Medium.
processedRef.has(toolItem.id) — if Pi’s toolComplete emits null id, canvas tools from Pi sessions could silently fail. ? INFERRED, needs verification. Severity: Low.
If createYamlStreamCompiler().flush() throws and no JSON fallback exists, the warning is logged but the user sees a blank canvas with no error message. Severity: Low.
Extension-less files (Dockerfile, Makefile, LICENSE) routed via canvasStore.openFile get extension='' and fall to fallback mode instead of correct mode. Inconsistent with file-router.ts::getCanvasModeForFile(). Severity: Low.
Claude SDK doesn’t have equivalent. Hypothetical Claude empty-response wouldn’t trigger heartbeat fallback. Low likelihood given Claude’s response model. Severity: Low.
Claude SDK sets it automatically; Pi sets it manually inside pi-subagent-runner.mjs. Equivalent in practice but Pi codepath could regress. Defensible via regression tests. Severity: Low.
Codebase uses both interchangeably in casual reading. They are NOT the same. See Glossary. Severity: Documentation.
Minor doc drift in model-router OpenRouter prefix and subagent-runner JSDoc. Needs reconciliation. Severity: Low.

Closed gaps (what IS protected)

policy-loader.mjs:82-127: .env, .env.*, *.pem, *.key, **/.ssh/**, **/.aws/**, **/.gnupg/**, **/orion.db, **/com.orion.butler/**, **/prompts/projects/**, **/.orion/mcp-oauth/**. Users CANNOT remove these built-ins. ✓ VERIFIED.
Persistent tokens at {vault}/.orion/mcp-oauth/<server>/tokens.json survive oauthSingleton.shutdown() — users don’t re-authorize on every Orion launch. Cleared only on vault switch via reset({newVaultRoot}). ✓ VERIFIED.
buildSpawnEnv() strips 28 keys including ANTHROPIC_, LANGFUSE_, ORION_*, FIREBASE_UID, GEMINI_API_KEY, ORION_DB_PATH. Spawned CLIs cannot inherit Orion credentials. ✓ VERIFIED.
BAKED_IN_TRUSTED_SERVERS + internalServers Set + canUseTool L222-234. Orion’s own MCP tools don’t prompt the user — by design. ✓ VERIFIED.
Fork/merge operations update both session_index and conversations inside a single rusqlite transaction with RAII rollback on filesystem-copy failure. ✓ VERIFIED in session_lineage.rs.
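
The env-stripping item above can be pictured as a filter over the parent environment. The sketch below is illustrative and is not buildSpawnEnv(): stripOrionEnv() is a hypothetical name, only the names mentioned on this page are listed (not the full set of 28), and treating ANTHROPIC_, LANGFUSE_, and ORION_ as prefixes is an assumption about how the real list is matched.

```javascript
// Names taken from this page; the real list has 28 entries.
const STRIP_EXACT = new Set(["FIREBASE_UID", "GEMINI_API_KEY"]);
const STRIP_PREFIXES = ["ANTHROPIC_", "LANGFUSE_", "ORION_"]; // assumed prefix match

// Return a copy of the environment without Orion-internal variables.
function stripOrionEnv(parentEnv) {
  const env = {};
  for (const [key, value] of Object.entries(parentEnv)) {
    if (STRIP_EXACT.has(key)) continue;
    if (STRIP_PREFIXES.some((prefix) => key.startsWith(prefix))) continue;
    env[key] = value;
  }
  return env;
}
```

Building a fresh object, rather than deleting keys from process.env, keeps the sidecar's own environment intact while the spawned CLI sees only the filtered copy.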

Filed issues

The seven gaps below are filed as GitHub issues on sidart10/orion-butler (2026-05-17), under the labels safety-gap, architecture-audit, and from-explainer-v3.
# | Issue | Gap | Severity
#77 | Pi sessions ignore mcp_tool_permissions user configuration | GAP-B1 | Medium
#78 | CCW external CLI spawns have no OS sandbox | GAP-B2 / GAP-D1 | Medium-High
#79 | MCP-bridged file tools bypass Pi sandbox denyWrite floor | GAP-B3 | Medium
#80 | Plain cron jobs bypass the AI-WITH-YOU acceptance gate | GAP-D2 | Medium
#81 | Investigate: boot-time orphan reconciliation for cron_runs | GAP-D3 | Unknown
#82 | Investigate: top-level Pi timeout for interactive sessions | GAP-A7 | Low-Medium
#83 | Investigate: SubagentJsonlWriter rotation policy | GAP-A8 | Low