Memory - Orion Architecture

“Memory” in Orion is overloaded — five architecturally distinct surfaces serve different purposes. Reading this page top-to-bottom is the only way to keep them straight.

The five surfaces

#	Surface	Storage	Written by	Read by	Engine	Lifetime
1	SDK auto-memory	`{projectDir}/.orion/memory/MEMORY.md` + topic files	Claude Agent SDK (autonomous)	SDK on subsequent turns	Claude SDK only	Durable, project-scoped
2	Agent seed memory	`{vault}/agents/{slug}/MEMORY.md` + `SOUL.md`	Human (developer / user)	Loader at top-level spawn	Both engines (with D9 filter for subagents)	Durable, agent-scoped
3	Sidecar knowledge base	SQLite `memory_chunks` + `memory_vec` (384-dim)	Rust indexer + cron pipelines	`knowledge_search` MCP tool	Both	Durable, vault-scoped
4	Project / CLI instructions	`{project}/CLAUDE.md`, `{vault}/AGENTS.md`, `GEMINI.md`, `QWEN.md`	Human	SDK via `settingSources`, CLI walk-up	Varies	Durable, project- or vault-scoped
5	In-conversation history	Claude: `prompts/projects/{convId}/`. Pi: in-memory only.	Engine	Engine (continue / resume)	Both (with persistence gap on Pi)	Session-persistent on Claude; in-memory on Pi

Surface 1 — SDK auto-memory

What it is: a memory system built into the Claude Agent SDK that autonomously decides what’s worth keeping from a session and writes it to MEMORY.md (plus topic-specific markdown files) in a configured directory. Orion does not control the writes — the SDK does. Configured at handler.mjs:2714:

autoMemoryEnabled: true,
autoMemoryDirectory: userProjectDir
  ? join(userProjectDir, ".orion", "memory")
  : join(process.env.ORION_VAULT_ROOT || orionHome, ".orion", "memory")

Project sessions: writes under {projectDir}/.orion/memory/
Ad-hoc sessions: writes under {vaultRoot}/.orion/memory/

The hardcoded SDK loader cap. The SDK’s internal cli.js (minified) names the loader variables o2 = "MEMORY.md" and uj = 200 — a hardcoded 200-line truncation cap on the loaded MEMORY.md. This is a documented spike finding in the project memory; it cannot be configured from Orion. If your MEMORY.md grows past 200 lines, the SDK silently drops the rest. ⚠️ Unverifiable from source without deobfuscation; treat as confirmed-by-spike, not source-cited.
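Because the truncation is silent, a pre-flight check on the file can be useful. The sketch below is a hypothetical helper, not part of Orion or the SDK. It assumes the cap is exactly 200 lines, as the spike reports, and takes the file contents as a string so the caller decides how to read the file.

```javascript
// Assumed from the spike finding above; not a documented SDK constant.
const ASSUMED_SDK_LINE_CAP = 200;

// Hypothetical helper: reports how many lines of a MEMORY.md body would be
// dropped if the loader keeps only the first `cap` lines.
function memoryCapReport(text, cap = ASSUMED_SDK_LINE_CAP) {
  // Ignore a single trailing newline so "a\nb\n" counts as two lines.
  const body = text.endsWith("\n") ? text.slice(0, -1) : text;
  const totalLines = body === "" ? 0 : body.split("\n").length;
  const droppedLines = Math.max(0, totalLines - cap);
  return { totalLines, droppedLines, overCap: droppedLines > 0 };
}

// Example: a 205-line file would lose its last 5 lines under the assumed cap.
const sampleMemory = Array.from({ length: 205 }, (_, i) => `- note ${i + 1}`).join("\n");
console.log(memoryCapReport(sampleMemory));
```

A check like this could run in CI or a pre-commit hook against `.orion/memory/MEMORY.md`; where to wire it in is left open.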

Pi has no equivalent of this surface. pi-session.mjs has no autoMemoryEnabled or autoMemoryDirectory config: grep finds zero references to MEMORY or autoMemory in the Pi session file. Pi sessions use SettingsManager.inMemory() (pi-session.mjs:519), which is short-lived, has no disk persistence, and does no cross-turn accumulation. The isolation is intentional: Pi sessions also set noPromptTemplates: true to keep user-global Claude Code and Pi-personal configs from leaking into the Orion runtime.

Surface 2 — Agent seed memory (MEMORY.md + SOUL.md)

What it is: human-authored static files bundled with each agent folder. These never change unless a human edits them — no cron job, no LLM, no autonomous process writes to them. Layout (per the folder-per-agent architecture):

{layer}/agents/{slug}/
  AGENTS.md       # required — system prompt body
  config.yaml     # required — typed config
  SOUL.md         # optional — persona / voice (top-level only)
  MEMORY.md       # optional — static seed memory v1 (top-level only)

Two layers exist:

Layer	Path	Mutability
Built-in	`prompts/agents/{slug}/`	Read-only at runtime; updated by app upgrades
User vault	`{vault}/agents/{slug}/`	User-editable; travels with vault sync

Vault overrides bundle entirely (no merge). When a slug exists in both layers, the vault entry replaces the built-in. Resolution is per-query, no caching (loader.mjs:30: “D10: per-query rescan, no watcher”).
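The "no merge" rule is easy to get wrong, so here is a minimal sketch of the resolution order. It uses plain objects as hypothetical stand-ins for the two layers (the real loader rescans directories on every query), and the function name `resolveAgentSketch` is an illustration, not the loader's API.

```javascript
// Hypothetical in-memory stand-ins for the two layers: slug -> { filename: contents }.
function resolveAgentSketch(slug, builtInLayer, vaultLayer) {
  const has = (layer) => Object.prototype.hasOwnProperty.call(layer, slug);
  // The vault entry wins outright when the slug exists there: no per-file merge.
  if (has(vaultLayer)) return { layer: "vault", files: vaultLayer[slug] };
  if (has(builtInLayer)) return { layer: "built-in", files: builtInLayer[slug] };
  return null; // unknown slug
}

const builtInAgents = {
  researcher: { "AGENTS.md": "built-in prompt", "SOUL.md": "built-in persona" },
};
const vaultAgents = {
  researcher: { "AGENTS.md": "user prompt" },
};

// The vault copy has no SOUL.md, and the built-in SOUL.md is NOT inherited.
const resolvedAgent = resolveAgentSketch("researcher", builtInAgents, vaultAgents);
console.log(resolvedAgent.layer, Object.keys(resolvedAgent.files));
```

The practical consequence: overriding a built-in agent in the vault means copying every file you still want, not just the one you changed.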

The D9 security invariant

Top-level chat sessions receive AGENTS + SOUL + MEMORY. Subagents (helpers spawned by another agent) receive AGENTS only — SOUL and MEMORY are stripped. The reason: when an agent runs as someone else’s helper, its persona and seed memory would pollute the calling context. The persona doesn’t apply (you’re a sub-step, not the user’s primary collaborator). The seed memory could include privileged information that doesn’t belong in another agent’s view. Three enforcement points, all verified:

#	Enforcement point	File:line	Mechanism
1	Claude SDK agents map	`claude-agents-map.mjs:98-102`	`assembleAgentBody(ws, { isSubagent: true })` for every entry → AGENTS.md only
2	Pi subagent runner	`pi-subagent-runner.mjs:931`	`getAgentDefForSubagent()` from `pi-agent-loader.mjs:464-480`
3	`agent_info` discovery tool	`pi-tools.mjs:1395-1445`	Routes through `getAgentDefForSubagent` — fixed at commit `d2a26640` (VAL-009, 2026-05-05)

The third point matters more than it looks: if a future “agent discovery” tool exposes agent metadata to a subagent’s context (so a subagent can learn about its peers), and it calls the non-filtered getAgentDef(), SOUL and MEMORY content leaks. The rule documented in Agents as first-class entities is explicit: discovery tools must use getAgentDefForSubagent. Any new code that touches this surface must be audited against the rule.
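The invariant can be illustrated with a small sketch. This is not the real assembleAgentBody or getAgentDefForSubagent; `assembleBodySketch` and the file-map shape are assumptions made for the example, and only the filtering rule (AGENTS always, SOUL and MEMORY for top-level only) comes from the text above.

```javascript
// Hypothetical sketch of the D9 filter. `files` maps filename -> contents.
function assembleBodySketch(files, { isSubagent = false } = {}) {
  const parts = [files["AGENTS.md"]]; // required for every agent
  if (!isSubagent) {
    // SOUL.md and MEMORY.md are optional and reach top-level sessions only.
    if (files["SOUL.md"]) parts.push(files["SOUL.md"]);
    if (files["MEMORY.md"]) parts.push(files["MEMORY.md"]);
  }
  return parts.join("\n\n");
}

const agentFiles = {
  "AGENTS.md": "You are the planning agent.",
  "SOUL.md": "Voice: dry, concise.",
  "MEMORY.md": "Private note: client budget is confidential.",
};

const topLevelBody = assembleBodySketch(agentFiles);
const subagentBody = assembleBodySketch(agentFiles, { isSubagent: true });
console.log(subagentBody); // AGENTS.md content only
```

Any discovery-style tool that returns agent bodies to a subagent would need to go through the filtered path, which is the point of the third enforcement row.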

Engine parity

Both engines call the same shared code path for both top-level and subagent dispatch. No engine-specific divergence on this surface.

Surface 3 — Sidecar knowledge base

What it is: a local vector + FTS5 search index over everything Orion has indexed (PARA entities, sessions, heartbeats, consolidations). The knowledge_search MCP tool exposes it.

Schema (12 tables, migration v22)

Defined in src-tauri/src/memory/schema.rs. Key tables:

Table	Purpose
`memory_chunks`	Primary storage. Text + metadata + raw embedding JSON
`memory_vec`	sqlite-vec virtual table: `vec0(embedding float[384])`
`memory_fts`	FTS5 virtual table for keyword search
`memory_embedding_cache`	Dedup cache, composite PK
`memory_files`	Tracks indexed files for delta detection (hash-based)
`memory_consolidations`	Cross-entity synthesized insights (v19)
`memory_edges`	Typed weighted edges between entities (v20)
`entity_changes`	Append-only PARA mutation log (v21)
`page_versions`	Hash-deduped CLAUDE.md version trail (v22)

memory_chunks notable columns: source ('learning' | 'heartbeat' | 'consolidation' | …), memory_type ('HEARTBEAT_INSIGHT' | 'USER_PREFERENCE' | …), summary, extracted_entities, topics, importance, confidence, embedding_source ('fastembed' | 'gemini-multimodal'), enrichment_model, enriched_at.

Embedding model

Multilingual-E5-Small, 384 dimensions. Loaded via fastembed-rs (the embeddings.rs docstring confirms EmbeddingModel::MultilingualE5Small, ~23 MB, cached at ~/.cache/fastembed/). The dimension constant is at indexer.rs:41: pub const MEMORY_VEC_DIMS: usize = 384;.

Provider chain: embeddings.rs defines an EmbeddingProvider trait with a with_providers(Vec<Box<dyn EmbeddingProvider>>) factory. LocalProvider is the default (fastembed). embeddings_cloud.rs has a GeminiMultimodalProvider for media files. Multimodal embeddings are truncated and L2-renormalized to 384 dims to coexist with text embeddings in the same memory_vec column (indexer.rs:32-41).
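The truncate-then-renormalize step is simple arithmetic, sketched here in JavaScript for illustration (the real implementation lives in Rust at indexer.rs). The function name and the 768-dimension input are assumptions; only the target of 384 dims and the L2 renormalization come from the text.

```javascript
const MEMORY_VEC_DIMS = 384; // mirrors the constant cited at indexer.rs:41

// Illustrative sketch: keep the first `dims` components, then rescale so the
// result has unit L2 norm again (truncation alone shrinks the norm).
function truncateAndRenormalize(embedding, dims = MEMORY_VEC_DIMS) {
  if (embedding.length < dims) {
    throw new RangeError(`expected at least ${dims} dims, got ${embedding.length}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return head; // a zero vector has no direction to preserve
  return head.map((x) => x / norm);
}

// Assumed 768-dim input, standing in for a larger multimodal embedding.
const wideEmbedding = Array.from({ length: 768 }, (_, i) => Math.sin(i + 1));
const storedEmbedding = truncateAndRenormalize(wideEmbedding);
console.log(storedEmbedding.length);
```

Renormalizing matters because cosine-style distance comparisons against unit-length text embeddings would otherwise be skewed by the shorter truncated vector.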

Write path

Invariant M4: Rust is the sole writer of memory_chunks. Node-side code (memory-indexer.mjs, embed-after-store.mjs) emits IPC events; Rust performs all DB writes. This prevents concurrent-writer races.

Read path — `knowledge_search`

Both engines call invoke("memory_search", {...}) through memory/invoke.mjs:20-31. Rust performs ANN over memory_vec + keyword scoring over memory_fts + metadata filtering on memory_chunks, returns top-K scored results. Parameter feature gap by engine:

Parameter	Claude SDK tool (`memory/tools.mjs:294-324`)	Pi direct tool (`pi-tools.mjs:178-220`)
`query`	✓	✓
`maxResults` (default 5)	✓	✓
`source` (filter)	✓	✗
`entityId` (filter)	✓	✗
`minImportance` (filter)	✓	✗
`topics` (relevance boost)	✓	✗
`includeConsolidations` (1.2× boost on `memory_consolidations`)	✓	✗
`includeRelated` (1-hop edge traversal via `memory_edges`, strength > 0.2, 0.2× boost)	✓	✗

Pi agents get the simplified 2-param version. The advanced filters are a Claude-only feature today. See Tools for the bridging dedup that drops mcp__orion__knowledge_search on Pi.
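Code that builds search arguments for both engines has to account for this gap. The following is a hypothetical helper, not something in pi-tools.mjs: it takes the parameter names from the table above, keeps the two that Pi accepts, and reports which filters were lost so the caller can log or compensate.

```javascript
// Parameter names come from the table above; the helper itself is hypothetical.
const PI_SUPPORTED_PARAMS = ["query", "maxResults"];

function toPiSearchArgs(args) {
  const piArgs = {};
  const dropped = [];
  for (const [key, value] of Object.entries(args)) {
    if (PI_SUPPORTED_PARAMS.includes(key)) piArgs[key] = value;
    else dropped.push(key);
  }
  // maxResults defaults to 5 per the table.
  if (piArgs.maxResults === undefined) piArgs.maxResults = 5;
  return { piArgs, dropped };
}

const piSearch = toPiSearchArgs({
  query: "q3 roadmap",
  source: "learning",
  minImportance: 0.6,
});
console.log(piSearch.piArgs, piSearch.dropped);
```

Surfacing the dropped filters explicitly is a design choice: a Pi agent silently receiving unfiltered results is harder to debug than one that knows its `source` filter was ignored.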

Surface 4 — Project / CLI instructions

What it is: markdown files that provide project-level behavioral context. Not vector-indexed — injected verbatim into the context window.

CLAUDE.md (SDK auto-load)

handler.mjs:2727 sets settingSources: ['user', 'project']. The 'project' source causes the SDK to walk up from effectiveCwd (PARA project dir or vault root) looking for CLAUDE.md. The .git/ isolation boundary at the vault root stops the walk from reaching ~/.claude/ (so the user’s personal Claude Code skills and hooks are NOT loaded into Orion runtime).

CLAUDE.md (cron explicit load)

cron/executor.mjs:1085-1185 resolveProjectCwd(projectId) reads Projects/{slug}/CLAUDE.md explicitly when a cron job has payload.sessionMode === 'project'. Contents are appended to the cron system prompt. This is a parallel path to the SDK’s native discovery — cron jobs don’t always run with the right cwd to trigger SDK auto-discovery, so this is the belt-and-braces version.

AGENTS.md / GEMINI.md / QWEN.md (external CLIs)

ccw/cli/executor.mjs:177-190:

Each CLI walks UP from cwd looking for its instruction file: Codex → AGENTS.md, Gemini → GEMINI.md, Qwen → QWEN.md.

These files are placed at the vault root. They are NOT loaded by either Claude SDK or Pi directly — they’re picked up by the external CLIs spawned via delegate_to_cli. Distinct from per-agent AGENTS.md (Surface 2).

Pi engine and CLAUDE.md (unverified)

Pi sessions use noPromptTemplates: true (pi-session.mjs:500) — this disables Pi’s native prompt-template discovery (preventing skill/hook leakage from ~/.pi/, ~/.claude/, project root). But CLAUDE.md isn’t a “prompt template” by Pi’s classification; it’s read by the SDK’s project-context system. Pi sessions use SettingsManager.inMemory() rather than the SDK session manager, so the SDK’s project-context walk-up likely does not fire for Pi. ⚠️ Unverified — should be tested. Practically: agents that rely on CLAUDE.md context may behave differently on Pi vs Claude. If you need an agent to always see project rules, use Surface 2 (agent MEMORY.md) instead.

Surface 5 — In-conversation history

Claude SDK: sessions are persisted to prompts/projects/{conversationId}/ (SDK session JSON). On follow-up turns, handler.mjs passes sdkOptions.continue = true and the SDK replays the full conversation. Survives sidecar restart.

Pi: SettingsManager.inMemory() (pi-session.mjs:519). No disk persistence. If a conversationId is provided to follow-up queries, Pi uses it for context continuity inside the live session, but the mechanism is in-memory only. On sidecar restart, all Pi session context is lost. This is the same isolation rationale as Surface 1: Pi sessions are intentionally short-lived.

Cron-driven memory operations

Five processes interact with the knowledge base. Two run in-process (no LLM), three call out to LLMs.

Memory indexer (in-process, no LLM)

File: cron/memory-indexer.mjs
Schedule: reactive via Vault Observer on file events (1-second per-path debounce) + 6-pass deep-scan interval + safety-net fallback
Reads: PARA filesystem (Projects, Areas, Resources, Archive, Inbox)
Writes: emits memoryChangesDetected IPC events. Rust does the DB writes.
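A per-path debounce means a burst of events on one file collapses into a single index request, while events on other files are unaffected. The sketch below shows the idea under stated assumptions: `createPathDebouncer` and its injectable `timers` argument are hypothetical, and the real Vault Observer wiring in memory-indexer.mjs may differ.

```javascript
// Hypothetical per-path debouncer. `timers` is injectable so the behaviour can
// be exercised without waiting on real clocks.
function createPathDebouncer(
  onFlush,
  delayMs = 1000,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (handle) => clearTimeout(handle),
  }
) {
  const pending = new Map(); // path -> timer handle
  return function notify(path) {
    // A new event for the same path restarts that path's timer only.
    if (pending.has(path)) timers.clear(pending.get(path));
    const handle = timers.set(() => {
      pending.delete(path);
      onFlush(path);
    }, delayMs);
    pending.set(path, handle);
  };
}
```

With the defaults, `const notify = createPathDebouncer((p) => emitChange(p))` (where `emitChange` is whatever emits the IPC event) would fire once per path after one quiet second.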

Enrichment pipeline (Gemini direct)

File: cron/enrichment-pipeline.mjs
Schedule: event-driven (Vault Observer, 1s debounce) + 12h safety-net (SAFETY_NET_INTERVAL_MS)
Reads: unenriched memory_chunks via invoke("memory_get_unenriched")
Writes: enrichment fields (summary, extracted_entities, topics, importance, connections) back to memory_chunks. For media chunks: sends file bytes as inlineData to Gemini for description.
Model: enrichment_model preference (default google/gemini-2.5-flash). Calls @google/genai SDK directly, not Pi engine or Claude SDK. Non-Google models fall back to gemini-2.5-flash via toGoogleGenAIModel() (enrichment-pipeline.mjs:65).
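The fallback rule can be sketched as follows. This is a hypothetical stand-in for toGoogleGenAIModel(), not its actual source: the text only states that non-Google models fall back to gemini-2.5-flash, so the `google/` prefix check and the prefix stripping are assumptions, as is the non-Google model id used in the example.

```javascript
const GEMINI_FALLBACK_MODEL = "gemini-2.5-flash";

// Hypothetical sketch: prefix handling is assumed, only the fallback target
// comes from the documentation.
function toGoogleModelSketch(preference) {
  const prefix = "google/";
  if (typeof preference === "string" && preference.startsWith(prefix)) {
    return preference.slice(prefix.length); // assumed: the SDK wants the bare id
  }
  return GEMINI_FALLBACK_MODEL;
}

console.log(toGoogleModelSketch("google/gemini-2.5-flash"));
console.log(toGoogleModelSketch("some-vendor/some-model")); // made-up id
```

The effect for users: setting enrichment_model to a non-Google model does not raise an error, it quietly runs on the Gemini default.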

Consolidation engine (Gemini direct)

File: cron/consolidation-engine.mjs
Schedule: 6-hour timer (CONSOLIDATION_CADENCE_INTERVAL_MS = 6 * 60 * 60 * 1000)
Reads: enriched chunks from the last 7 days (ENRICHED_WINDOW_DAYS = 7). Groups by topic cluster (Jaccard > 0.3). Only consolidates clusters with ≥3 chunks from ≥2 entities (MIN_CLUSTER_SIZE, MIN_CLUSTER_ENTITIES).
Writes: cross-entity synthesized insights to memory_consolidations. These surface in knowledge_search results with a 1.2× relevance boost (Claude-side filter).
Model: consolidation_model preference (default google/gemini-2.5-flash).
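The two numeric gates above can be made concrete. The sketch below shows Jaccard similarity over topic sets and the cluster eligibility check. The constant values are inferred from the thresholds quoted above, the `{ entityId, topics }` chunk shape is an assumption, and the actual grouping algorithm in consolidation-engine.mjs is not shown because the text does not specify it.

```javascript
// Values inferred from the thresholds quoted above.
const JACCARD_THRESHOLD = 0.3;
const MIN_CLUSTER_SIZE = 3;
const MIN_CLUSTER_ENTITIES = 2;

// Jaccard similarity: |A ∩ B| / |A ∪ B| over two topic lists.
function jaccard(topicsA, topicsB) {
  const a = new Set(topicsA);
  const b = new Set(topicsB);
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const topic of a) if (b.has(topic)) shared += 1;
  return shared / (a.size + b.size - shared);
}

// Two chunks belong to the same topic cluster only above the threshold.
function sameTopicCluster(chunkA, chunkB) {
  return jaccard(chunkA.topics, chunkB.topics) > JACCARD_THRESHOLD;
}

// A cluster is consolidated only if it is big enough AND spans entities.
function isEligibleCluster(chunks) {
  const entities = new Set(chunks.map((chunk) => chunk.entityId));
  return chunks.length >= MIN_CLUSTER_SIZE && entities.size >= MIN_CLUSTER_ENTITIES;
}

console.log(jaccard(["pricing", "launch", "q3"], ["launch", "q3", "hiring"]));
```

The entity gate is what makes the output cross-entity: three similar chunks from a single project are left alone.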

Heartbeat (Pi engine by default)

File: cron/heartbeat.mjs
Schedule: per-entity cadence (Projects and Areas only, per classifyHeartbeatPath() at :120), gated by _activityMap — only runs if the entity has filesystem activity since the last check (Strategy C, TIGER-3)
Reads: prior memory via invoke("memory_search", { query, entityId, source: 'learning' }) to enrich the heartbeat prompt
Writes: non-suppressed alert text stored as memory_chunks row with memoryType: 'HEARTBEAT_INSIGHT' via invoke("memory_store", {...}) (heartbeat.mjs:646)
Model: heartbeat_model preference (default google/gemini-2.5-flash), routed through the Pi engine (executeIsolated → routeModel)

Fallback contract: heartbeat.mjs:797-910 runHeartbeatWithFallback(). Triggers on:

/no model found|no auth|api key|unauthorized/i  // model-resolution and auth errors
/EMPTY_QUERY_RESULT|safety/i                     // Gemini safety filter returned no text

When triggered: retry on BACKGROUND_MODEL_FALLBACKS.heartbeat_model (= claude-haiku-4-5-20251001), stripping the Pi-specific effort parameter. Gated by the heartbeat_fallback_enabled preference (default true). This is the only cron job with documented graceful degradation. The other LLM-backed jobs (enrichment, consolidation) are not Pi-routed: they call Gemini directly and surface errors via invoke rather than falling back.
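The trigger logic amounts to matching the error message against the two patterns and honoring the preference gate. Here is a minimal sketch: the regexes are copied from above, while `shouldFallbackSketch` and its options shape are hypothetical and do not reproduce runHeartbeatWithFallback().

```javascript
// Patterns copied from the fallback contract above.
const FALLBACK_PATTERNS = [
  /no model found|no auth|api key|unauthorized/i, // model-resolution and auth errors
  /EMPTY_QUERY_RESULT|safety/i, // Gemini safety filter returned no text
];

// Hypothetical decision helper: true when the primary run's error should
// trigger a retry on the fallback model.
function shouldFallbackSketch(error, { fallbackEnabled = true } = {}) {
  if (!fallbackEnabled) return false; // heartbeat_fallback_enabled preference
  const message = String(error && error.message ? error.message : error);
  return FALLBACK_PATTERNS.some((pattern) => pattern.test(message));
}

console.log(shouldFallbackSketch(new Error("401 Unauthorized")));
console.log(shouldFallbackSketch(new Error("ECONNRESET")));
```

Note that transient network errors match neither pattern, so under this reading they would surface as failures instead of triggering the fallback.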

Dream consolidator (v1: in-process, NO LLM)

File: cron/dream-consolidator.mjs
Schedule: nightly via _armDreamTimer() in cron-service.mjs:545. Timer is always armed, but the dream job is gated by dream_enabled preference (default 'false').

The dream_model user preference is wired but never invoked in v1. constants.mjs:77-78 reads the preference at startup so it’s stable when Phase 4 lands. But dream-consolidator.mjs:17 is explicit: “v1 is DETERMINISTIC: ZERO LLM calls. The dream_model user preference is READ at construction … and stored on the instance as _dreamModelPref for Phase 4 — it is NEVER invoked in v1.” If a user sets the preference today, it has no effect.

What v1 actually does (when enabled):

Phase	Action
Orient	SQL census, read-only counts
Prune	Delete orphan `memory_chunks` (path not in `memory_files`), orphan `memory_vec` rows, stale inferred edges (`source='inferred'` AND `created_at < date('now', '-180 days')`), old embedding cache rows (`updated_at < date('now', '-30 days')`)

No LLM-based synthesis. That’s Phase 4 (BRS-memory-evolution).

Hook injections

Memory data also reaches the model via hooks fired on every turn, not just via tools.

Hook	When	What it injects
`hooks/context/memory-recall.mjs`	`UserPromptSubmit`	FTS5 recall + async enrichment reader (2-min TTL on the enrichment cache)
`hooks/context/para-context-hook.mjs`	`UserPromptSubmit`	Reads `memory-enrichment/{conversationId}.json` (written by prior turn’s enrichment) and injects into context

The enrichment cache is a per-conversation file. On every turn, para-context-hook reads any enrichment results computed since the last turn and injects them. ⚠️ Whether these hooks fire for Pi sessions is unverified — the hook pipeline is SDK-level. If Pi bypasses the hook system, agents on Pi routes don’t receive the async enrichment context.

Dual-engine parity matrix

Surface	Claude SDK	Pi
1. SDK auto-memory (`MEMORY.md` autonomous)	✓	✗ — no equivalent config
2. Agent MEMORY.md (top-level)	✓ (via `resolveTopLevelAgentBody`)	✓ (same shared code path)
2. Agent MEMORY.md (as subagent — D9)	Stripped	Stripped
2. Agent SOUL.md (same rules)	Stripped for subagents	Stripped for subagents
3. `knowledge_search` basic (query, maxResults)	✓	✓
3. `knowledge_search` advanced (`source`, `entityId`, `minImportance`, `topics`, `includeConsolidations`, `includeRelated`)	✓	✗ (Pi tool exposes 2 of the 8 params)
3. Sidecar KB writes	✓	✓ (same Rust IPC bridge)
4. CLAUDE.md auto-discovery	✓ (via `settingSources: ['project']`)	⚠️ Unverified: Pi uses `noPromptTemplates: true` + `SettingsManager.inMemory()`
4. AGENTS.md / GEMINI.md / QWEN.md	n/a (external CLIs read these)	n/a
5. Conversation history persistence	✓ (on disk in `prompts/projects/`)	✗ — in-memory only, lost on restart
Heartbeat writes to memory	Default Pi route	Default Pi route (same destination)
Memory enrichment cache hook	✓ via `para-context-hook`	⚠️ Unverified whether Pi triggers SDK hooks

Honest gaps

#	Gap	Severity	Where
M1	Pi auto-memory. No equivalent of Surface 1 on Pi. Sessions don’t accumulate learnings.	Medium	`pi-session.mjs` (zero `autoMemory*` references)
M2	Pi advanced search. `knowledge_search` on Pi takes 2 params; the Claude version takes 8.	Medium	`pi-tools.mjs:178-220`
M3	`dream_model` preference is a UI control with no effect. v1 dream-consolidator does ZERO LLM calls; the preference is read into `_dreamModelPref` for Phase 4 (not yet implemented).	Low	`dream-consolidator.mjs:17`, `constants.mjs:77-78`
M4	`memory_extractor_model` preference is a UI control with no service. Service removed in Phase B IMPL-002; constant retained for M1 symmetry + future re-enablement.	Low	`constants.mjs:86`
M5	No TTL on `memory_chunks`. Indexed content grows unbounded. Only orphan / stale-edge / old-cache rows are pruned by dream v1.	Medium	No eviction policy on actively-indexed content
M6	No deduplication for identical content from different paths. A note copied to two places creates two chunks.	Low	Store logic appears to be by `path` + `chunk-id`, not content-hash upsert (unverified)
M7	No conflict resolution. If heartbeat and enrichment write contradictory facts about the same entity, both persist. `knowledge_search` returns by similarity / keyword overlap, no arbitration.	Low	No arbitration layer
M8	CLAUDE.md on Pi unverified. Pi’s `noPromptTemplates: true` + `SettingsManager.inMemory()` probably skip the SDK’s project-context walk-up. Untested.	Medium	Verify before relying on CLAUDE.md context in a Pi agent
M9	Enrichment hook injection on Pi unverified. `para-context-hook.mjs` fires on the SDK `UserPromptSubmit` event. If Pi bypasses SDK hooks, Pi sessions don’t receive async enrichment.	Medium	Verify hook pipeline coverage
M10	Pi conversation history non-persistent. Sidecar restart loses all Pi session context. Claude SDK survives.	Accepted	`pi-session.mjs:519`
