“Memory” in Orion is overloaded — five architecturally distinct surfaces serve different purposes. Reading this page top-to-bottom is the only way to keep them straight.
The five surfaces
| # | Surface | Storage | Written by | Read by | Engine | Lifetime |
|---|---|---|---|---|---|---|
| 1 | SDK auto-memory | {projectDir}/.orion/memory/MEMORY.md + topic files | Claude Agent SDK (autonomous) | SDK on subsequent turns | Claude SDK only | Durable, project-scoped |
| 2 | Agent seed memory | {vault}/agents/{slug}/MEMORY.md + SOUL.md | Human (developer / user) | Loader at top-level spawn | Both engines (with D9 filter for subagents) | Durable, agent-scoped |
| 3 | Sidecar knowledge base | SQLite memory_chunks + memory_vec (384-dim) | Rust indexer + cron pipelines | knowledge_search MCP tool | Both | Durable, vault-scoped |
| 4 | Project / CLI instructions | {project}/CLAUDE.md, {vault}/AGENTS.md, GEMINI.md, QWEN.md | Human | SDK via settingSources, CLI walk-up | Varies | Durable, project- or vault-scoped |
| 5 | In-conversation history | Claude: prompts/projects/{convId}/. Pi: in-memory only. | Engine | Engine (continue / resume) | Both (with persistence gap on Pi) | Session-persistent on Claude; in-memory on Pi |
Surface 1 — SDK auto-memory
What it is: a memory system built into the Claude Agent SDK that autonomously decides what’s worth keeping from a session and writes it to MEMORY.md (plus topic-specific markdown files) in a configured directory. Orion does not control the writes — the SDK does.
Configured at handler.mjs:2714:
autoMemoryEnabled: true,
autoMemoryDirectory: userProjectDir
? join(userProjectDir, ".orion", "memory")
: join(process.env.ORION_VAULT_ROOT || orionHome, ".orion", "memory")
- Project sessions: writes under {projectDir}/.orion/memory/
- Ad-hoc sessions: writes under {vaultRoot}/.orion/memory/
The hardcoded SDK loader cap. The SDK’s internal cli.js (minified) names the loader variables o2 = "MEMORY.md" and uj = 200 — a hardcoded 200-line truncation cap on the loaded MEMORY.md. This is a documented spike finding in the project memory; it cannot be configured from Orion. If your MEMORY.md grows past 200 lines, the SDK silently drops the rest. ⚠️ Unverifiable from source without deobfuscation; treat as confirmed-by-spike, not source-cited.
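Because the truncation is silent, it can help to check MEMORY.md length out of band. The following is a minimal sketch, assuming the cap is 200 lines as the spike reports and that lines are counted by newline; `memoryOverflow` and `ASSUMED_SDK_LINE_CAP` are hypothetical names, not part of Orion or the SDK.

```javascript
// Assumption: 200-line cap, taken from the spike finding above (not source-verified).
const ASSUMED_SDK_LINE_CAP = 200;

// Hypothetical helper: reports how many lines of a MEMORY.md body would
// fall past the assumed cap and therefore never reach the model.
function memoryOverflow(text, cap = ASSUMED_SDK_LINE_CAP) {
  // Ignore a single trailing newline so "a\nb\n" counts as 2 lines.
  const body = text.endsWith("\n") ? text.slice(0, -1) : text;
  const total = body === "" ? 0 : body.split("\n").length;
  return { total, dropped: Math.max(0, total - cap) };
}
```

A check like this could run in CI or a pre-commit hook against {projectDir}/.orion/memory/MEMORY.md; how the SDK itself counts lines is not confirmed.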
Pi has no equivalent of this surface. pi-session.mjs has no autoMemoryEnabled or autoMemoryDirectory config; grep confirms zero references to MEMORY or autoMemory in the Pi session file. Pi sessions use SettingsManager.inMemory() (pi-session.mjs:519): short-lived, no disk persistence, no cross-turn accumulation. The isolation is intentional: Pi sessions set noPromptTemplates: true to keep user-global Claude Code and Pi-personal configs from leaking into the Orion runtime.
Surface 2 — Agent seed memory (MEMORY.md + SOUL.md)
What it is: human-authored static files bundled with each agent folder. These never change unless a human edits them — no cron job, no LLM, no autonomous process writes to them.
Layout (per the folder-per-agent architecture):
{layer}/agents/{slug}/
AGENTS.md # required — system prompt body
config.yaml # required — typed config
SOUL.md # optional — persona / voice (top-level only)
MEMORY.md # optional — static seed memory v1 (top-level only)
Two layers exist:
| Layer | Path | Mutability |
|---|---|---|
| Built-in | prompts/agents/{slug}/ | Read-only at runtime; updated by app upgrades |
| User vault | {vault}/agents/{slug}/ | User-editable; travels with vault sync |
Vault overrides bundle entirely (no merge). When a slug exists in both layers, the vault entry replaces the built-in. Resolution is per-query, no caching (loader.mjs:30: “D10: per-query rescan, no watcher”).
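The override rule above can be sketched as a lookup that never merges fields across layers. This is an illustration only: `resolveAgent` and the Map-based layers are hypothetical stand-ins for the real loader, which rescans directories on every query.

```javascript
// Hypothetical in-memory stand-ins for the two layers (slug -> agent files).
function resolveAgent(slug, vaultAgents, builtInAgents) {
  // A vault entry wins outright: no field-level merge with the built-in.
  if (vaultAgents.has(slug)) return { layer: "vault", ...vaultAgents.get(slug) };
  if (builtInAgents.has(slug)) return { layer: "built-in", ...builtInAgents.get(slug) };
  return null; // unknown slug
}

const builtIn = new Map([["researcher", { agents: "built-in prompt", soul: "built-in persona" }]]);
const vault = new Map([["researcher", { agents: "custom prompt" }]]);

// The vault copy has no SOUL.md, and the built-in one is NOT inherited.
const resolved = resolveAgent("researcher", vault, builtIn);
```

The practical consequence is that a vault override which omits SOUL.md or MEMORY.md runs without them, rather than falling back to the built-in files.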
The D9 security invariant
Top-level chat sessions receive AGENTS + SOUL + MEMORY. Subagents (helpers spawned by another agent) receive AGENTS only — SOUL and MEMORY are stripped.
The reason: when an agent runs as someone else’s helper, its persona and seed memory would pollute the calling context. The persona doesn’t apply (you’re a sub-step, not the user’s primary collaborator). The seed memory could include privileged information that doesn’t belong in another agent’s view.
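A minimal sketch of the D9 filter follows. It assumes the three files are already loaded as strings and joined with blank lines; `assembleBodySketch` is a hypothetical name, and the real assembleAgentBody may order or format the parts differently.

```javascript
// Hypothetical sketch of the D9 rule: subagents get AGENTS.md only.
function assembleBodySketch(files, { isSubagent = false } = {}) {
  const parts = [files.agents]; // AGENTS.md is required in every case
  if (!isSubagent) {
    // SOUL.md and MEMORY.md are top-level only.
    if (files.soul) parts.push(files.soul);
    if (files.memory) parts.push(files.memory);
  }
  return parts.join("\n\n");
}

const files = { agents: "You review code.", soul: "Dry wit.", memory: "Team uses pnpm." };
const topLevelBody = assembleBodySketch(files);
const subagentBody = assembleBodySketch(files, { isSubagent: true });
```

Keeping the filter inside one assembly function, instead of at each call site, is what makes the three enforcement points auditable.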
Three enforcement points, all verified:
| # | Enforcement point | File:line | Mechanism |
|---|---|---|---|
| 1 | Claude SDK agents map | claude-agents-map.mjs:98-102 | assembleAgentBody(ws, { isSubagent: true }) for every entry → AGENTS.md only |
| 2 | Pi subagent runner | pi-subagent-runner.mjs:931 | getAgentDefForSubagent() from pi-agent-loader.mjs:464-480 |
| 3 | agent_info discovery tool | pi-tools.mjs:1395-1445 | Routes through getAgentDefForSubagent — fixed at commit d2a26640 (VAL-009, 2026-05-05) |
The third point matters more than it looks: if a future “agent discovery” tool exposes agent metadata to a subagent’s context (so a subagent can learn about its peers), and it calls the non-filtered getAgentDef(), SOUL and MEMORY content leaks. The rule documented in Agents as first-class entities is explicit: discovery tools must use getAgentDefForSubagent. Any new code that touches this surface must be audited against the rule.
Engine parity
Both engines call the same shared code path for both top-level and subagent dispatch. No engine-specific divergence on this surface.
Surface 3 — Sidecar knowledge base
What it is: a local vector + FTS5 search index over everything Orion has indexed (PARA entities, sessions, heartbeats, consolidations). The knowledge_search MCP tool exposes it.
Schema (12 tables, migration v22)
Defined in src-tauri/src/memory/schema.rs. Key tables:
| Table | Purpose |
|---|---|
| memory_chunks | Primary storage. Text + metadata + raw embedding JSON |
| memory_vec | sqlite-vec virtual table: vec0(embedding float[384]) |
| memory_fts | FTS5 virtual table for keyword search |
| memory_embedding_cache | Dedup cache, composite PK |
| memory_files | Tracks indexed files for delta detection (hash-based) |
| memory_consolidations | Cross-entity synthesized insights (v19) |
| memory_edges | Typed weighted edges between entities (v20) |
| entity_changes | Append-only PARA mutation log (v21) |
| page_versions | Hash-deduped CLAUDE.md version trail (v22) |
memory_chunks notable columns: source ('learning' | 'heartbeat' | 'consolidation' | …), memory_type ('HEARTBEAT_INSIGHT' | 'USER_PREFERENCE' | …), summary, extracted_entities, topics, importance, confidence, embedding_source ('fastembed' | 'gemini-multimodal'), enrichment_model, enriched_at.
Embedding model
Multilingual-E5-Small, 384 dimensions. Loaded via fastembed-rs (embeddings.rs docstring confirms EmbeddingModel::MultilingualE5Small, ~23 MB, cached at ~/.cache/fastembed/). The dimension constant is at indexer.rs:41: pub const MEMORY_VEC_DIMS: usize = 384;.
Provider chain — embeddings.rs defines an EmbeddingProvider trait with a with_providers(Vec<Box<dyn EmbeddingProvider>>) factory. LocalProvider is the default (fastembed). embeddings_cloud.rs has a GeminiMultimodalProvider for media files. Multimodal embeddings are truncated and L2-renormalized to 384 dims to coexist with text embeddings in the same memory_vec column (indexer.rs:32-41).
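The truncate-and-renormalize step can be illustrated as below. The real implementation is in Rust (indexer.rs); this JavaScript sketch only shows the arithmetic, and `truncateAndRenormalize` is a hypothetical name.

```javascript
const TARGET_DIMS = 384; // mirrors MEMORY_VEC_DIMS from the text

// Keep the first `dims` components, then rescale to unit L2 length so the
// vector is comparable with the native 384-dim text embeddings.
function truncateAndRenormalize(embedding, dims = TARGET_DIMS) {
  if (embedding.length < dims) {
    throw new RangeError(`expected at least ${dims} dims, got ${embedding.length}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.hypot(...head);
  if (norm === 0) return head; // zero vector: nothing to rescale
  return head.map((x) => x / norm);
}
```

Renormalizing matters because truncation shortens the vector; without it, distance scores for multimodal rows would be biased relative to text rows in the same memory_vec column.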
Write path
Invariant M4: Rust is the sole writer of memory_chunks. Node-side code (memory-indexer.mjs, embed-after-store.mjs) emits IPC events; Rust performs all DB writes. This prevents concurrent-writer races.
Read path — knowledge_search
Both engines call invoke("memory_search", {...}) through memory/invoke.mjs:20-31. Rust performs ANN over memory_vec + keyword scoring over memory_fts + metadata filtering on memory_chunks, returns top-K scored results.
Parameter feature gap by engine:
| Parameter | Claude SDK tool (memory/tools.mjs:294-324) | Pi direct tool (pi-tools.mjs:178-220) |
|---|---|---|
| query | ✓ | ✓ |
| maxResults (default 5) | ✓ | ✓ |
| source (filter) | ✓ | ✗ |
| entityId (filter) | ✓ | ✗ |
| minImportance (filter) | ✓ | ✗ |
| topics (relevance boost) | ✓ | ✗ |
| includeConsolidations (1.2× boost on memory_consolidations) | ✓ | ✗ |
| includeRelated (1-hop edge traversal via memory_edges, strength > 0.2, 0.2× boost) | ✓ | ✗ |
Pi agents get the simplified 2-param version. The advanced filters are a Claude-only feature today. See Tools for the bridging dedup that drops mcp__orion__knowledge_search on Pi.
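To make the gap concrete, here is a sketch of the two call shapes. Parameter names come from the table above; the values, and the `toPiParams` helper, are invented for illustration and do not exist in Orion.

```javascript
// Example Claude-side call using the full parameter set (values are made up).
const claudeCall = {
  query: "release checklist",
  maxResults: 5,
  source: "learning",
  entityId: "projects/demo",
  minImportance: 0.5,
  topics: ["release"],
  includeConsolidations: true,
  includeRelated: true,
};

// Hypothetical helper: keep only what the Pi direct tool accepts.
const PI_SUPPORTED = new Set(["query", "maxResults"]);
function toPiParams(params) {
  return Object.fromEntries(
    Object.entries(params).filter(([key]) => PI_SUPPORTED.has(key))
  );
}

const piCall = toPiParams(claudeCall);
```

An agent prompt written against the Claude parameter set will therefore lose its filters and boosts when the same agent is routed through Pi.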
Surface 4 — Project / CLI instructions
What it is: markdown files that provide project-level behavioral context. Not vector-indexed — injected verbatim into the context window.
CLAUDE.md (SDK auto-load)
handler.mjs:2727 sets settingSources: ['user', 'project']. The 'project' source causes the SDK to walk up from effectiveCwd (PARA project dir or vault root) looking for CLAUDE.md. The .git/ isolation boundary at the vault root stops the walk from reaching ~/.claude/ (so the user’s personal Claude Code skills and hooks are NOT loaded into Orion runtime).
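The walk-up with a .git boundary can be modeled as follows. This is a sketch under stated assumptions, not the SDK's algorithm: it assumes the walk collects every CLAUDE.md from the starting directory upward and stops at the first directory containing .git. `findInstructionFiles` is a hypothetical name, and a Set of POSIX paths stands in for the filesystem.

```javascript
// Hypothetical model of the walk-up over a fake filesystem.
function findInstructionFiles(startDir, files, name = "CLAUDE.md") {
  const found = [];
  let dir = startDir;
  while (true) {
    if (files.has(`${dir}/${name}`)) found.push(`${dir}/${name}`);
    // Assumed boundary rule: a directory containing .git ends the walk.
    if (files.has(`${dir}/.git`)) break;
    const parent = dir.slice(0, dir.lastIndexOf("/"));
    if (parent === "" || parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return found;
}

const fakeFs = new Set([
  "/home/u/CLAUDE.md", // above the vault: must not be picked up
  "/home/u/vault/.git",
  "/home/u/vault/CLAUDE.md",
  "/home/u/vault/Projects/demo/CLAUDE.md",
]);
const loaded = findInstructionFiles("/home/u/vault/Projects/demo", fakeFs);
```

Under these assumptions the project file and the vault-root file are loaded, and anything above the vault root is never visited.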
CLAUDE.md (cron explicit load)
cron/executor.mjs:1085-1185 resolveProjectCwd(projectId) reads Projects/{slug}/CLAUDE.md explicitly when a cron job has payload.sessionMode === 'project'. Contents are appended to the cron system prompt. This is a parallel path to the SDK’s native discovery — cron jobs don’t always run with the right cwd to trigger SDK auto-discovery, so this is the belt-and-braces version.
AGENTS.md / GEMINI.md / QWEN.md (external CLIs)
ccw/cli/executor.mjs:177-190:
Each CLI walks UP from cwd looking for its instruction file: Codex → AGENTS.md, Gemini → GEMINI.md, Qwen → QWEN.md.
These files are placed at the vault root. They are NOT loaded by either Claude SDK or Pi directly — they’re picked up by the external CLIs spawned via delegate_to_cli. Distinct from per-agent AGENTS.md (Surface 2).
Pi engine and CLAUDE.md (unverified)
Pi sessions use noPromptTemplates: true (pi-session.mjs:500) — this disables Pi’s native prompt-template discovery (preventing skill/hook leakage from ~/.pi/, ~/.claude/, project root). But CLAUDE.md isn’t a “prompt template” by Pi’s classification; it’s read by the SDK’s project-context system.
Pi sessions use SettingsManager.inMemory() rather than the SDK session manager, so the SDK’s project-context walk-up likely does not fire for Pi. ⚠️ Unverified — should be tested. Practically: agents that rely on CLAUDE.md context may behave differently on Pi vs Claude. If you need an agent to always see project rules, use Surface 2 (agent MEMORY.md) instead.
Surface 5 — In-conversation history
Claude SDK — sessions persisted to prompts/projects/{conversationId}/ (SDK session JSON). On follow-up turns, handler.mjs passes sdkOptions.continue = true and the SDK replays the full conversation. Survives sidecar restart.
Pi — SettingsManager.inMemory() (pi-session.mjs:519). No disk persistence. If a conversationId is provided to follow-up queries, Pi uses it for context continuity inside the live session, but the actual mechanism is in-memory only. On sidecar restart, all Pi session context is lost. This is the same isolation rationale as Surface 1 — Pi sessions are intentionally short-lived.
Cron-driven memory operations
Five processes interact with the knowledge base. Two run in-process (no LLM), three call out to LLMs.
Memory indexer (in-process, no LLM)
- File: cron/memory-indexer.mjs
- Schedule: reactive via Vault Observer on file events (1-second per-path debounce) + 6-pass deep-scan interval + safety-net fallback
- Reads: PARA filesystem (Projects, Areas, Resources, Archive, Inbox)
- Writes: emits memoryChangesDetected IPC events. Rust does the DB writes.
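Per-path debouncing means a burst of saves to one file produces one index event, while other files are unaffected. A minimal sketch, assuming explicit timestamps instead of real timers so the behavior is deterministic; `PathDebouncer` is a hypothetical class, not the Vault Observer's implementation.

```javascript
const DEBOUNCE_MS = 1000; // the 1-second per-path window from the text

class PathDebouncer {
  constructor(windowMs = DEBOUNCE_MS) {
    this.windowMs = windowMs;
    this.lastSeen = new Map(); // path -> timestamp of most recent event
  }

  // Record a file event; repeated events for one path restart its window.
  record(path, nowMs) {
    this.lastSeen.set(path, nowMs);
  }

  // Return (and forget) every path that has been quiet for a full window.
  flush(nowMs) {
    const ready = [];
    for (const [path, seen] of this.lastSeen) {
      if (nowMs - seen >= this.windowMs) {
        ready.push(path);
        this.lastSeen.delete(path);
      }
    }
    return ready;
  }
}
```

In a real watcher, `flush` would be driven by a timer and each returned path would become one memoryChangesDetected payload entry.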
Enrichment pipeline (Gemini direct)
- File: cron/enrichment-pipeline.mjs
- Schedule: event-driven (Vault Observer, 1s debounce) + 12h safety-net (SAFETY_NET_INTERVAL_MS)
- Reads: unenriched memory_chunks via invoke("memory_get_unenriched")
- Writes: enrichment fields (summary, extracted_entities, topics, importance, connections) back to memory_chunks. For media chunks: sends file bytes as inlineData to Gemini for description.
- Model: enrichment_model preference (default google/gemini-2.5-flash). Calls @google/genai SDK directly, not Pi engine or Claude SDK. Non-Google models fall back to gemini-2.5-flash via toGoogleGenAIModel() (enrichment-pipeline.mjs:65).
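The model fallback can be sketched like this. It assumes Google models are identified by a `google/` prefix that is stripped before the direct SDK call; that detail is inferred from the default value and is not confirmed by the source. `toGoogleModelSketch` is a hypothetical stand-in for toGoogleGenAIModel().

```javascript
const ENRICHMENT_FALLBACK = "gemini-2.5-flash";

// Hypothetical sketch; the real toGoogleGenAIModel may normalize differently.
function toGoogleModelSketch(preference) {
  const prefix = "google/"; // assumed provider prefix
  if (typeof preference === "string" && preference.startsWith(prefix)) {
    const bare = preference.slice(prefix.length);
    if (bare) return bare;
  }
  // Anything else (other providers, unset preference) uses the fallback.
  return ENRICHMENT_FALLBACK;
}
```

The effect for users is that selecting a non-Google enrichment model does not fail; it quietly runs on gemini-2.5-flash instead.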
Consolidation engine (Gemini direct)
- File: cron/consolidation-engine.mjs
- Schedule: 6-hour timer (CONSOLIDATION_CADENCE_INTERVAL_MS = 6 * 60 * 60 * 1000)
- Reads: enriched chunks from the last 7 days (ENRICHED_WINDOW_DAYS = 7). Groups by topic cluster (Jaccard > 0.3). Only consolidates clusters with ≥3 chunks from ≥2 entities (MIN_CLUSTER_SIZE, MIN_CLUSTER_ENTITIES).
- Writes: cross-entity synthesized insights to memory_consolidations. These surface in knowledge_search results with a 1.2× relevance boost (Claude-side filter).
- Model: consolidation_model preference (default google/gemini-2.5-flash).
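The clustering thresholds above reduce to two small checks: Jaccard similarity over topic sets, and a size/entity gate on each cluster. The sketch below shows only those checks; how chunks are actually linked into clusters is not specified in this page, and the `entityId` and `topics` field names on chunks are assumptions.

```javascript
const JACCARD_THRESHOLD = 0.3;
const MIN_CLUSTER_SIZE = 3;
const MIN_CLUSTER_ENTITIES = 2;

// Jaccard similarity: |A ∩ B| / |A ∪ B| over two topic lists.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  for (const topic of setA) if (setB.has(topic)) shared++;
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Two chunks are topic neighbours when similarity exceeds the threshold.
function areNeighbours(chunkA, chunkB) {
  return jaccard(chunkA.topics, chunkB.topics) > JACCARD_THRESHOLD;
}

// A cluster is consolidated only if it is big enough AND spans entities.
function isEligibleCluster(chunks) {
  const entities = new Set(chunks.map((c) => c.entityId));
  return chunks.length >= MIN_CLUSTER_SIZE && entities.size >= MIN_CLUSTER_ENTITIES;
}
```

For example, topic lists ["a", "b", "c"] and ["b", "c", "d"] share 2 of 4 distinct topics, giving 0.5, which clears the 0.3 threshold.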
Heartbeat (Pi engine by default)
- File: cron/heartbeat.mjs
- Schedule: per-entity cadence (Projects and Areas only, per classifyHeartbeatPath() at :120), gated by _activityMap: only runs if the entity has filesystem activity since the last check (Strategy C, TIGER-3)
- Reads: prior memory via invoke("memory_search", { query, entityId, source: 'learning' }) to enrich the heartbeat prompt
- Writes: non-suppressed alert text stored as memory_chunks row with memoryType: 'HEARTBEAT_INSIGHT' via invoke("memory_store", {...}) (heartbeat.mjs:646)
- Model: heartbeat_model preference (default google/gemini-2.5-flash), routed through the Pi engine via executeIsolated → routeModel
Fallback contract: heartbeat.mjs:797-910 runHeartbeatWithFallback(). Triggers on:
/no model found|no auth|api key|unauthorized/i // auth / model-resolution errors
/EMPTY_QUERY_RESULT|safety/i // Gemini safety filter returned no text
When triggered: retry on BACKGROUND_MODEL_FALLBACKS.heartbeat_model (= claude-haiku-4-5-20251001), stripping the effort parameter (Pi-specific). Gated by heartbeat_fallback_enabled preference (default true).
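The trigger side of the fallback contract can be sketched as a predicate over the error message. The two patterns are copied from the contract above; `shouldFallBack` is a hypothetical name, and the real runHeartbeatWithFallback() may inspect more than the message text.

```javascript
// Patterns copied from the fallback contract.
const AUTH_ERRORS = /no model found|no auth|api key|unauthorized/i;
const SAFETY_ERRORS = /EMPTY_QUERY_RESULT|safety/i;

// Hypothetical predicate: should the heartbeat retry on the fallback model?
// `fallbackEnabled` stands in for the heartbeat_fallback_enabled preference.
function shouldFallBack(errorMessage, fallbackEnabled = true) {
  if (!fallbackEnabled) return false;
  return AUTH_ERRORS.test(errorMessage) || SAFETY_ERRORS.test(errorMessage);
}
```

Errors that match neither pattern, such as network timeouts, would not trigger a retry under this sketch and would surface as ordinary failures.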
This is the only cron job with documented graceful degradation. The other LLM-backed jobs (enrichment, consolidation) bypass the Pi engine, call Gemini directly, and surface errors via invoke rather than falling back.
Dream consolidator (v1: in-process, NO LLM)
- File: cron/dream-consolidator.mjs
- Schedule: nightly via _armDreamTimer() in cron-service.mjs:545. Timer is always armed, but the dream job is gated by dream_enabled preference (default 'false').
The dream_model user preference is wired but never invoked in v1. constants.mjs:77-78 reads the preference at startup so it’s stable when Phase 4 lands. But dream-consolidator.mjs:17 is explicit: “v1 is DETERMINISTIC: ZERO LLM calls. The dream_model user preference is READ at construction … and stored on the instance as _dreamModelPref for Phase 4 — it is NEVER invoked in v1.” If a user sets the preference today, it has no effect.
What v1 actually does (when enabled):
| Phase | Action |
|---|---|
| Orient | SQL census, read-only counts |
| Prune | Delete orphan memory_chunks (path not in memory_files), orphan memory_vec rows, stale inferred edges (source='inferred' AND created_at < date('now', '-180 days')), old embedding cache rows (updated_at < date('now', '-30 days')) |
No LLM-based synthesis. That’s Phase 4 (BRS-memory-evolution).
Hook injections
Memory data also reaches the model via hooks fired on every turn, not just via tools.
| Hook | When | What it injects |
|---|---|---|
| hooks/context/memory-recall.mjs | UserPromptSubmit | FTS5 recall + async enrichment reader (2-min TTL on the enrichment cache) |
| hooks/context/para-context-hook.mjs | UserPromptSubmit | Reads memory-enrichment/{conversationId}.json (written by prior turn’s enrichment) and injects into context |
The enrichment cache is a per-conversation file. On every turn, para-context-hook reads any enrichment results computed since the last turn and injects them. ⚠️ Whether these hooks fire for Pi sessions is unverified — the hook pipeline is SDK-level. If Pi bypasses the hook system, agents on Pi routes don’t receive the async enrichment context.
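The 2-minute TTL on the enrichment cache amounts to a freshness check before injection. A minimal sketch, assuming a cache record shaped as `{ writtenAtMs, items }`; that shape and the name `readFreshEnrichment` are hypothetical, since the page does not describe the file format.

```javascript
const ENRICHMENT_TTL_MS = 2 * 60 * 1000; // 2-minute TTL from the hook table

// Hypothetical reader: return cached enrichment items only while fresh.
function readFreshEnrichment(record, nowMs, ttlMs = ENRICHMENT_TTL_MS) {
  if (!record) return []; // no cache file for this conversation yet
  const age = nowMs - record.writtenAtMs;
  // Negative age means a clock skew or bad record; treat it as stale.
  return age >= 0 && age <= ttlMs ? record.items : [];
}
```

With a check like this, a user who pauses for more than two minutes between turns gets no enrichment injected on the next turn, only the synchronous FTS5 recall.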
Dual-engine parity matrix
| Surface | Claude SDK | Pi |
|---|---|---|
| 1. SDK auto-memory (MEMORY.md autonomous) | ✓ | ✗ (no equivalent config) |
| 2. Agent MEMORY.md (top-level) | ✓ (via resolveTopLevelAgentBody) | ✓ (same shared code path) |
| 2. Agent MEMORY.md (as subagent — D9) | Stripped | Stripped |
| 2. Agent SOUL.md (same rules) | Stripped for subagents | Stripped for subagents |
| 3. knowledge_search basic (query, maxResults) | ✓ | ✓ |
| 3. knowledge_search advanced (source, entityId, minImportance, topics, includeConsolidations, includeRelated) | ✓ | ✗ (Pi tool exposes 2 of 8 params) |
| 3. Sidecar KB writes | ✓ | ✓ (same Rust IPC bridge) |
| 4. CLAUDE.md auto-discovery | ✓ (via settingSources: ['project']) | ⚠️ Unverified — Pi uses noPromptTemplates: true + in-memory SessionManager |
| 4. AGENTS.md / GEMINI.md / QWEN.md | n/a (external CLIs read these) | n/a |
| 5. Conversation history persistence | ✓ (on disk in prompts/projects/) | ✗ — in-memory only, lost on restart |
| Heartbeat writes to memory | Default Pi route | Default Pi route (same destination) |
| Memory enrichment cache hook | ✓ via para-context-hook | ⚠️ Unverified whether Pi triggers SDK hooks |
Honest gaps
| # | Gap | Severity | Where |
|---|---|---|---|---|
| M1 | Pi auto-memory. No equivalent of Surface 1 on Pi. Sessions don’t accumulate learnings. | Medium | pi-session.mjs (zero autoMemory* references) |
| M2 | Pi advanced search. knowledge_search on Pi takes 2 params; the Claude version takes 8 (see the parameter table under Surface 3). | Medium | pi-tools.mjs:178-220 |
| M3 | dream_model preference is a UI control with no effect. v1 dream-consolidator does ZERO LLM calls; the preference is read into _dreamModelPref for Phase 4 (not yet implemented). | Low | dream-consolidator.mjs:17, constants.mjs:77-78 |
| M4 | memory_extractor_model preference is a UI control with no service. Service removed in Phase B IMPL-002; constant retained for M1 symmetry + future re-enablement. | Low | constants.mjs:86 |
| M5 | No TTL on memory_chunks. Indexed content grows unbounded. Only orphan / stale-edge / old-cache rows are pruned by dream v1. | Medium | No eviction policy on actively-indexed content |
| M6 | No deduplication for identical content from different paths. A note copied to two places creates two chunks. | Low | Store logic appears to be by path + chunk-id, not content-hash upsert (unverified) |
| M7 | No conflict resolution. If heartbeat and enrichment write contradictory facts about the same entity, both persist. knowledge_search returns by similarity / keyword overlap, no arbitration. | Low | No arbitration layer |
| M8 | CLAUDE.md on Pi unverified. Pi’s noPromptTemplates: true + in-memory SessionManager probably skip the SDK’s project-context walk-up. Untested. | Medium | Verify before relying on CLAUDE.md context in a Pi agent |
| M9 | Enrichment hook injection on Pi unverified. para-context-hook.mjs fires on the SDK UserPromptSubmit event. If Pi bypasses SDK hooks, Pi sessions don’t receive async enrichment. | Medium | Verify hook pipeline coverage |
| M10 | Pi conversation history non-persistent. Sidecar restart loses all Pi session context. Claude SDK survives. | Accepted | pi-session.mjs:519 |
See also
- Agents: the folder-per-agent layout, D7 derived DB index, D9 invariant
- Tools: how knowledge_search reaches each engine, the bridging dedup
- Two engines: routeModel, the noPromptTemplates: true isolation rationale
- Background & CLI: cron-engine routing for heartbeat / enrichment / consolidation / dream