metalmind
Memory for Claude Code that costs
zero tool-schema tokens per session, lives in your own
Obsidian vault, and survives every
claude ending.
Every claude invocation is a first meeting. Yesterday's architectural call,
the reason you rejected that library, the 40-minute debug you just finished — gone by
tomorrow. metalmind is the place Claude stores things, and the way Claude gets them back —
without burning context tokens on tool schemas to do it.
~519 standing tokens on a cold session.
mem0 costs ~1,319. Native /memory costs ~1. Measured →
metalmind init. MIT · local-only · no cloud · no accounts · reversible in one command. Claude Code today; Cursor / Codex / Copilot / Gemini on the roadmap.
Scadrial flavor: commands are named after Mistborn Era 1 metals —
copper stores memory, bronze reads the code graph,
iron/steel navigate and rename symbols. Prefer plain verbs?
Switch to in the top-right
— every themed command has a plain alias (tap copper ↔ recall,
burn bronze ↔ graph, etc.) and both always work at the CLI.
Two charts. The whole pitch.
| metalmind | ~519 |
|---|---|
native /memory | ~1 |
| metalmind (stdio MCP) | ~157 |
| mem0 | ~1,319 |
| vault | hit@1 +rerank | hit@5 +rerank |
|---|---|---|
| 12 notes | 90% | 95% |
| 100 notes | 90% | 95% |
| 500 notes | 90% | 95% |
| 1,000 notes | 90% | 95% |
What it's actually for
metalmind pays off when your knowledge lives across more than one repo.
A single-repo user gets a lot from Claude Code's native /memory — text in
CLAUDE.md, free, no moving parts. A multi-repo engineer — same vault across
every project, decisions that outlive any single codebase, code graphs that cross
service boundaries — is who metalmind is built for.
- One vault, every project.
project:frontmatter plus a MOC per project. A decision written in repo A surfaces when youtap copperrecallin repo B if it's topically relevant. NativeCLAUDE.mdis scoped per-project; learnings don't cross-pollinate. - Cross-repo code graph via forge.
burn bronze "<q>" --forge <group>graph "<q>" --group <group>queries every repo in a group. HTTP-route edges connect a caller in one service to a handler in another — Claude native has no concept of "the other service's code." Details on the forge page. - Knowledge that compounds. Each new project starts with every learning
you've documented elsewhere.
Learnings/is intentionally flat — "CLIs should never paste weird package-manager invocations" applies to every repo you work in. With native memory you'd copy-paste the insight into everyCLAUDE.mdseparately. - Decisions that outlive the codebase. Repos get archived, rewritten, replaced. The vault doesn't — plain markdown in your own Obsidian directory, searchable forever. When a project finishes, the learnings stay accessible.
Where native /memory still wins: solo repo, under ~50 notes of
context, no historical lookback needed. Below the break-even, it's simpler and free.
metalmind earns its install cost when you've got more to remember than a single
CLAUDE.md can cheaply hold.
Memory
The headline job. Save once, recall forever, with zero standing tax on Claude's context
window. Recall is a Bash command hitting a loopback HTTP server in your
watcher process — no MCP tool, no per-session schema bloat.
store copper "<insight>"save "<insight>"
deposits a decision into your local Obsidian vault. metalmind proposes the path,
wikilinks, and frontmatter; you approve; it writes. Tomorrow's session recalls it
as if yesterday never ended.
tap copper "<query>"recall "<query>"
is a Bash call, not an MCP tool. We stamp the command into your CLAUDE.md
so Claude reaches for it naturally. --deep follows backlinks;
--expand returns hits plus the surrounding graph; --list-recent N
browses the most recently modified notes without a query. Sub-100ms loopback HTTP on
the hot path; stdio MCP as the always-available fallback.
See the full MCP-tax breakdown and the
recall bench
(incl. qmd column) for the numbers.
scribe <verb>note <verb>
is the CRUD interface agents use instead of raw Write — full
flow: create · update · patch · delete · archive · rename · list · show.
It stamps frontmatter, picks the right folder
(Plans/Learnings/Work/Daily/Inbox/MOCs/Archive), auto-links the project
MOC, and on rename rewrites [[wikilinks]] across the vault. Body on
stdin; every verb supports --dry-run.
~/.claude/CLAUDE.md with explicit WHEN→DO triggers, so every new
claude session discovers the vault on its own — no "did you check
memory?" prompting. Re-stamp anytime with
burn brassstamp
after an upgrade. Doctor's --recall-audit replays the opt-in query log to
surface zero-hit queries — the first memory tool that tells you when it's failing you.
Code intelligence
The vault is half the picture. metalmind also gives Claude eyes on your code — across repos, with route-level edges other tools can't see. Each engine swaps independently; your muscle memory doesn't move.
burn bronze "<q>" --forge <group>graph "<q>" --group <group>
queries every repo in a forge group. HTTP-route-match edges connect caller → handler
across services in three tiers: OpenAPI specs on the metalmind shelf (never
inside your repos), Java RestTemplate/WebClient/Feign callers, and URL literals as an
opt-in fallback. Every inferred edge carries
INFERRED_NAME / INFERRED_ROUTE / INFERRED_URL_LITERAL
provenance so Claude can trust-grade what it reads.
burn iron <symbol>symbol <symbol>
returns a symbol's neighbors — who calls it, what it calls, its module.
burn steel <old> <new>rename <old> <new>
drives a coordinated rename through Serena's LSP backend. Zero-LLM per query. One verb
per concern.
burn zinc "<bug>"debug "<bug>"
hands a bug to the /team-debug skill with the code graph already
primed — the team agents start with the cross-repo context, not cold.
Workflow that compounds back into the vault
The things you'd do anyway — daily planning, big architectural debates — captured back into the vault as plain markdown. Tomorrow's session starts with today's reasoning, not a blank page.
atium new | adddaily new | add
creates or appends to Daily/YYYY-MM-DD.md with checkbox action items.
metalmind routine install eod schedules a launchd agent at 17:30 Mon–Fri
that runs atium new --date next-workday --from <today> and archives
today's note via
goldscribe archive.
Unchecked - [ ] items move forward; - [x] stays behind. No
manual EOD ritual, no hand-rolled cron — the primitives compose through the CLI.
metalmind synod "<question>" spawns 7 expert personas
(Kelsier, Breeze, Sazed, Vin, Clubs, Ham, DocksonAdversary, Strategist, Scientist, Visionary, Engineer, Philosopher, Humanist)
in parallel via Claude Code, then synthesizes a verdict — position, confidence %,
3 critical risks, 5 next steps, minority report from the strongest dissenter. Pulls
vault context first; proposes persisting the verdict via
scribenote
after. The deliberation goes back into the vault so the next time the
question surfaces, you don't re-debate it. For decisions that affect the next 6
months, not the next 60 minutes.
metalmind uninstall stops containers, unloads the watcher service
(launchd on macOS, systemd on Linux), restores your prior output style, clears the
settings we changed, removes shell aliases. Never touches your notes.
Everything we did is undoable in one command — a design constraint, not an
afterthought.
~/.claude/ metalmind init doesn't just give you a CLI — it installs a curated layer of
Claude Code skills and slash commands tuned for the vault. Each one teaches Claude
when and how to reach for the memory below it.
~/.claude/skills/
save capture the session's key insight into the vault
synod 7-persona deliberation on architecture-class decisions
writing-vault-notes teach Claude vault conventions (OFM, scribe, MOCs)
using-teams coordinate multi-agent teams (TeamCreate / SendMessage)
obsidian-markdown Obsidian-flavoured-markdown reference for note bodies
~/.claude/commands/
/save alias of the save skill — one-tap end-of-session capture
~/.claude/commands/ opt-in via metalmind init --with-teams
/team-feature architect-led team to design + implement a feature
/team-debug 3–5 adversaries debate a bug via competing hypotheses
/team-pr-review 4 reviewers parallel-audit a PR (sec / api / perf / conv)
/team-multi-repo-audit one reviewer per repo, parallel cross-repo audit
All of them are installed via sentinel-bounded blocks and removed cleanly by
metalmind uninstall — your existing skills and commands are untouched.
Re-stamp after an upgrade with
burn brassstamp.
How metalmind compares
Five "memory for AI" tools, side-by-side on the dimensions that actually differ. Recall numbers on a fixed corpus are above; this is the shape comparison — what each tool stores, how the agent reaches it, and where your data lives.
| Dimension | metalmind | qmd | mem0 | Letta | Mastra |
|---|---|---|---|---|---|
| Memory primitive | Markdown chunks (file-mapped) | Markdown chunks (file-mapped) | LLM-extracted facts | Managed memory blocks | Threads + working memory |
| Source preservation | Yes — notes intact | Yes — files intact | No — facts replace docs | Partial — buffer + summaries | Yes — message history |
| Recall determinism | Deterministic | Deterministic | LLM extracts on every write | LLM mediates updates | Deterministic + LLM summary |
| How agents call it | Bash → loopback HTTP | Bash CLI (MCP optional) | MCP server or SDK | HTTP server (own runtime) | TS framework API |
| Standing tokens in Claude Code | ~519 (prose in CLAUDE.md) | ~0 (CLI; MCP opt-in) | ~1,319 (3 MCP schemas) | n/a — different host model | n/a — different host model |
| Where state lives | Local Obsidian vault + sqlite-vec | Local files + sqlite | Cloud or self-hosted vector DB | Cloud or self-hosted Letta server | Pluggable (pgvector / cloud) |
| Walk-away cost | Zero — vault is plain markdown | Zero — files stay readable | Export needed; facts ≠ docs | Export needed; data lives in Letta | Depends on chosen store |
Letta and Mastra are agent frameworks with built-in memory — a different category from metalmind, qmd, and mem0 (memory tools you bolt onto an existing host). Listed anyway because they surface in "memory for AI" searches; "different host model" rows are honest about the apples-vs-oranges shape, not a metalmind win.
Pick metalmind when your notes are the source of truth and you want Claude Code to read them verbatim with zero standing tax. Pick mem0 for fact extraction from conversations. Pick Letta or Mastra when you're building agents inside their framework. Pick qmd if you want the same shape on a non-Claude host.
Why it isn't an MCP server
Most memory tools register themselves as MCP servers. That design injects a handful of
tool schemas (search_memory, recall, store, …)
into every Claude session before you prompt anything. Those schemas eat
context tokens you could be using for the actual task.
metalmind takes the opposite bet: the recall surface is a CLI, Claude learns the command
once from your stamped CLAUDE.md, and every session starts with a clean
context. The watcher, indexer, and embedding stack still live locally — they just don't
live in Claude's tool registry.
Measured in bench/mcp-tax-v0/ — first-turn token tax on a cold session, before any user turn:
| System | Transport | Tools | First-turn tokens |
|---|---|---|---|
| metalmind (default) | loopback HTTP | 0 | ~519 (one-time CLAUDE.md block) |
Claude Code native /memory | CLAUDE.md text | 0 | ~1 |
| metalmind (stdio MCP fallback) | MCP stdio | 3 | ~157 |
mem0 (pinkpixel-dev/mem0-mcp) | MCP stdio | 3 | ~1,319 |
~2.5× lower than mem0 as shipped (loopback-HTTP vs stdio MCP) · ~8.4× lower on the apples-to-apples MCP comparison (metalmind's stdio fallback vs mem0 — same transport, different schema discipline).
The ~519 tokens metalmind spends up front are prose in ~/.claude/CLAUDE.md
that teaches Claude when to recall — work that mem0's schema-tax doesn't do.
Approximation via chars / 4; rerun with pnpm bench:mcp-tax
(add ANTHROPIC_API_KEY for exact counts). Shape is what matters: two
zero-schema transports, one modest, one bloat. Per-call result tokens are billed like
any tool output and are excluded here — this is standing cost before a single
call.
Who should NOT use metalmind
Honest anti-personas — install the wrong tool and you'll bounce in an hour:
- You don't use Claude Code. SessionStart hook, stamped
CLAUDE.md, MCP fallback — all target Claude Code specifically. Cursor/Codex/Copilot/Gemini are roadmap. - You don't use Obsidian. The vault is the storage layer. No other UI is planned.
- You want a 30-second install. The wizard takes a few minutes — Python prereqs, embed-model download (~30 MB), first-index. Worth it for daily users; overkill for evaluation.
- You're a team of 5+ with shared memory needs. metalmind is single-dev by design. The forge supports many repos per dev; it does not sync vaults between devs.
Will this still be around?
Fair question for any solo-maintainer tool. The sustainability story:
- Your notes outlive metalmind. The vault is plain markdown in your own
~/Knowledge/directory. If this project goes unmaintained tomorrow, you keep everything — Obsidian still opens the files,grepstill searches them,gitstill versions them. metalmind is the layer that makes Claude use them well, not the layer that holds them hostage. - No cloud, no accounts, no phone-home. Embeddings, indexing, recall, code graphs — all local. There is no metalmind backend to shut down, no API quota to throttle, no subscription to lapse. The only network call is the one you were already making to Claude.
- Reversible in one command.
metalmind uninstallstops the watcher, strips the sentinel-bounded blocks from yourCLAUDE.mdfiles, and clears shell aliases. Your vault is never touched. Try it — then reinstall if you like it. - MIT licensed. Fork it, vendor it, swap the embedding backend. Architecture decisions are documented in
docs/,bench/, and CHANGELOG specifically so a contributor — or a future-you — can keep it running.
Commands
| Metal | Command | Description |
|---|---|---|
| Copper ↓ | $ metalmind store copper "..." $ metalmind save "..." | Deposit an insight to the vault. |
| Copper ↑ | $ metalmind tap copper "<q>" $ metalmind recall "<q>" | Retrieve semantically (--deep / --expand). |
| Bronze | $ metalmind burn bronze "<q>" $ metalmind graph "<q>" | Query the code graph (Seeker). |
| Iron | $ metalmind burn iron "<sym>" $ metalmind symbol "<sym>" | Pull a symbol and its neighbors. |
| Steel | $ metalmind burn steel <o> <n> $ metalmind rename <o> <n> | Coordinated rename via Serena. |
| Tin | $ metalmind burn tin $ metalmind verbose | Enhanced output — verbose toggle. |
| Pewter | $ metalmind burn pewter $ metalmind reindex | Force-rebuild the code graph. |
| Zinc | $ metalmind burn zinc "<bug>" $ metalmind debug "<bug>" | Rioter — dispatch a team-debug session. |
| Aluminum | $ metalmind burn aluminum $ metalmind uninstall | Reversible teardown (wipes install). |
| Brass | $ metalmind burn brass $ metalmind stamp | Soother — re-imprint metalmind managed files (upgrade in place). |
| Atium | $ metalmind atium new | add $ metalmind daily new | add | Future daily notes — --date today|tomorrow|next-workday|YYYY-MM-DD, --from carries unchecked items. |
| Gold | $ metalmind gold <note> $ metalmind scribe archive <note> | Burn gold — archive a note (move to Archive/). |
| Flare | $ metalmind flare banner | dialog | sticky $ metalmind notify banner | dialog | sticky | macOS desktop notifications — amplify a signal. |
| Routine | $ metalmind routine install eod $ metalmind routine install eod | Schedule EOD carry-forward + archive — launchd agent 17:30 Mon–Fri (macOS). |
| Forge | $ metalmind forge create <g> $ metalmind group create <g> | Define a cross-repo group. |
| Forge | $ metalmind forge add <g> <r> $ metalmind group add <g> <r> | Add a repo to the group. |
| Forge | $ metalmind forge capture-spec <r> <url> $ metalmind group capture-spec <r> <url> | Seed the OpenAPI shelf for cross-repo route edges. |
| Forge | $ metalmind burn bronze "<q>" --forge <g> $ metalmind graph "<q>" --group <g> | Query across every repo in the group (--include-literals for URL-literal fallback). |
| Scribe | $ metalmind scribe <verb> $ metalmind note <verb> | Vault CRUD — create · update · patch · delete · archive · rename · list · show. |
| Release | $ metalmind release-check $ metalmind release-check | Preflight before tagging — working tree, branch, version sync, tests, build, stamped block. |
| Seeker | $ metalmind pulse $ metalmind doctor | Pulse-check the install — prereqs, config, MCP state. |
Both spellings always work at the CLI. Use the Scadrial / Classic toggle in the nav to re-spell the page.
Install flow
A single metalmind init drives the whole install — and every step is
reversible via metalmind uninstall, which never touches your notes.
Scroll the rail below to trace what happens.
-
Prereqs, detected
Claude Code, Python 3.11+, uv, git — each checked with a per-failure remediation. Failing prereqs halt the wizard with a concrete fix. Docker is only required when you opt into --legacy.
◆ Claude Code 2.1.114 ◆ Python 3.11+ via python3.12 ◆ uv 0.11.7 ◆ git 2.53.0 -
Vault scaffold
Picks or creates your Obsidian vault. Drops in Work / Personal / Learnings / Daily / Inbox / Archive / Memory, stamps a CLAUDE.md that teaches Claude to recall via CLI — not MCP tools.
◇ Setting up vault ◆ Vault at ~/Knowledge created: Work, Personal, Learnings, Daily, Inbox, Archive, Memory -
Engines installed
Three Python tools land on PATH via uv — Serena (LSP navigation), graphify (code graph), vault-rag (recall server + watcher + indexer + doctor). Mirror of how you install anything else.
◇ Installing Serena → uv tool install serena-agent ◇ Installing graphify → uv tool install graphifyy ◇ Installing vault-rag → uv tool install metalmind-vault-rag -
In-process retrieval stack
sqlite-vec for vectors and fastembed (BAAI/bge-small-en-v1.5) for embeddings live inside the watcher process — no Docker, no daemon. The 30 MB ONNX model downloads once on first index. Pass --legacy to opt into the older Qdrant + Ollama Docker stack instead.
◇ Embedded backend (sqlite-vec + fastembed) — no Docker stack needed ◆ Python tool venv at ~/.local/share/uv/tools/metalmind-vault-rag model cache at ~/.cache/fastembed/ -
Services, hook & routing
Auto-reindex watcher runs as a background service — launchd on macOS, systemd --user on Linux. MCP registers Serena only; vault recall is a CLI call (no schema tokens). A SessionStart hook injects a memory-available reminder so fresh Claude sessions discover the vault without prompting. Optional: disable native auto-memory and route all memory to the vault.
◇ Installing watcher service ◆ wrote com.metalmind.vault-indexer.plist ◇ Installing SessionStart hook ◆ registered in settings.json ◇ Applying memory routing ◆ disabled native auto-memory -
Ready
metalmind pulse verifies the whole chain end-to-end. metalmind uninstall rolls everything back — watcher, MCP entries, settings, aliases (and Docker containers if you used --legacy) — without ever touching your notes.
$ metalmind pulse ◇ Prerequisites ◇ Config ● flavor: scadrial ● vaultPath: ~/Knowledge ● mcp: serena, graphify └ All systems nominal.
Under the metalmind Under the hood
One verb, one job. Each engine is swappable — so if a backend gets replaced later, your muscle memory doesn't move. Your notes, embeddings, and code graphs never leave your machine.
~/Knowledge/. sqlite-vec + fastembed
(BAAI/bge-small-en-v1.5) run in-process — no Docker, no daemon.
Incremental watcher re-embeds only changed files; queries stay answerable
throughout reindex.
INFERRED_NAME / INFERRED_ROUTE provenance.