claude-mem vs auto memory: do you need a memory plugin for Claude Code?

8 min read

Claude Code has a memory problem — or rather, it had one. Every session starts fresh. The architectural decisions you explained yesterday, the debugging rabbit hole you went down last week, the build commands you corrected three times — all gone. Your next session begins with Claude reading your codebase like a stranger.

Two solutions now compete for this space. On February 5, Anthropic shipped auto memory as a built-in feature: a lightweight, markdown-based system where Claude jots down notes for itself between sessions. Meanwhile, a community plugin called claude-mem has been attacking the same problem since August 2025 with a far heavier approach — SQLite databases, vector embeddings, AI-powered compression, and a web UI. It hit #1 trending on GitHub in early February with over 28,000 stars.

The question isn’t which is “better” in the abstract. It’s which one matches how you actually work.

What auto memory does

Auto memory is deliberately simple. When Claude notices something worth remembering — a build command, a testing convention, a debugging pattern — it writes a note to a markdown file at ~/.claude/projects/<project>/memory/MEMORY.md. The first 200 lines of that file get injected into the system prompt at the start of every future session.

For anything that needs more detail, Claude creates topic files alongside it:

~/.claude/projects/<project>/memory/
├── MEMORY.md          # Index — loaded every session (200 lines max)
├── debugging.md       # Detailed notes, loaded on demand
├── api-conventions.md
└── ...

Topic files don’t load at startup. Claude reads them during a session when it decides it needs the information — the same way it reads any other file.
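
Here’s a minimal sketch of that startup behaviour in TypeScript. The buildMemoryContext helper is hypothetical (the real logic is internal to Claude Code), but it captures the shape of it: inject the first 200 lines of the index, leave everything else on disk.

// Hypothetical reconstruction of the behaviour described above, not
// Claude Code's actual implementation.
import { existsSync, readFileSync } from "node:fs";
import { join } from "node:path";

const MEMORY_LINE_CAP = 200;

function buildMemoryContext(projectMemoryDir: string): string {
  const indexPath = join(projectMemoryDir, "MEMORY.md");
  if (!existsSync(indexPath)) return "";

  // Only the first 200 lines of MEMORY.md reach the system prompt; topic
  // files stay on disk until Claude opens them with its file tools.
  const lines = readFileSync(indexPath, "utf8").split("\n");
  return lines.slice(0, MEMORY_LINE_CAP).join("\n");
}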

The system is bidirectional. Claude writes notes automatically, and you can tell it what to remember: “remember that we use pnpm, not npm” or “save to memory that the staging API requires a VPN.” You can also edit the files directly or use /memory to open them in your editor.

Configuration is a single environment variable:

export CLAUDE_CODE_DISABLE_AUTO_MEMORY=1  # Force off
export CLAUDE_CODE_DISABLE_AUTO_MEMORY=0  # Force on

That’s the entire system. No databases, no background services, no dependencies. Markdown files that Claude reads and writes with its standard file tools.

What claude-mem does

claude-mem takes a fundamentally different approach. Rather than having Claude decide what seems worth remembering, it captures everything and compresses it after the fact.

Five lifecycle hooks intercept session events in real time: SessionStart, UserPromptSubmit, PostToolUse, Stop, and SessionEnd. Every tool call, every user prompt, every session boundary gets captured. A separate AI call (via the Claude Agent SDK) then analyses the raw data and generates structured “observations” — each classified as a decision, bugfix, feature, refactor, discovery, or change.
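
Claude Code hooks are commands that receive a JSON event on stdin, so the capture side can be as simple as appending each event to a log for a later compression pass. The sketch below is illustrative only; the field names are assumptions, not claude-mem’s actual schema.

// Illustrative PostToolUse hook handler, not claude-mem's implementation.
// Claude Code pipes the event to the hook as JSON on stdin; this sketch
// just records it for later compression. Field names are assumptions.
import { appendFileSync } from "node:fs";

let raw = "";
process.stdin.setEncoding("utf8");
process.stdin.on("data", (chunk) => (raw += chunk));
process.stdin.on("end", () => {
  const event = JSON.parse(raw);
  const record = {
    capturedAt: new Date().toISOString(),
    hook: "PostToolUse",
    toolName: event.tool_name,
    toolInput: event.tool_input,
  };
  appendFileSync("raw-events.jsonl", JSON.stringify(record) + "\n");
});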

These observations are stored in a SQLite database with full-text search (FTS5) and optionally in a ChromaDB vector database for semantic search. The storage lives at ~/.claude-mem/ and includes:

  • Typed observations with titles, subtitles, facts, concepts, file lists, and narrative descriptions
  • Session summaries covering what was requested, investigated, learned, completed, and what comes next
  • Raw transcript events for the full conversation history
  • ROI metrics tracking how many tokens each observation cost to discover
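
claude-mem’s exact schema isn’t documented here, but a hypothetical sketch shows roughly what typed observations plus FTS5 look like in practice:

// Hypothetical observation store, not claude-mem's real schema: one table
// for structured fields plus an FTS5 index for keyword search.
import Database from "better-sqlite3";

const db = new Database("observations.db");

db.exec(`
  CREATE TABLE IF NOT EXISTS observations (
    id          INTEGER PRIMARY KEY,
    session_id  TEXT NOT NULL,
    kind        TEXT CHECK (kind IN
      ('decision','bugfix','feature','refactor','discovery','change')),
    title       TEXT NOT NULL,
    subtitle    TEXT,
    files       TEXT,   -- JSON array of files touched
    narrative   TEXT,   -- compressed description of what happened
    created_at  TEXT DEFAULT (datetime('now'))
  );

  -- External-content FTS5 index sharing the observations rowid, kept in
  -- sync on insert (triggers omitted here for brevity).
  CREATE VIRTUAL TABLE IF NOT EXISTS observations_fts USING fts5(
    title, subtitle, narrative,
    content='observations', content_rowid='id'
  );
`);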

At the start of a new session, the SessionStart hook retrieves relevant observations and injects them into Claude’s context. A “progressive disclosure” system keeps this efficient: the initial injection is a lightweight index (~50–100 tokens per result), and Claude can drill into timelines and full details on demand. The project claims ~10x token savings over loading everything upfront.
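
Continuing the hypothetical schema above, progressive disclosure is essentially a two-step read: a compact index injected at SessionStart, and full records fetched only when Claude asks for one.

import Database from "better-sqlite3";

const db = new Database("observations.db");

// Step 1: a lightweight index, roughly one line per observation, injected
// at session start.
function buildIndex(limit = 20): string {
  const rows = db
    .prepare(
      "SELECT id, kind, title FROM observations ORDER BY created_at DESC LIMIT ?"
    )
    .all(limit) as { id: number; kind: string; title: string }[];
  return rows.map((r) => `[#${r.id}] (${r.kind}) ${r.title}`).join("\n");
}

// Step 2: the full record, fetched on demand when Claude drills in.
function getObservation(id: number) {
  return db.prepare("SELECT * FROM observations WHERE id = ?").get(id);
}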

There’s also a web viewer at localhost:37777 with real-time streaming, search, and filtering — useful for browsing what Claude has learned without opening the database directly.

Where they actually differ

The architectural gap between these two systems is wide, but what matters is how that gap translates into practical differences.

What gets captured. Auto memory records what Claude thinks is important — patterns, preferences, insights. It’s selective by design. claude-mem captures every tool call and compresses the full session history. The difference matters most on long-running projects: auto memory might note “the auth module uses JWT with refresh tokens,” while claude-mem would preserve the entire debugging session where you discovered a token expiry edge case, including which files you examined and what you tried.

Search. Auto memory has none. Claude reads MEMORY.md linearly and decides whether to open topic files. claude-mem offers full-text keyword search (SQLite FTS5) and semantic vector search (ChromaDB), exposed as five MCP tools that Claude can call programmatically. For a project with weeks of accumulated context, the difference between “read the index and hope” and “search for that authentication bug from last Tuesday” is significant.
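
To make that concrete: a keyword lookup over the hypothetical FTS5 table sketched earlier looks roughly like the query below. claude-mem exposes the equivalent capability to Claude as MCP tools rather than raw SQL.

import Database from "better-sqlite3";

const db = new Database("observations.db");

// Hypothetical keyword search over stored observations; bm25() ranks hits
// by relevance and the join maps them back to the structured records.
function searchObservations(query: string, limit = 10) {
  return db
    .prepare(
      `SELECT o.id, o.kind, o.title, o.created_at
       FROM observations_fts
       JOIN observations o ON o.id = observations_fts.rowid
       WHERE observations_fts MATCH ?
       ORDER BY bm25(observations_fts)
       LIMIT ?`
    )
    .all(query, limit);
}

// e.g. searchObservations("authentication token expiry")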

Token efficiency. Auto memory loads the first 200 lines of MEMORY.md into every session, regardless of relevance. claude-mem’s progressive disclosure loads a lightweight index first and expands on demand. In theory, claude-mem is more efficient at scale. In practice, 200 lines of well-maintained markdown is often enough, and the overhead of claude-mem’s compression pipeline (which itself makes API calls) adds cost that doesn’t show up in the token count.
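
As rough, illustrative arithmetic: if a markdown line averages around ten tokens, a full 200-line MEMORY.md costs on the order of 2,000 tokens at every session start, relevant or not. A claude-mem index of twenty results at 50–100 tokens each lands between 1,000 and 2,000 tokens, with full details pulled only when asked for. The gap only starts to matter once a project’s history outgrows what a 200-line index can summarise.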

Setup and dependencies. Auto memory requires nothing — it ships with Claude Code. claude-mem requires Bun, Node.js 18+, uv (a Python package manager for ChromaDB), and runs a background worker service. The install is a plugin marketplace command, but the dependency chain is real.

Ongoing cost. Auto memory is free. claude-mem’s compression pipeline uses the Claude Agent SDK, which means every session incurs additional API calls beyond your own usage. The project doesn’t publish cost estimates, and the amount varies with session length and activity.

Team sharing. Auto memory files are local and per-user, but CLAUDE.md files (the manual complement to auto memory) can be committed to version control. claude-mem’s database is strictly local — there’s no mechanism for sharing observations across a team.

The stability question

claude-mem is iterating fast. Between February 11 and 14, seven releases shipped (v10.0.1 through v10.0.7), with one release being a full revert of the previous one. The open issue tracker tells a story of a project pushing hard on features while stability chases behind:

  • Issue #1090: The worker spawns hundreds of leaked Claude CLI processes, consuming API tokens uncontrollably
  • Issue #1089: Worker daemon subprocesses never terminate, causing memory leaks
  • An earlier version triggered a “chroma-mcp spawn storm” — 641 Python processes in five minutes, consuming 75%+ CPU and ~64GB of virtual memory

These are the kinds of bugs that erode trust in a tool that’s supposed to run silently in the background. The fix-revert-fix cycle on the ChromaDB integration suggests that managing multiple database backends, background workers, and lifecycle hooks is genuinely difficult to get right.

There’s also the $CMEM Solana token promoted at the top of the README, with links to DEXScreener and Jupiter. It doesn’t affect the code, but it’s worth noting that the project has a crypto monetisation angle alongside the AGPL-3.0 license.

Auto memory, by contrast, is a first-party feature maintained by the Claude Code team. It’s less ambitious but ships inside a product with a dedicated QA process and a vested interest in not breaking your workflow.

What the community thinks

Discussions on Hacker News and GitHub reveal three distinct camps.

The minimalists argue that a well-crafted CLAUDE.md file handles 80–90% of the memory problem with zero dependencies. As one HN commenter put it: “why do I need an API key for what can be local markdown files?” Several developers described elaborate setups using nothing but version-controlled markdown — handoff documents, feature implementation plans, engineering context files — that give Claude everything it needs without any automated memory system at all.

The pragmatists use auto memory but want it to be better. GitHub issue #23750 captures the frustration: auto memory creates files users didn’t know existed, the UI doesn’t clarify which file is being accessed, and there’s no granular way to disable it without also disabling CLAUDE.md loading. Issue #24044 reports that MEMORY.md gets loaded twice — by both the auto memory loader and the CLAUDE.md loader — wasting ~3KB per API call.

The power users report real productivity gains from tools like claude-mem. One developer measured task completion times dropping from 10–11 minutes to 1–2 minutes with context injection. The argument is that for multi-week projects with complex state, the ability to semantically search past sessions is transformative. These developers tend to work on projects where Claude is making hundreds of decisions per week, and the cost of re-explaining context outweighs the cost of running a memory plugin.

The split roughly maps to project complexity. Solo developers working on shorter-lived tasks lean toward simplicity. Teams running sustained development across many sessions lean toward structure.

When to use what

Start with auto memory and CLAUDE.md. For most projects, this is enough. Write your project conventions, build commands, and architectural decisions in CLAUDE.md. Let auto memory handle the session-to-session continuity of patterns and preferences. Edit the memory files when they drift. This costs nothing, adds no dependencies, and works out of the box.

Consider claude-mem when sessions blur together. If you’re working on the same project for weeks, making hundreds of decisions Claude keeps forgetting, and you find yourself re-explaining context that should have persisted — that’s when structured memory with search starts paying for itself. The setup cost and stability risks are real, but so is the cost of repeating yourself for the fifth time.

Consider neither if you prefer full control. Several experienced developers skip automated memory entirely in favour of version-controlled documentation: handoff files, architecture decision records, feature specs. Claude reads these like any other file. The upfront effort is higher, but you know exactly what’s in context and there are no surprises.

The built-in system will likely improve — Anthropic has clear incentive to make auto memory good enough that plugins aren’t necessary. But today, there’s a genuine gap between what auto memory offers and what power users need. Whether that gap matters depends entirely on how you work.

Written by

Daniel Dewhurst

Lead AI Solutions Engineer building with AI, Laravel, TypeScript, and the craft of software.