Karpathy's Obsidian RAG

How "compiled knowledge" beats "retrieval from scratch" — and how to apply it to Claude Code, Bob, and every AI agent you use

RESEARCH REPORT — APRIL 2026

3 Memory Paradigms Compared · 6 Implementation Paths · 12+ Tools Evaluated

1. The Core Idea — Compiled Knowledge, Not Retrieval

In early 2026, Andrej Karpathy published a gist describing a deceptively simple pattern: instead of uploading documents and asking an LLM to retrieve chunks at query time (traditional RAG), you have the LLM build and maintain a persistent wiki — a structured, interlinked markdown knowledge base that compounds over time.

The Fundamental Shift

TRADITIONAL RAG                    KARPATHY'S LLM WIKI
============                       ================

User uploads docs                  User drops source into raw/
        |                                  |
   Chunks stored                    LLM reads, extracts, synthesizes
   in vector DB                             |
        |                           Wiki pages created/updated
Query → Retrieve → Generate         Cross-references maintained
        |                           Contradictions flagged
  Rediscover every time             Index + log updated
  Nothing compounds                         |
                                    Knowledge COMPOUNDS forever

Karpathy's key insight: "The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read."

Why This Matters for AI Agents

Traditional RAG is amnesiac. Every query starts from zero — re-chunk, re-embed, re-retrieve, re-synthesize. The LLM Wiki compiles knowledge once, then keeps it current. When you ask a question, the answer is already half-synthesized in the wiki. The LLM just reads the relevant pages and connects the dots.

Compounding · Cross-referenced · Human-curated · Zero infra

2. The Three-Layer Architecture

wiki/
├── SCHEMA.md            # Conventions, rules, tag taxonomy
├── index.md             # Content catalog (one-line summaries)
├── log.md               # Chronological action log (append-only)
├── raw/                 # LAYER 1: Immutable source material
│   ├── articles/
│   ├── papers/
│   └── transcripts/
├── entities/            # LAYER 2: Wiki pages (LLM-owned)
│   ├── anthropic.md
│   └── karpathy.md
├── concepts/
│   ├── transformer-arch.md
│   └── rag-vs-wiki.md
├── comparisons/
└── queries/

Layer 1: Raw Sources

Immutable. The LLM reads but never modifies. Your source of truth. Includes SHA256 hashes so re-ingests can detect content drift.
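The drift check needs no special tooling. Here is a minimal sketch using standard utilities; the manifest name (hashes.txt) and layout are assumptions for illustration, not something the gist prescribes (on macOS, substitute shasum -a 256 for sha256sum):

```shell
set -e
wiki=$(mktemp -d)
mkdir -p "$wiki/raw/articles"
printf 'example source text\n' > "$wiki/raw/articles/example.md"

# On ingest: record a SHA256 for each raw source (relative paths
# so the manifest can be checked from the wiki root later)
(cd "$wiki" && sha256sum raw/articles/example.md > raw/hashes.txt)

# On re-ingest: --check exits non-zero if any hashed file changed
(cd "$wiki" && sha256sum --check --quiet raw/hashes.txt)
```

Because raw/ is immutable by convention, a failed check means the source was silently edited or re-fetched content drifted, and the wiki pages built from it may be stale.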

Immutable · Hashed

Layer 2: The Wiki

LLM-owned markdown files. Entity pages, concept pages, comparisons. Every page has YAML frontmatter, wikilinks, and provenance markers.
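A minimal entity page might look like the following. This is illustrative only: the exact frontmatter fields (type, updated, sources) are assumptions, and in practice your SCHEMA.md defines the real contract:

```markdown
---
title: Andrej Karpathy
type: entity
tags: [person, ai-research]
updated: 2026-04-01
sources: [raw/articles/llm-wiki-gist.md]
---

Researcher; published the LLM Wiki pattern in early 2026. [^1]

Related: [[rag-vs-wiki]], [[transformer-arch]]

[^1]: raw/articles/llm-wiki-gist.md
```

The wikilinks make the page navigable in Obsidian, and the sources field plus footnote give every claim a provenance trail back to Layer 1.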

LLM-maintained · Wikilinks

Layer 3: The Schema

The configuration file (like CLAUDE.md or AGENTS.md) that tells the LLM how to maintain the wiki. Defines conventions, page thresholds, tag taxonomy, update policies. This is what makes the LLM a disciplined wiki maintainer instead of a generic chatbot.

Critical · Co-evolved

3. The Three Operations

📥 Ingest
Read source → Extract → Update wiki
❓ Query
Search wiki → Read pages → Synthesize
🔍 Lint
Health check → Find gaps → Fix issues

📥 Ingest — Knowledge Accumulation

You drop a source (URL, PDF, paste). The LLM reads it, extracts key info, creates/updates wiki pages, adds cross-references, flags contradictions, updates the index and log. A single source might touch 10-15 wiki pages. The human stays in the loop — reads summaries, guides emphasis.
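The bookkeeping half of an ingest is plain file appends. A sketch, assuming the one-line index and append-only log conventions above (the exact line formats here are illustrative, not prescribed):

```shell
set -e
wiki=$(mktemp -d)
mkdir -p "$wiki/concepts"
: > "$wiki/index.md"
: > "$wiki/log.md"

# The agent writes or updates a page...
printf '# RAG vs Wiki\n' > "$wiki/concepts/rag-vs-wiki.md"

# ...then records it in the one-line catalog and the append-only log
printf -- '- concepts/rag-vs-wiki.md: retrieval vs compiled knowledge\n' >> "$wiki/index.md"
printf '%s ingest: created concepts/rag-vs-wiki.md\n' "$(date -u +%Y-%m-%d)" >> "$wiki/log.md"
```

Keeping the catalog and log as flat files is the point: the LLM can read index.md in one pass to decide which pages a new source should touch, and log.md gives the human an auditable history of every change.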

❓ Query — Knowledge Retrieval + Filing

The LLM reads the index, finds relevant pages, synthesizes an answer with citations. Key insight: good answers get filed back as new wiki pages. A comparison you asked for, a connection you discovered — these compound into the knowledge base instead of disappearing into chat history.

🔍 Lint — Knowledge Maintenance

Periodic health checks: orphan pages, broken wikilinks, stale content, contradictions between pages, missing cross-references, tag audit. The wiki stays healthy because the LLM does the maintenance that no human wants to do.
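Parts of a lint pass need no LLM at all. Here is a minimal broken-wikilink check, assuming the common convention that [[page-name]] should resolve to a page-name.md file somewhere in the wiki:

```shell
set -e
wiki=$(mktemp -d)
mkdir -p "$wiki/concepts" "$wiki/entities"
printf 'See [[karpathy]] and [[missing-page]].\n' > "$wiki/concepts/rag-vs-wiki.md"
printf 'Andrej Karpathy.\n' > "$wiki/entities/karpathy.md"

# Collect every [[target]], then report targets with no matching page
grep -rhoE '\[\[[^]]+\]\]' "$wiki" | tr -d '[]' | sort -u |
while read -r target; do
  find "$wiki" -name "$target.md" | grep -q . || echo "broken link: $target"
done
```

Cheap mechanical checks like this can run on a schedule, leaving the LLM to handle the judgment calls: contradictions, staleness, and missing cross-references.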

4. How AI Coding Agents Handle Memory Today

| Agent | Memory Mechanism | Type | Cross-Session | Compounds? |
|---|---|---|---|---|
| Claude Code | CLAUDE.md files + auto memory | Instructions + learned notes | ✅ Yes | ⚠️ Partially |
| Cursor | .cursorrules + project context | Rules file | ✅ Yes | ❌ No |
| GitHub Copilot | .github/copilot-instructions.md | Instructions file | ✅ Yes | ❌ No |
| Aider | .aider.conf + convention files | Config + auto-discovered | ✅ Yes | ❌ No |
| Codex CLI | AGENTS.md | Instructions file | ✅ Yes | ❌ No |
| Hermes (Bob) | MEMORY.md + skills + session search | Structured memory + skills | ✅ Yes | ⚠️ Partially |

⚠️ The Common Problem: All current systems are instruction-based, not knowledge-based. They tell the agent how to behave, not what it knows. None of them build a compounding knowledge artifact. Knowledge lives in chat history, gets compressed away, and must be rediscovered.

5. The Knowledge Gap — What's Missing

What CLAUDE.md Does Well

  • Persistent instructions across sessions
  • Project architecture context
  • Coding standards and conventions
  • Build/test commands
  • User preferences (auto memory)

What CLAUDE.md Can't Do

  • Accumulate research findings over weeks
  • Track relationships between concepts
  • Flag contradictions between sources
  • Synthesize across 20+ documents
  • Answer "what do I know about X?"
  • Build on past discoveries

💡 The Insight: CLAUDE.md is like a style guide. Karpathy's LLM Wiki is like a brain. You need both — instructions for how to behave, and knowledge for what you know. They're complementary, not competing.

6. Applying LLM Wiki to Claude Code

Here are three concrete ways to integrate Karpathy's approach with Claude Code, from simplest to most ambitious:

Path A: Project Knowledge Wiki (Simplest)

Add a knowledge/ directory alongside your CLAUDE.md. Instruct Claude via CLAUDE.md to maintain it as a wiki. When you research something, ask Claude to file the findings. When Claude discovers something useful (a tricky API, a library quirk), it files that too.

my-project/
├── CLAUDE.md          # Instructions
├── knowledge/         # ← The wiki
│   ├── SCHEMA.md
│   ├── index.md
│   ├── api-quirks/
│   ├── architecture/
│   └── debugging/
├── src/
└── tests/
Start today · Zero tooling · Manual orchestration

Path B: Team Wiki via Git (Medium)

A shared wiki/ repo that the whole team's Claude Code instances contribute to. Add it as a git submodule or symlink. The schema defines who can write what. Claude instances across the team ingest sources, update entity pages, and maintain cross-references. Git handles version history and conflict resolution.

team-wiki/
├── SCHEMA.md
├── index.md
├── raw/
├── entities/
├── concepts/
└── .git/

# In each dev's CLAUDE.md:
# "Also read and maintain the wiki at ../team-wiki/"
Team scale · Git versioned · Schema-governed

Path C: MCP-Powered Wiki (Full Power)

Build or use an MCP server that exposes wiki operations as native tools. Claude Code calls wiki_ingest, wiki_query, wiki_lint as tool calls — no shell commands, no file path gymnastics. The MCP server handles indexing, search, and consistency checks internally.

# Claude Code settings
{
  "mcpServers": {
    "wiki": {
      "command": "npx",
      "args": ["-y", "llm-wiki-mcp-server", "~/wiki"]
    }
  }
}

# Now Claude can natively:
# - wiki_ingest(url) -- process and integrate a source
# - wiki_query(question) -- search and synthesize
# - wiki_lint() -- health check the wiki
Full integration · Native tools · Build required

7. Applying LLM Wiki to Bob (Hermes Agent)

Bob already has several memory primitives that map surprisingly well to the LLM Wiki pattern:

| Wiki Concept | Bob Already Has | What's Missing |
|---|---|---|
| Raw sources | ✅ session_search, web tools | No immutable raw/ archive |
| Wiki pages | ⚠️ MEMORY.md (monolithic) | Structured entity/concept pages |
| Schema | ✅ AGENTS.md + SOUL.md | No tag taxonomy or page thresholds |
| Index | ❌ None | No searchable content catalog |
| Log | ⚠️ Cron logs exist | No chronological wiki action log |
| Cross-references | ❌ None | No wikilinks between memory entries |
| Lint | ❌ None | No health checks on knowledge |

Proposed: Bob's Knowledge Wiki

~/.hermes/
├── MEMORY.md              # Keep: lightweight facts (current)
├── wiki/                  # NEW: compounding knowledge base
│   ├── SCHEMA.md          # Domain config, tag taxonomy
│   ├── index.md           # Content catalog
│   ├── log.md             # Action log
│   ├── raw/
│   │   ├── articles/      # Web research results
│   │   └── transcripts/   # Chat transcripts worth keeping
│   ├── entities/          # People, services, tools
│   │   ├── thota.md
│   │   ├── cloudflare.md
│   │   └── hermes-agent.md
│   ├── concepts/          # Technical concepts
│   │   ├── rag-patterns.md
│   │   ├── mcp-protocol.md
│   │   └── cron-architecture.md
│   ├── projects/          # Project-specific knowledge
│   │   ├── vps-setup.md
│   │   └── telegram-bot.md
│   └── skills-learned/    # Things Bob figured out
│       ├── cloudflare-deploy.md
│       └── heredoc-tricks.md

💡 Implementation Strategy: Don't replace MEMORY.md — augment it. MEMORY.md stays as the fast, lightweight facts file (injected every turn). The wiki is the deep knowledge base that Bob reads on-demand when answering research questions or working on complex projects. The llm-wiki skill already implements this pattern for Bob.

8. Tools and Ecosystem

📝 llm-wiki-compiler (by atomicmemory)

A Node.js CLI that compiles raw sources into an interlinked markdown wiki. Supports Anthropic, OpenAI, and Ollama backends. Obsidian-compatible out of the box. Inspired by Karpathy's pattern.

npm install -g llm-wiki-compiler
export ANTHROPIC_API_KEY=***
llmwiki ingest https://some-article.com
llmwiki compile
llmwiki query "what is X?"
Ready to use · npm · Multi-provider

🔍 qmd (by tobi)

Local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking. Has both a CLI (for agent shell commands) and an MCP server (for native tool integration). Perfect for scaling the wiki beyond the index file.

Hybrid search · Local-first · MCP server

🧠 MCP Memory Server (official)

Anthropic's reference MCP server for knowledge-graph-based persistent memory. Stores entities, relations, and observations as a graph. Good for structured knowledge but lacks the wiki's narrative synthesis.

npx -y @modelcontextprotocol/server-memory
Official · Knowledge graph · Structured only

🔗 Obsidian Integration

The wiki directory works as an Obsidian vault out of the box. Graph View visualizes connections. Dataview plugin enables queries over frontmatter. Web Clipper captures articles to raw/. For headless servers, use obsidian-headless for Obsidian Sync without a GUI.

Visualization · Native wikilinks · Graph view

📚 Hermes llm-wiki Skill (Already Built)

Bob already has the llm-wiki skill — a full implementation of Karpathy's pattern with schema, index, log, three-layer architecture, ingest/query/lint operations, provenance markers, and frontmatter validation. It's ready to use right now.

Ready · Already in Bob · Full implementation

9. Memory Approaches — Head to Head

| Feature | Traditional RAG | CLAUDE.md / Rules | MCP Memory (Graph) | Karpathy's LLM Wiki |
|---|---|---|---|---|
| Knowledge persists | ⚠️ In vector DB | ✅ As instructions | ✅ As entities | ✅ As wiki pages |
| Compounds over time | ❌ No | ❌ No | ⚠️ Additive only | ✅ Synthesis + cross-refs |
| Handles contradictions | ❌ No | ❌ No | ❌ No | ✅ Flags + tracks |
| Human-readable | ❌ Chunks in DB | ✅ Markdown | ⚠️ JSON graph | ✅ Full wiki |
| Works in Obsidian | ❌ No | ⚠️ Single file | ❌ No | ✅ Full vault |
| Zero infrastructure | ❌ Vector DB needed | ✅ Just files | ✅ Just MCP server | ✅ Just files |
| Search quality | ✅ Embeddings | ❌ Context window | ⚠️ Graph traversal | ⚠️ Index + grep (or qmd) |
| Scales to 1000+ docs | ✅ Yes | ❌ Context limited | ⚠️ Moderate | ⚠️ Needs search tool |
| Cross-source synthesis | ❌ Per-query | ❌ Not designed for it | ⚠️ Manual | ✅ Core feature |
| Agent maintenance burden | ✅ None | ✅ Minimal | ⚠️ Moderate | ⚠️ Significant (but automated) |

💡 The Verdict: No single approach wins everywhere. The sweet spot is combining them: CLAUDE.md for instructions, LLM Wiki for knowledge, MCP Memory for structured entity relationships, and traditional RAG only when you need to search across 1000+ raw documents the wiki hasn't ingested yet.

10. Getting Started — Three Practical Steps

Step 1: Start a Wiki (10 minutes)

Create a wiki/ directory. Add a SCHEMA.md with your domain and conventions. Add an index.md and log.md. Drop your first source into raw/ and ask your LLM agent to ingest it. That's it — you're running the pattern.

mkdir -p ~/wiki/{raw/{articles,papers},entities,concepts,comparisons,queries}
# Write SCHEMA.md with your domain
# Drop a source: cp article.md ~/wiki/raw/articles/
# Ask your agent: "Ingest ~/wiki/raw/articles/article.md into the wiki"
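
A starter SCHEMA.md can be a dozen lines. An illustrative sketch (the thresholds and tag list here are placeholders to adapt to your domain):

```markdown
# SCHEMA.md: wiki conventions

Domain: AI agent memory research.

Rules:
- Pages live in entities/, concepts/, comparisons/, queries/.
- Every page has YAML frontmatter: title, tags, updated, sources.
- Link related pages with [[wikilinks]]; cite raw/ files for provenance.
- Promote a topic to its own page once 3+ pages reference it.
- Allowed tags: person, org, tool, concept, project.
- Never edit anything under raw/; flag contradictions on the page itself.
```

The schema co-evolves with the wiki: when the agent keeps making the same filing mistake, the fix is usually one more rule here, not a bigger prompt.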

Step 2: Add Search (when you need it)

When the index file isn't enough (50+ pages), add qmd for hybrid search. Or just use grep — it works surprisingly well on structured markdown with consistent frontmatter.

# Simple search that works today
grep -r "transformer" ~/wiki/entities/ ~/wiki/concepts/

# Or install qmd for proper hybrid search
# (tobi's project; see its README for the current install command)
qmd search "transformer architecture" ~/wiki/

Step 3: Connect to Your Agent

Point your agent's CLAUDE.md (or AGENTS.md) at the wiki. Define ingest, query, and lint workflows. The agent reads the schema at session start and follows the conventions. Knowledge compounds from day one.

# In CLAUDE.md or AGENTS.md:
## Knowledge Wiki
Maintain a knowledge wiki at ~/wiki/.
- Read SCHEMA.md and index.md at session start
- When researching, file findings as wiki pages
- When learning something useful, add to concepts/
- Run lint weekly to check for stale content

11. The Memex Connection

🏛️ Vannevar Bush's Memex (1945)

Karpathy's pattern is spiritually connected to Vannevar Bush's Memex — a 1945 vision of a personal, curated knowledge store with associative trails between documents. Bush imagined something private, actively curated, with connections between documents as valuable as the documents themselves.

The part Bush couldn't solve: who does the maintenance? Humans abandon wikis because maintenance burden grows faster than value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

Historical context · Memex → Wiki → LLM Wiki

Generated by Bob (Hermes Agent) — April 2026

Sources: Karpathy's LLM Wiki Gist · Claude Code Memory Docs · MCP Servers · llm-wiki-compiler · qmd