How "compiled knowledge" beats "retrieval from scratch" — and how to apply it to Claude Code, Bob, and every AI agent you use
RESEARCH REPORT — APRIL 2026

In early 2026, Andrej Karpathy published a gist describing a deceptively simple pattern: instead of uploading documents and asking an LLM to retrieve chunks at query time (traditional RAG), you have the LLM build and maintain a persistent wiki — a structured, interlinked markdown knowledge base that compounds over time.
TRADITIONAL RAG                  KARPATHY'S LLM WIKI
===============                  ===================
User uploads docs                User drops source into raw/
        |                                 |
Chunks stored in vector DB       LLM reads, extracts, synthesizes
        |                                 |
Query → Retrieve → Generate      Wiki pages created/updated
        |                        Cross-references maintained
Rediscover every time            Contradictions flagged
Nothing compounds                Index + log updated
                                          |
                                 Knowledge COMPOUNDS forever
Karpathy's key insight: "The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read."
Traditional RAG is amnesiac. Every query starts from scratch: retrieve, re-rank, re-synthesize, and nothing carries forward. The LLM Wiki compiles knowledge once, then keeps it current. When you ask a question, the answer is already half-synthesized in the wiki; the LLM just reads the relevant pages and connects the dots.
wiki/
├── SCHEMA.md        # Conventions, rules, tag taxonomy
├── index.md         # Content catalog (one-line summaries)
├── log.md           # Chronological action log (append-only)
├── raw/             # LAYER 1: Immutable source material
│   ├── articles/
│   ├── papers/
│   └── transcripts/
├── entities/        # LAYER 2: Wiki pages (LLM-owned)
│   ├── anthropic.md
│   └── karpathy.md
├── concepts/
│   ├── transformer-arch.md
│   └── rag-vs-wiki.md
├── comparisons/
└── queries/
raw/ (Layer 1). Immutable: the LLM reads it but never modifies it. Your source of truth. Includes SHA256 hashes so re-ingests can detect content drift.
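A minimal sketch of that drift check, assuming hashes are stored in a sidecar `.sha256` file next to each source (the helper names are mine, not from the gist):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a raw/ source file so re-ingests can detect content drift."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def check_drift(source: Path) -> bool:
    """Return True if the file changed since its hash was first recorded."""
    sidecar = source.parent / (source.name + ".sha256")
    if not sidecar.exists():
        sidecar.write_text(sha256_of(source))  # first ingest: record the hash
        return False
    return sidecar.read_text().strip() != sha256_of(source)
```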
entities/, concepts/, comparisons/ (Layer 2). LLM-owned markdown files: entity pages, concept pages, comparisons. Every page has YAML frontmatter, wikilinks, and provenance markers.
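A sketch of what one such page might look like (the field names and provenance format are illustrative, not prescribed by Karpathy's gist):

```markdown
---
title: Anthropic
type: entity
tags: [ai-lab, llm-vendor]
updated: 2026-04-02
sources: [raw/articles/karpathy-llm-wiki.md]
---

AI lab behind Claude. See [[karpathy]] for the origin of the
LLM Wiki pattern and [[rag-vs-wiki]] for the comparison.

<!-- provenance: ingested 2026-04-02 from raw/articles/karpathy-llm-wiki.md -->
```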
SCHEMA.md. The configuration file (like CLAUDE.md or AGENTS.md) that tells the LLM how to maintain the wiki. Defines conventions, page thresholds, tag taxonomy, update policies. This is what makes the LLM a disciplined wiki maintainer instead of a generic chatbot.
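A SCHEMA.md for this layout might look something like the following; the specific thresholds and tags are made up for illustration:

```markdown
# Wiki Schema

## Conventions
- One page per entity or concept; filenames are kebab-case.
- Every page starts with YAML frontmatter: title, type, tags, updated.
- Link related pages with [[wikilinks]]; never duplicate content.

## Page thresholds
- Create a new page once a topic is referenced by 2+ sources.
- Split a page when it exceeds ~300 lines.

## Tag taxonomy
- type: entity | concept | comparison | query

## Update policy
- Append every action to log.md; keep index.md summaries to one line.
- Flag contradictions inline with `> CONFLICT:` blocks.
```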
Ingest. You drop a source (URL, PDF, paste). The LLM reads it, extracts key info, creates/updates wiki pages, adds cross-references, flags contradictions, updates the index and log. A single source might touch 10-15 wiki pages. The human stays in the loop — reads summaries, guides emphasis.
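The LLM does the reading and synthesis, but the bookkeeping half of ingest is mechanical. A sketch of the index/log update step, using the file layout above (function and parameter names are mine):

```python
from datetime import date
from pathlib import Path

def record_ingest(wiki: Path, source: str, pages: list[str], summary: str) -> None:
    """Append the ingest action to log.md and catalog the page in index.md."""
    entry = f"- {date.today()}: ingested {source} -> touched {', '.join(pages)}\n"
    with (wiki / "log.md").open("a") as f:   # log.md is append-only
        f.write(entry)

    index = wiki / "index.md"
    line = f"- [[{pages[0]}]]: {summary}\n"  # one-line summary per page
    existing = index.read_text() if index.exists() else ""
    if line not in existing:                 # keep the catalog deduplicated
        index.write_text(existing + line)
```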
Query. The LLM reads the index, finds relevant pages, synthesizes an answer with citations. Key insight: good answers get filed back as new wiki pages. A comparison you asked for, a connection you discovered — these compound into the knowledge base instead of disappearing into chat history.
Lint. Periodic health checks: orphan pages, broken wikilinks, stale content, contradictions between pages, missing cross-references, tag audit. The wiki stays healthy because the LLM does the maintenance that no human wants to do.
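Two of those checks are pure text processing and easy to sketch. A toy linter for broken wikilinks and orphan pages (the other checks would slot in alongside; names are mine):

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki: Path) -> dict[str, list[str]]:
    """Flag broken wikilinks (target page missing) and orphan pages
    (no other page links to them)."""
    pages = {p.stem: p for p in wiki.rglob("*.md")
             if "raw" not in p.parts}          # raw/ is source material, not wiki
    linked: set[str] = set()
    broken: list[str] = []
    for name, path in pages.items():
        for target in WIKILINK.findall(path.read_text()):
            target = target.strip()
            linked.add(target)
            if target not in pages:
                broken.append(f"{name} -> {target}")
    orphans = [n for n in pages if n not in linked
               and n not in ("index", "log", "SCHEMA")]
    return {"broken_links": broken, "orphans": orphans}
```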
| Agent | Memory Mechanism | Type | Cross-Session | Compounds? |
|---|---|---|---|---|
| Claude Code | CLAUDE.md files + auto memory | Instructions + learned notes | ✅ Yes | ⚠️ Partially |
| Cursor | .cursorrules + project context | Rules file | ✅ Yes | ❌ No |
| GitHub Copilot | .github/copilot-instructions.md | Instructions file | ✅ Yes | ❌ No |
| Aider | .aider.conf + convention files | Config + auto-discovered | ✅ Yes | ❌ No |
| Codex CLI | AGENTS.md | Instructions file | ✅ Yes | ❌ No |
| Hermes (Bob) | MEMORY.md + skills + session search | Structured memory + skills | ✅ Yes | ⚠️ Partially |
Here are three concrete ways to integrate Karpathy's approach with Claude Code, from simplest to most ambitious:
Add a knowledge/ directory alongside your CLAUDE.md. Instruct Claude via CLAUDE.md to maintain it as a wiki. When you research something, ask Claude to file the findings. When Claude discovers something useful (a tricky API, a library quirk), it files that too.
my-project/
├── CLAUDE.md        # Instructions
├── knowledge/       # ← The wiki
│   ├── SCHEMA.md
│   ├── index.md
│   ├── api-quirks/
│   ├── architecture/
│   └── debugging/
├── src/
└── tests/
A shared wiki/ repo that the whole team's Claude Code instances contribute to. Add it as a git submodule or symlink. The schema defines who can write what. Claude instances across the team ingest sources, update entity pages, and maintain cross-references. Git handles version history and conflict resolution.
team-wiki/
├── SCHEMA.md
├── index.md
├── raw/
├── entities/
├── concepts/
└── .git/

# In each dev's CLAUDE.md:
# "Also read and maintain the wiki at ../team-wiki/"
Build or use an MCP server that exposes wiki operations as native tools. Claude Code calls wiki_ingest, wiki_query, wiki_lint as tool calls — no shell commands, no file path gymnastics. The MCP server handles indexing, search, and consistency checks internally.
# Claude Code settings
{
"mcpServers": {
"wiki": {
"command": "npx",
"args": ["-y", "llm-wiki-mcp-server", "~/wiki"]
}
}
}
# Now Claude can natively:
# - wiki_ingest(url) -- process and integrate a source
# - wiki_query(question) -- search and synthesize
# - wiki_lint() -- health check the wiki
Bob already has several memory primitives that map surprisingly well to the LLM Wiki pattern:
| Wiki Concept | Bob Already Has | What's Missing |
|---|---|---|
| Raw sources | ✅ session_search, web tools | No immutable raw/ archive |
| Wiki pages | ⚠️ MEMORY.md (monolithic) | Structured entity/concept pages |
| Schema | ✅ AGENTS.md + SOUL.md | No tag taxonomy or page thresholds |
| Index | ❌ None | No searchable content catalog |
| Log | ⚠️ Cron logs exist | No chronological wiki action log |
| Cross-references | ❌ None | No wikilinks between memory entries |
| Lint | ❌ None | No health checks on knowledge |
~/.hermes/
├── MEMORY.md              # Keep: lightweight facts (current)
├── wiki/                  # NEW: compounding knowledge base
│   ├── SCHEMA.md          # Domain config, tag taxonomy
│   ├── index.md           # Content catalog
│   ├── log.md             # Action log
│   ├── raw/
│   │   ├── articles/      # Web research results
│   │   └── transcripts/   # Chat transcripts worth keeping
│   ├── entities/          # People, services, tools
│   │   ├── thota.md
│   │   ├── cloudflare.md
│   │   └── hermes-agent.md
│   ├── concepts/          # Technical concepts
│   │   ├── rag-patterns.md
│   │   ├── mcp-protocol.md
│   │   └── cron-architecture.md
│   ├── projects/          # Project-specific knowledge
│   │   ├── vps-setup.md
│   │   └── telegram-bot.md
│   └── skills-learned/    # Things Bob figured out
│       ├── cloudflare-deploy.md
│       └── heredoc-tricks.md
A Node.js CLI that compiles raw sources into an interlinked markdown wiki. Supports Anthropic, OpenAI, and Ollama backends. Obsidian-compatible out of the box. Inspired by Karpathy's pattern.
npm install -g llm-wiki-compiler
export ANTHROPIC_API_KEY=***
llmwiki ingest https://some-article.com
llmwiki compile
llmwiki query "what is X?"
Local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking. Has both a CLI (for agent shell commands) and an MCP server (for native tool integration). Perfect for scaling the wiki beyond the index file.
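qmd's internals aren't shown here, but the BM25 half of hybrid search is simple enough to sketch. A toy scorer over a set of markdown docs — an illustration of the ranking idea, not qmd's actual implementation:

```python
import math
import re
from collections import Counter

def bm25_scores(query: str, docs: dict[str, str],
                k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Score each doc against the query with plain BM25.
    A hybrid engine would blend this with vector similarity."""
    tok = lambda s: re.findall(r"\w+", s.lower())
    tfs = {name: Counter(tok(text)) for name, text in docs.items()}
    avgdl = sum(sum(tf.values()) for tf in tfs.values()) / len(tfs)
    N = len(docs)
    scores = {}
    for name, tf in tfs.items():
        dl = sum(tf.values())
        s = 0.0
        for term in tok(query):
            df = sum(1 for t in tfs.values() if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
            f = tf[term]
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avgdl))
        scores[name] = s
    return scores
```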
Anthropic's reference MCP server for knowledge-graph-based persistent memory. Stores entities, relations, and observations as a graph. Good for structured knowledge but lacks the wiki's narrative synthesis.
npx -y @modelcontextprotocol/server-memory
The wiki directory works as an Obsidian vault out of the box. Graph View visualizes connections. Dataview plugin enables queries over frontmatter. Web Clipper captures articles to raw/. For headless servers, use obsidian-headless for Obsidian Sync without a GUI.
Bob already has the llm-wiki skill — a full implementation of Karpathy's pattern with schema, index, log, three-layer architecture, ingest/query/lint operations, provenance markers, and frontmatter validation. It's ready to use right now.
| Feature | Traditional RAG | CLAUDE.md / Rules | MCP Memory (Graph) | Karpathy's LLM Wiki |
|---|---|---|---|---|
| Knowledge persists | ⚠️ In vector DB | ✅ As instructions | ✅ As entities | ✅ As wiki pages |
| Compounds over time | ❌ No | ❌ No | ⚠️ Additive only | ✅ Synthesis + cross-refs |
| Handles contradictions | ❌ No | ❌ No | ❌ No | ✅ Flags + tracks |
| Human-readable | ❌ Chunks in DB | ✅ Markdown | ⚠️ JSON graph | ✅ Full wiki |
| Works in Obsidian | ❌ No | ⚠️ Single file | ❌ No | ✅ Full vault |
| Zero infrastructure | ❌ Vector DB needed | ✅ Just files | ✅ Just MCP server | ✅ Just files |
| Search quality | ✅ Embeddings | ❌ Context window | ⚠️ Graph traversal | ⚠️ Index + grep (or qmd) |
| Scales to 1000+ docs | ✅ Yes | ❌ Context limited | ⚠️ Moderate | ⚠️ Needs search tool |
| Cross-source synthesis | ❌ Per-query | ❌ Not designed for it | ⚠️ Manual | ✅ Core feature |
| Agent maintenance burden | ✅ None | ✅ Minimal | ⚠️ Moderate | ⚠️ Significant (but automated) |
Create a wiki/ directory. Add a SCHEMA.md with your domain and conventions. Add an index.md and log.md. Drop your first source into raw/ and ask your LLM agent to ingest it. That's it — you're running the pattern.
mkdir -p ~/wiki/{raw/{articles,papers},entities,concepts,comparisons,queries}
# Write SCHEMA.md with your domain
# Drop a source: cp article.md ~/wiki/raw/articles/
# Ask your agent: "Ingest ~/wiki/raw/articles/article.md into the wiki"
When the index file isn't enough (50+ pages), add qmd for hybrid search. Or just use grep — it works surprisingly well on structured markdown with consistent frontmatter.
# Simple search that works today
grep -r "transformer" ~/wiki/entities/ ~/wiki/concepts/

# Or install qmd for proper search
npm install -g @anthropic/qmd
qmd search "transformer architecture" ~/wiki/
Point your agent's CLAUDE.md (or AGENTS.md) at the wiki. Define ingest, query, and lint workflows. The agent reads the schema at session start and follows the conventions. Knowledge compounds from day one.
# In CLAUDE.md or AGENTS.md:

## Knowledge Wiki

Maintain a knowledge wiki at ~/wiki/.
- Read SCHEMA.md and index.md at session start
- When researching, file findings as wiki pages
- When learning something useful, add to concepts/
- Run lint weekly to check for stale content
Karpathy's pattern is spiritually connected to Vannevar Bush's Memex — a 1945 vision of a personal, curated knowledge store with associative trails between documents. Bush imagined something private, actively curated, with connections between documents as valuable as the documents themselves.
The part Bush couldn't solve: who does the maintenance? Humans abandon wikis because maintenance burden grows faster than value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.
Generated by Bob (Hermes Agent) — April 2026
Sources: Karpathy's LLM Wiki Gist • Claude Code Memory Docs • MCP Servers • llm-wiki-compiler • qmd