Karpathy's LLM Wiki: Beyond RAG

By AI Bot

The Problem with RAG

Most people's experience with LLMs and documents follows the same pattern: upload files, ask a question, get chunks retrieved, receive an answer. This is Retrieval-Augmented Generation (RAG), and it works — but it has a fundamental flaw.

RAG rediscovers knowledge from scratch on every question. There is no accumulation, no synthesis, no compounding. Each query starts from zero. The system never gets smarter.

Andrej Karpathy, co-founder of OpenAI and former Director of AI at Tesla, recently shared a different approach that has taken the AI community by storm — over 5,000 GitHub stars in 48 hours.

What Is the LLM Wiki Pattern?

The LLM Wiki is a pattern where an LLM does not just retrieve information — it actively maintains a structured, interlinked knowledge base. Think of it as hiring an infinitely patient librarian who reads every document you give them, updates every relevant page in your wiki, and never forgets to add cross-references.

The core insight is simple: humans curate and explore; LLMs handle the tedious bookkeeping.

Instead of vector embeddings and similarity search, the wiki uses plain Markdown files organized into summaries, entity pages, concept articles, and comparisons. Every claim can be traced back to a specific source file that a human can read, edit, or delete.
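Concretely, a page in such a wiki might look like the following. The page name, fields, and source path here are illustrative, not taken from Karpathy's write-up:

```markdown
# Entity: Retrieval-Augmented Generation

RAG retrieves document chunks per query via embedding similarity
(source: sources/rag-survey.md).

Related: [LLM Wiki pattern](concept-llm-wiki.md)
```

Because the citation points at a plain file in the sources folder, a human can open it, verify the claim, or delete the source and let the LLM revise the page.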

The Three-Layer Architecture

Karpathy's pattern is built on three distinct layers:

Layer 1: Raw Sources (Immutable)

Your curated collection of original documents — articles, papers, transcripts, notes, images. The LLM reads from this layer but never modifies it. This preserves the source of truth.

Layer 2: The Wiki (LLM-Generated)

Markdown files that the LLM owns entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and ensures consistency across the entire knowledge base. Humans read; LLMs write.

Layer 3: The Schema

Configuration documents (like a CLAUDE.md file) that specify wiki structure, conventions, and workflows. This layer transforms the LLM from a generic chatbot into a disciplined knowledge maintainer.
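A minimal schema document might read like this. The file name CLAUDE.md comes from the article; the specific page types and conventions below are an illustrative sketch, not Karpathy's exact schema:

```markdown
# Wiki Schema

## Layout
- sources/ — raw documents; read-only
- wiki/    — Markdown pages; the LLM owns these

## Page types
- summary-<source>.md — one per ingested source
- entity-<name>.md    — people, projects, products
- concept-<name>.md   — recurring ideas and techniques

## Conventions
- Every claim cites its source file, e.g. (source: sources/talk.txt)
- Update index.md and changelog.md on every ingest
- Cross-link related pages with standard Markdown links
```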

Core Operations

The system supports three fundamental operations:

Ingest

When you add a new source, the LLM reads it, discusses key takeaways, writes summaries, updates the index, revises entity and concept pages, and appends changelog entries. A single source can touch 10 to 15 wiki pages simultaneously.
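The ingest flow can be sketched in a few lines. This is a minimal illustration with the LLM call stubbed out; the file names (`index.md`, `changelog.md`, the `summary-` prefix) are assumptions, and a real agent would prompt a model with the schema document instead of the placeholder `summarize` below:

```python
from datetime import date
from pathlib import Path


def summarize(text: str) -> str:
    """Stub for an LLM call; a real implementation would prompt a model
    with the schema document and the full source text."""
    return text[:200]


def ingest(source_path: Path, wiki_dir: Path) -> Path:
    """Read a new raw source, write its summary page, update the index,
    and append a changelog entry."""
    summary = summarize(source_path.read_text())

    # One summary page per source, named after the source file.
    page = wiki_dir / f"summary-{source_path.stem}.md"
    page.write_text(
        f"# {source_path.stem}\n\n{summary}\n\nSource: {source_path.name}\n"
    )

    # Append to the index so the wiki stays navigable.
    with (wiki_dir / "index.md").open("a") as f:
        f.write(f"- [{source_path.stem}]({page.name})\n")

    # Record what this ingest touched.
    with (wiki_dir / "changelog.md").open("a") as f:
        f.write(f"- {date.today()}: ingested {source_path.name}\n")
    return page
```

In the real pattern the model would also revise any entity and concept pages the source touches, which is where the 10-to-15-pages-per-ingest fan-out comes from.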

Query

Ask questions against the wiki. The LLM searches relevant pages through the index, synthesizes answers with citations, and — here is the key difference — files valuable findings back into the wiki as new pages. Your explorations compound in the knowledge base rather than disappearing into chat history.
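A toy version of the query loop, with keyword matching standing in for the LLM's navigation of the index and concatenation standing in for synthesis (both are placeholders for model calls; the `finding-` page naming is an assumption):

```python
from pathlib import Path


def query(question: str, wiki_dir: Path) -> str:
    """Find relevant wiki pages, build an answer with citations, and
    file the finding back into the wiki so the exploration compounds."""
    terms = {w.lower() for w in question.split() if len(w) > 3}
    relevant = [
        p for p in sorted(wiki_dir.glob("*.md"))
        if any(t in p.read_text().lower() for t in terms)
    ]

    # "Synthesis" here is just excerpt-plus-citation; in the real
    # pattern the LLM writes prose that cites these pages.
    answer = "\n".join(f"From {p.name}: {p.read_text()[:80]}" for p in relevant)

    # File the result back as a new page instead of losing it to chat history.
    slug = "-".join(question.lower().split()[:5])
    (wiki_dir / f"finding-{slug}.md").write_text(f"# Q: {question}\n\n{answer}\n")
    return answer
```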

Lint

Periodic health checks that identify contradictions between pages, stale claims superseded by newer sources, orphaned pages without inbound links, and data gaps that need attention.
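One of these checks, orphan detection, is purely mechanical and easy to sketch. This assumes pages cross-reference each other with standard Markdown links like `[text](page.md)` and that `index.md` is the navigation root:

```python
import re
from pathlib import Path


def lint_orphans(wiki_dir: Path) -> list[str]:
    """Return wiki pages that no other page links to."""
    link_re = re.compile(r"\]\(([^)]+\.md)\)")
    pages = {p.name for p in wiki_dir.glob("*.md")}
    linked = set()
    for page in wiki_dir.glob("*.md"):
        for target in link_re.findall(page.read_text()):
            linked.add(Path(target).name)
    # The index is a navigation root, not an orphan.
    return sorted(pages - linked - {"index.md"})
```

Contradiction and staleness checks are less mechanical; in practice the LLM itself re-reads page pairs and flags conflicts.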

LLM Wiki vs. Traditional RAG

| Aspect | RAG | LLM Wiki |
| --- | --- | --- |
| Knowledge persistence | Re-derived every query | Compiled once, kept current |
| Structure | Flat chunks in a vector store | Organized wiki with cross-references |
| Transparency | Opaque embeddings | Plain Markdown files |
| Accumulation | None; each query starts fresh | Every source makes the wiki smarter |
| Maintenance | Manual updates needed | LLM handles bookkeeping |
| Scale sweet spot | Large document collections | Personal to departmental (around 100 articles, 400K words) |

At the scale of a personal knowledge base — roughly 100 articles and 400,000 words — the LLM's ability to navigate via summaries and index files is more than sufficient. At that size, "fancy RAG" infrastructure often adds more latency and retrieval noise than it adds value.

Real-World Use Cases

Karpathy describes several compelling applications:

  • Personal development: Track goals, health insights, and psychology through journal entries and curated articles. Build a structured picture of yourself over time.
  • Research: Read papers for weeks or months while an evolving wiki captures your developing thesis, with every new paper strengthening or challenging existing understanding.
  • Reading companion: Build a fan wiki as you read a book — characters, themes, plot threads, all cross-referenced automatically.
  • Business intelligence: Feed Slack threads, meeting transcripts, and customer calls into a wiki that stays current because the AI does the maintenance nobody wants to do.

Why the Maintenance Problem Matters

The reason most knowledge bases die is not lack of information — it is lack of maintenance. Cross-references go stale, new information contradicts old pages, and the overhead of keeping everything consistent grows faster than the value of adding new content.

LLMs solve this because they do not get bored. They do not forget to update a cross-reference. They can touch 15 files in a single pass. The maintenance burden that kills human-managed wikis becomes trivial for an AI agent.

Getting Started

The pattern is intentionally abstract — Karpathy shared it as an "idea file" rather than a rigid implementation. In the era of LLM agents, you share concepts, and each person's agent builds a customized version.

Here is a minimal starting point:

  1. Create a raw sources folder for your original documents
  2. Set up a wiki folder where the LLM will write Markdown files
  3. Write a schema document (CLAUDE.md or similar) describing the wiki structure, page types, and conventions
  4. Start ingesting — add one source at a time and let the LLM build out the wiki incrementally
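Steps 1 through 3 can be bootstrapped with a short script. The folder layout and starter files here are illustrative choices, not a prescribed structure:

```python
from pathlib import Path


def bootstrap(root: Path) -> None:
    """Create the minimal LLM-wiki layout: raw sources folder, wiki
    folder with starter files, and a schema document."""
    (root / "sources").mkdir(parents=True, exist_ok=True)  # step 1: raw sources
    wiki = root / "wiki"
    wiki.mkdir(exist_ok=True)                              # step 2: wiki folder
    (wiki / "index.md").write_text("# Index\n")
    (wiki / "changelog.md").write_text("# Changelog\n")
    (root / "CLAUDE.md").write_text(                       # step 3: schema
        "# Wiki schema\n"
        "- sources/ is read-only\n"
        "- wiki/ holds summary, entity, and concept pages\n"
        "- cite the source file for every claim\n"
    )
    # Step 4 is interactive: add one source at a time and ingest it.
```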

The key is to start small. One topic. A few sources. Let the wiki grow organically as you explore.

The Bigger Picture

The LLM Wiki pattern represents a shift in how we think about AI and knowledge. Instead of treating LLMs as search engines that answer questions on demand, we treat them as knowledge workers that build and maintain persistent intellectual artifacts.

As Karpathy puts it: in this workflow, token throughput shifts away from manipulating code and toward manipulating knowledge. The wiki becomes a living model of the domain itself — expressed in text, not tensors.

For anyone spending significant time researching, learning, or managing information, this pattern offers something RAG never could: knowledge that compounds.

