Visual Summary
LLM Wiki β€” Personal Knowledge Bases
Your Knowledge Keeps Getting Lost
You read a paper, bookmark an article, save a thread β€” then never revisit it. What you learned evaporates. RAG tries to solve retrieval but creates new noise. Andrej Karpathy's LLM Wiki proposes something different: let an LLM compile your knowledge once, so you can query it forever.
TL;DR β€” The Idea in One Paragraph

Andrej Karpathy (ex-OpenAI, Tesla) shared a viral pattern: instead of using a RAG pipeline that retrieves raw documents on every query, build a personal wiki maintained by an LLM. Raw sources go in, the LLM reads them once and writes structured markdown pages (summaries, concept articles, comparisons, cross-references). Subsequent queries go against this compiled, synthesised knowledge β€” not the messy raw input. The result is a personal knowledge base that gets richer every time you add a source, answer a question, or run a health check.

~80% β€” Bookmarks never revisited after saving
~100 β€” Collection size (articles) at which an LLM wiki beats RAG
400K β€” Words in Karpathy's example wiki
0 β€” Vector databases needed
The Bookmarks Graveyard β€” Hover to See the Truth

You saved these articles meaning to read them later. Hover over each card to see when you actually last opened it:

Before & After β€” One Toggle

The same information, two very different fates. See what changes when the LLM Wiki takes over:

Scattered, unsearchable, evaporating
How Knowledge Decays Without a System

Without active synthesis, retention of information you've read drops rapidly. A compiled wiki externalises memory β€” the forgetting curve applies to your brain, not your knowledge base:

The LLM wiki acts as a perfectly maintained external memory β€” no decay, always queryable, increasingly cross-referenced as you add more sources.

Why Human-Maintained Wikis Always Die

The fundamental problem isn't the wiki format β€” it's the maintenance burden. Every new source potentially touches 10–15 existing pages. Humans abandon wikis when bookkeeping grows faster than value:

Why humans abandon wikis
πŸ“‹ Cross-reference updates are tedious
πŸ”— Backlinks break silently
⏰ Summaries go stale as field evolves
πŸ—‚ Organisation inconsistency creeps in
πŸ˜“ 10 minutes of reading = 40 minutes of wiki work
Why LLMs don't
⚑ Touch 15 pages in one pass
πŸ”„ Always update cross-references
πŸ“ Rewrite summaries when new info arrives
🎯 Consistent formatting every time
😊 Zero boredom, zero procrastination
Why RAG Isn't the Answer at Personal Scale
Retrieval Noise
Embedding search finds similar text, not synthesised understanding. A question about "attention mechanisms" may retrieve 20 paragraphs with conflicting definitions.
No Synthesis
RAG retrieves raw sources and asks the LLM to synthesise them on every query. The same synthesis work is repeated for every question β€” no compounding value.
Infrastructure Tax
Vector databases, embedding pipelines, chunking strategies, rerankers β€” at ~100 personal articles, this is more complexity than the problem warrants.
Questions Only YOUR Wiki Can Answer β€” Click Each

A generic LLM can't answer these. Neither can Google. Only a compiled personal knowledge base β€” built from your exact reading history β€” can:

The Maintenance Debt Clock β€” Why Human Wikis Die

Every article you add to a human-maintained wiki creates a debt of updates. Watch the debt accumulate as sources grow:

The Compiler Analogy

"RAG is like asking someone to read all your notes from scratch every time you have a question. LLM Wiki is like compiling your notes into a structured program once β€” then running queries against the compiled output. The compilation cost is paid once; every subsequent query is fast, coherent, and cross-referenced."

So what does the LLM Wiki system actually look like? The 3-Layer Architecture β†’
Three Layers, One LLM
The LLM Wiki has exactly three layers. The human curates sources and asks questions. The LLM writes and maintains everything in the middle. Click any layer to explore it:
The Three-Layer Architecture β€” Click a Layer
πŸ“
Layer 1 β€” Raw Sources
Immutable. The LLM reads but never modifies these. Everything else is derived from here.
.md .pdf .png .csv .py
πŸ“–
Layer 2 β€” The Wiki
LLM-generated and maintained. Summaries, concept articles, comparisons, cross-references. Pure markdown.
index.md concepts/ log.md
πŸ“‹
Layer 3 β€” The Schema
A CLAUDE.md or equivalent file telling the LLM how the wiki is structured and which conventions and workflows to follow.
CLAUDE.md rules workflows
The Two Special Files: index.md and log.md
index.md β€” The Catalogue
## Wiki Index

### Concepts (47 pages)
- [[attention]] β€” Self-attention mechanism overview
- [[transformer]] β€” Full architecture walkthrough
- [[grpo]] β€” Group Relative Policy Optimization
...

### Papers (23 summaries)
- [[2402.03300]] β€” DeepSeekMath (Feb 2024)
- [[1706.03762]] β€” Attention is All You Need
...

Updated on every ingest. The LLM's primary navigation aid for answering queries.

log.md β€” The Audit Trail
## Activity Log

[2025-04-01 INGEST] karpathy-llm-wiki.md
β†’ Created: concepts/llm-wiki.md
β†’ Updated: index.md, concepts/rag.md
β†’ New links: 4

[2025-04-01 QUERY] "How does GRPO compare to PPO?"
β†’ Searched: grpo.md, ppo.md, rl-overview.md
β†’ Filed result: comparisons/grpo-vs-ppo.md

Append-only. Parseable with standard tools (grep, awk). Full history of every operation.
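Because log.md is plain, append-only text, it really is parseable with standard tools. As an illustration, a minimal Python sketch (the entry format follows the example above; adapt the regex to your own log conventions):

```python
import re

def parse_log(text):
    """Parse an append-only log.md into a list of entry dicts.

    Assumes the format '[YYYY-MM-DD KIND] subject' followed by
    'β†’ detail' continuation lines, as in the example above.
    """
    entries = []
    for line in text.splitlines():
        m = re.match(r"\[(\d{4}-\d{2}-\d{2}) (\w+)\] (.+)", line)
        if m:
            entries.append({"date": m.group(1), "kind": m.group(2),
                            "subject": m.group(3), "details": []})
        elif line.startswith("β†’") and entries:
            entries[-1]["details"].append(line.lstrip("β†’ ").strip())
    return entries

log = """[2025-04-01 INGEST] karpathy-llm-wiki.md
β†’ Created: concepts/llm-wiki.md
β†’ Updated: index.md, concepts/rag.md
[2025-04-01 QUERY] "How does GRPO compare to PPO?"
β†’ Searched: grpo.md, ppo.md, rl-overview.md"""

entries = parse_log(log)
```

The same entry list falls out of a one-line grep on the timestamp pattern; the point is that the audit trail needs no database to be queryable.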

What a Wiki Article Looks Like

A concept page written and maintained by the LLM β€” not retrieved, but synthesised from multiple sources:

---
title: Transformer Architecture
created: 2025-03-15
updated: 2025-04-01
sources: [1706.03762, karpathy-makemore, annotated-transformer]
tags: [architecture, attention, foundational]
---

# Transformer Architecture

The transformer is an encoder-decoder architecture built entirely on attention mechanisms, introduced in "Attention is All You Need" (Vaswani et al., 2017). Unlike RNNs, transformers process all tokens in parallel, making them highly GPU-friendly.

## Key Components

- **Multi-head self-attention**: Each token attends to all others simultaneously
- **Position encoding**: Sinusoidal embeddings inject sequence order
- **Feed-forward layers**: Per-token MLP applied identically

## Related Concepts

β†’ [[attention]] β€” The core mechanism explained in detail
β†’ [[grpo]] β€” Uses transformer policy for RL training
β†’ [[scaling-laws]] β€” How performance scales with transformer size

## Open Questions (as of 2025-04)

- Long-context efficiency: quadratic attention is still expensive at 100K+ tokens
Folder Structure β€” Click Any File to Preview

A real LLM Wiki is just three folders and a config file. Click any file or folder to see what's inside:

CLAUDE.md Schema Builder β€” Generate Yours

The CLAUDE.md file tells the LLM agent how your wiki works. Configure your setup and generate a real schema you can copy:

Your Setup
Domain
Ingest frequency
Wiki style
Team size
Generated CLAUDE.md
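As a concrete starting point, a generated schema might look like the sketch below (paths and conventions are hypothetical; adapt them to your own setup):

```markdown
# Wiki Schema

## Layout
- raw/        β€” immutable sources; read, never edit
- concepts/   β€” one markdown page per concept, [[wikilink]] cross-references
- index.md    β€” catalogue of all pages; update on every ingest
- log.md      β€” append-only audit trail; one entry per operation

## Workflows
- INGEST: read the new file in raw/, write a summary page, update every
  touched concept page, refresh index.md, append a log entry
- QUERY: navigate via index.md, cite pages in the answer, file good
  answers back as new comparison or concept pages
- LINT: scan for contradictions, orphan pages, and stale claims
```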
The Librarian β€” Click a Book on the Shelf

Think of the LLM as a tireless librarian: reads every new book, then updates the encyclopaedia. Click any book to watch it happen:

Inside a Wiki Article β€” Click Any Section

Every article the LLM writes follows the same anatomy. Click a section to learn what goes there and why:

Architecture clear. Now how does the system actually operate day-to-day? The 4-Phase Cycle β†’
The Four-Phase Cycle
Every interaction with the wiki follows one of four phases: Ingest a new source, Compile it into the wiki, Query the knowledge, or Lint for health. The cycle repeats β€” each pass makes the wiki richer.
The Cycle β€” Animated
Phase 1 β€” Ingest: Feeding the System

Drop a new source into raw/. The LLM reads it, discusses key takeaways, and decides which wiki pages to create or update. A single source may touch 10–15 existing pages:

Source Types
arXiv papers, web clips (Obsidian Web Clipper β†’ .md), GitHub repos, datasets (.csv), code files (.py), images with captions
LLM Actions
Write a source summary page, update index.md, update relevant concept pages, add backlinks, log the ingest in log.md
Output
1 summary page + 3–15 concept page updates + new cross-references + log entry. Typical: <5 minutes of LLM time.
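The bookkeeping side of ingest can be sketched as a small driver around a single LLM call. A minimal sketch, where `summarize` is a hypothetical stand-in for the LLM agent (in practice a tool like Claude Code does this with full file-system access):

```python
import tempfile
from datetime import date
from pathlib import Path

def ingest(source: Path, wiki: Path, summarize) -> Path:
    """Ingest one raw source: write a summary page, add it to index.md,
    and append an entry to log.md. `summarize` stands in for the LLM."""
    summary = summarize(source.read_text())  # hypothetical LLM call

    page = wiki / "concepts" / f"{source.stem}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(summary)

    # Keep index.md as the catalogue of every page
    index = wiki / "index.md"
    existing = index.read_text() if index.exists() else "## Wiki Index\n"
    entry = f"- [[{source.stem}]]\n"
    if entry not in existing:
        index.write_text(existing + entry)

    # Append-only audit trail
    with (wiki / "log.md").open("a") as log:
        log.write(f"[{date.today()} INGEST] {source.name}\n"
                  f"β†’ Created: concepts/{page.name}\n")
    return page

# Demo in a throwaway directory
root = Path(tempfile.mkdtemp())
src = root / "raw" / "attention-paper.md"
src.parent.mkdir()
src.write_text("Self-attention lets every token attend to every other token.")
page = ingest(src, root / "wiki",
              summarize=lambda text: f"# attention-paper\n\n{text[:60]}")
```

The real work (deciding which of the 3 to 15 existing concept pages to update) lives inside the LLM call; the sketch only shows the surrounding bookkeeping that makes each ingest traceable.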
Phase 2 β€” Compile: Building the Knowledge Graph

The LLM doesn't just summarise β€” it synthesises. It finds connections across sources, writes concept articles that span multiple papers, and maintains a backlink graph between all pages:

The crucial difference from RAG: synthesis happens once at ingest time, not at every query. A concept page on "attention mechanisms" might draw from 8 different sources, already reconciled and cross-referenced. Query time is fast and coherent.

Phase 3 β€” Query: Getting Answers + Filing Them Back

Ask a question. The LLM navigates via index.md, reads relevant concept pages, and synthesises an answer with citations. The key insight: good answers get filed back into the wiki as new pages, compounding value:

Phase 4 β€” Lint: Keeping the Wiki Healthy

Periodically run the LLM over the entire wiki as a health check. It finds issues humans would never catch manually:

Issues Found
⚑ Contradiction: paper A says X, paper B says ¬X
πŸ”— Orphan page: [[mamba.md]] has no incoming links
⏰ Stale claim: "GPT-4 is SOTA" (last updated 2023)
Actions Taken
βœ… Flag contradiction, add note to both pages
βœ… Add [[mamba.md]] link to architecture overview
βœ… Web-search update, rewrite stale paragraph
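Parts of the lint pass need no LLM at all: orphan detection, for instance, is a pure graph check over [[wikilink]] references. A minimal sketch (page names and contents are illustrative):

```python
import re

def find_orphans(pages: dict) -> set:
    """Return page names that no other page links to via [[wikilinks]].

    `pages` maps page name (without .md) to its markdown content.
    """
    linked = set()
    for name, content in pages.items():
        for target in re.findall(r"\[\[([^\]|]+)\]\]", content):
            if target != name:  # ignore self-links
                linked.add(target)
    return set(pages) - linked

wiki = {
    "transformer": "Built on [[attention]]. See [[scaling-laws]].",
    "attention": "Core mechanism. Used by [[transformer]].",
    "scaling-laws": "Performance vs size.",
    "mamba": "State-space model, no incoming links yet.",
}
orphans = find_orphans(wiki)
```

Contradictions and stale claims are where the LLM earns its keep; cheap structural checks like this one can run on every commit.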
Multi-Hop Query β€” Where RAG Fails, Wiki Wins

Some questions require reading across multiple documents and connecting ideas. RAG chunks documents β€” it can't bridge them. Watch the wiki navigate a 3-hop query:

A Day in the Life β€” Researcher + LLM Wiki

What does a typical research day look like when you have an LLM Wiki? Click any event on the timeline:

This sounds like RAG β€” what's the actual difference? LLM Wiki vs RAG β†’
LLM Wiki vs RAG
Both systems help you query a large document collection. But their philosophies are fundamentally different β€” retrieve-and-synthesise on demand vs compile once and query forever.
Architecture Comparison β€” Side by Side
RAG Query Path
Question β†’ embed β†’ vector search β†’ retrieve N chunks β†’ LLM synthesises from scratch β†’ answer (not stored)
Wiki Query Path
Question β†’ LLM reads index.md β†’ reads 2–4 concept pages β†’ answers from synthesised knowledge β†’ optionally files answer back
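The wiki query path can be approximated in a few lines: use index.md to shortlist candidate pages, then hand only those pages to the model. A minimal sketch in which keyword overlap stands in for the LLM's actual navigation step (all names here are illustrative):

```python
import re

def select_context(index_md: str, pages: dict, question: str,
                   limit: int = 4) -> str:
    """Pick up to `limit` pages whose index entries share words with the
    question, and concatenate them as context for the answering LLM."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = []
    for name, blurb in re.findall(r"\[\[([^\]]+)\]\] β€” (.+)", index_md):
        entry_words = set(re.findall(r"\w+", f"{name} {blurb}".lower()))
        overlap = len(q_words & entry_words)
        if overlap and name in pages:
            scored.append((overlap, name))
    chosen = [name for _, name in sorted(scored, reverse=True)[:limit]]
    return "\n\n".join(f"# {name}\n{pages[name]}" for name in chosen)

index_md = """- [[grpo]] β€” Group Relative Policy Optimization
- [[ppo]] β€” Proximal Policy Optimization baseline
- [[attention]] β€” Self-attention mechanism overview"""
pages = {
    "grpo": "Group-relative advantage, no value network.",
    "ppo": "Clipped surrogate objective with a critic.",
    "attention": "Each token attends to all others.",
}
context = select_context(index_md, pages, "How does GRPO compare to PPO?")
```

A real agent reads index.md itself and follows cross-references, but the shape is the same: a cheap navigation step narrows 100 pages down to the 2 to 4 that matter.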
Same Question, Two Answers β€” See the Difference

Pick a question and see how RAG (raw retrieval) and LLM Wiki (compiled synthesis) respond differently:

Token & Cost Calculator β€” RAG vs Wiki

RAG synthesises from raw chunks on every query. Wiki reads pre-compiled pages. The token difference adds up fast β€” adjust your usage to see when Wiki wins:

Queries per day: 20
Collection size (docs): 80
Avg doc length (pages): 8
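The arithmetic behind the calculator is simple. A back-of-envelope sketch; the chunk and page sizes are assumptions, so plug in your own:

```python
def daily_tokens(queries_per_day: int,
                 rag_chunks: int = 20, chunk_tokens: int = 400,
                 wiki_pages: int = 3, page_tokens: int = 800):
    """Rough context tokens per day. RAG re-reads raw chunks on every
    query; the wiki reads a few pre-compiled pages. Sizes are assumed."""
    rag = queries_per_day * rag_chunks * chunk_tokens
    wiki = queries_per_day * wiki_pages * page_tokens
    return rag, wiki

rag, wiki = daily_tokens(20)  # 20 queries/day, as in the calculator above
```

Under these assumptions the wiki reads less than a third of the tokens per day. The sketch deliberately ignores the wiki's one-off compile cost at ingest; over enough queries the per-query saving dominates.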
When to Use Each
Collection size: 100 docs
Full Comparison
Property | RAG | LLM Wiki
Synthesis cost | Per query (repeated) | Once at ingest
Infrastructure | Vector DB + embeddings | Markdown files + git
Scales to | Millions of documents | ~100s of articles
Answer quality | Depends on retrieval quality | Rich, cross-referenced
Knowledge accumulates | No (raw chunks only) | Yes (every query enriches)
Human readable | No (vector embeddings) | Yes (browse in Obsidian)
Best for | Enterprise, large corpora | Personal, research, teams
The real magic: the wiki gets smarter over time. The Compounding Effect β†’
Compounding Knowledge
Unlike RAG (which resets every query), the LLM Wiki compounds. Each new source touches existing articles, creating new connections. Each answered question can become a new page. The value grows super-linearly with content.
Interactive: Watch Your Wiki Grow

Click "Add Source" to ingest a new document and see how articles, words, and backlinks grow. Notice how growth accelerates β€” each source finds more existing concepts to update:

The Backlink Graph β€” Knowledge as a Network

Each wiki page links to related concepts. As the wiki grows, the graph reveals the shape of your knowledge β€” clusters, hubs, gaps. Click any node:

Knowledge Galaxy β€” Watch It Form

Isolated facts become constellations of understanding. Every source you add creates new connections. Press Play to watch your knowledge galaxy form:

Wiki Health Dashboard β€” Run a Lint Check

Select how old your wiki is, then run a health check to see what the LLM finds and fixes:

Why Value Grows Super-Linearly

Each new page n can potentially link to all nβˆ’1 existing pages. So the number of possible cross-references grows as O(nΒ²). At 10 pages: 45 possible connections. At 100 pages: 4,950. The LLM finds and maintains the meaningful ones β€” the human only reads the resulting dense, interconnected knowledge graph.
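The counts above follow from the handshake formula: n pages admit n(nβˆ’1)/2 possible undirected links.

```python
def possible_links(n: int) -> int:
    """Possible undirected cross-references among n pages: n*(n-1)/2."""
    return n * (n - 1) // 2
```

Evaluating it reproduces the figures in the text: 45 possible connections at 10 pages, 4,950 at 100.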

Where does this pattern go next? The Future β†’
From Wiki to Personalised Intelligence
Karpathy's vision doesn't stop at a well-maintained markdown directory. The LLM Wiki is the foundation for something more ambitious: a model that knows your knowledge in its weights, not just its context.
The Memex Vision: 80 Years in the Making

Vannevar Bush described the Memex in 1945 β€” a personal, curated knowledge store with "associative trails" between documents. The web gave us shared knowledge but abandoned personal curation. LLM Wiki might finally realise Bush's original vision:

Karpathy's Roadmap: Wiki β†’ Fine-Tuned Personal Model

The LLM wiki is a stepping stone. The compiled, structured wiki is also ideal synthetic training data β€” enabling a fine-tuned model that "knows" your domain in its weights:

The end state: A model fine-tuned on your personal wiki that doesn't need to be prompted with context β€” it already knows your research domain, your terminology, your preferred frameworks, your open questions. The wiki and the model are the same thing, just in different forms.

Use Cases β€” Who Is This For?
πŸ”¬ Researchers (deepest use case) β–Ύ
Going deep on a topic over weeks or months. Each paper ingested adds to a growing concept graph. Query: "What are the open problems in mechanistic interpretability and which of my sources address them?" β€” answered from your personal synthesis, not a generic web search.
πŸ“š Book readers β–Ύ
Build a companion wiki while reading a book series or curriculum. Add a chapter, get a summary page. Ask "How does this chapter's argument relate to chapter 3?" β€” the wiki already has chapter 3's summary and can cross-reference them.
🏒 Teams and organisations β–Ύ
Internal wikis maintained by LLMs, fed by Slack threads, meeting transcripts, design documents. Every new meeting creates a summary page that updates existing concept pages. New hires ask the wiki, not Slack. The wiki knows everything the team has ever discussed.
🎯 Competitive analysis / due diligence β–Ύ
Track a competitor's blog posts, job listings, product updates, investor filings. The LLM builds a wiki about your competitor β€” concept pages for their product areas, timelines, team moves. Query: "What signals suggest they're moving into enterprise?"
πŸ₯ Personal health / self-tracking β–Ύ
Log health data, notes, lab results, supplement research. The wiki tracks your interventions and outcomes. Query: "What correlates with my better sleep nights based on my 6-month log?" β€” answered from your personal data, not generic advice.
Your First Wiki in 30 Minutes β€” Interactive Checklist

Everything you need to go from zero to a queryable personal knowledge base. Check off each step as you complete it:

The Team Wiki Multiplier

One person's ingest benefits everyone's queries. A shared wiki compounds faster than a solo one. See how team size changes the value curve:

Team members: 1
Shared articles / week: 10
"Ask Karpathy's Wiki" β€” Simulated Navigation

Karpathy's real wiki spans ~400K words on ML topics. Select a question and watch how a wiki agent would navigate it β€” reading index.md, following concept pages, synthesising an answer:

"Is LLM Wiki for Me?" β€” Quick Quiz

Answer 3 quick questions and get a personalised recommendation:

1. How do you usually consume information?
Time Saved Calculator

How much time would an LLM Wiki actually save you each month? Adjust to your situation:

Articles read per week: 5
Minutes spent taking notes per article: 15
Team members sharing the wiki: 1
"What Would My Wiki Look Like?" β€” Personaliser

Select your interest areas and see a preview of your personal wiki graph:

Recommended Toolstack
Viewing & Editing
Obsidian β€” Graph view, Dataview queries, local-first
Obsidian Web Clipper β€” Convert articles to .md with local images
Marp β€” Turn wiki pages into slide decks
LLM & Versioning
Claude Code / Cursor β€” LLM agent with file-system access for ingest/lint
Git β€” Version history, diffs, collaboration
CLAUDE.md β€” Schema file giving LLM agent its instructions

The big takeaway: The tedious part of knowledge management is bookkeeping. Humans are terrible at it and abandon the effort. LLMs are perfect at it and never tire. Karpathy's LLM Wiki redraws the human-LLM division of labour: humans curate sources, ask good questions, and think about meaning. LLMs write, update, cross-reference, and maintain everything else. The result is the personal knowledge base that the internet era promised but never delivered.