BoilingBrain

BoilingBrain wiki graph — domain clusters auto-emerge from the linking structure

Bootstrap template for an LLM Wiki — a personal knowledge base curated by you and maintained by LLM agents.

Status

v1.1.0 — major release: an MCP server for cross-project wiki access (tiered loading, ~96% token reduction), a session lifecycle (Stop/SessionStart hooks + /compress-bb), /domain lifecycle commands, L3 readiness (ADR verdict tracking), and a CI revamp for living vaults (Obsidian-safe Prettier formatter, semantic-only linting, deterministic wiki validation). The template works end-to-end and has scaffolded real vaults. See CHANGELOG.md for the full milestone list. Bug reports and generic-improvement PRs welcome — see CONTRIBUTING.md.

What is an LLM Wiki?

A vault where:

You curate sources (notes, transcripts, articles, repo docs) into an immutable raw/ folder.
Domain expert agents ingest those sources and write the wiki layer (wiki/sources/, wiki/concepts/, wiki/entities/, etc.) — one agent per domain you care about.
Slash commands (/ingest, /query, /lint, /evolve-agent, …) orchestrate the agents.
The wiki is always derivable from raw/ — no orphan knowledge, no hallucinated links.

The pattern follows Karpathy's "knowledge compilation" idea: keep the source of truth compact, hash-addressed and immutable; let LLMs maintain a queryable layer on top.

How does this differ from Karpathy's LLM Wiki?

Karpathy's LLM Wiki is a concept: notes maintained by an LLM, with the LLM filling cross-references and refactoring on demand. BoilingBrain is an opinionated, runnable implementation of that concept — with concrete trade-offs you can either accept or fork. Specifically :

Dimension	Karpathy's sketch	BoilingBrain
Source of truth	Notes in a folder, LLM rewrites freely	`raw/` is immutable, hash-addressed (`source_sha256`); the wiki layer is always derivable
Agent topology	One LLM doing everything	One expert agent per domain, declared at bootstrap, with deliverable signatures (cheatsheets, syntheses, diagrams) and per-domain prompts that evolve over time via `/evolve-agent`
Idempotence	Manual	`/ingest` is hash-based and idempotent — re-running doesn't duplicate; `--force` re-applies new agent reflexes to existing sources
Multimodal	Text-only typically	First-class video pipeline (`/ingest-video`): download → transcribe → frame extraction (mode A declarative or mode B image-diff induction) → markdown transcription of visuals (tables, Mermaid, LaTeX) so queries don't re-OCR each time
Queries at scale	Load and read	Tiered loading: every page carries `summary_l0` (≤140 chars) + `summary_l1` (~50-150 words). Agents scan a domain via TOC L0, descend to L1 then L2 only when relevant — sublinear in body size
External code/docs	Out of scope	`/sync-repos` snapshots GitHub repos by SHA into `raw/tracked-repos/<sha>/`, immutable, never overwritten
Self-improvement	Implicit, ad-hoc	Explicit human-curated loop: each agent appends `Evolution suggestions` per ingest, `/evolve-agent <domain>` reviews + applies the diff, archives integrated suggestions
Architectural decisions	Mixed into notes	ADR-lite in `wiki/decisions/` with a fixed structure (Problem → Options → Decision → Why → Open questions)

Said otherwise: Karpathy says "let LLMs maintain a wiki." BoilingBrain says "here's what the wiki layout, the agent contract, the ingest protocol and the evolution loop need to look like for that to actually scale past a few weeks of use." The opinions come from real usage — fork them if they don't fit.

Prerequisites

Claude Code — the CLI agent that drives the interview and the wiki workflows. The template is built around its slash-commands and AskUserQuestion tool.
Obsidian — the markdown editor where your vault lives day-to-day. Wiki pages use [[wikilinks]] and the bootstrap generates an .obsidian/ config (graph filter + per-domain colors) so the graph view shows your wiki/ clean of raw/ and cache/ noise.
gh CLI (optional) — only needed if you want to clone via gh repo clone and create a remote vault repo automatically at the end of the interview.

After bootstrap, open the cloned folder in Obsidian ("Open folder as vault") and switch to the graph view (icon in the left ribbon, or Ctrl/Cmd+G). You'll see your domains laid out by color — that's the auto-generated .obsidian/graph.json doing its job.

Quick start

gh repo clone tetra-plg/boiling-brain ~/my-vault
cd ~/my-vault
claude

Then in Claude Code:

Read BOOTSTRAP.md and run the prompt.

Works in any language — the bootstrap interview adapts to your phrasing.

The interview takes 5-10 minutes. At the end your vault is personalized, the template's git history is reset, and you can run your first /ingest. Then open ~/my-vault in Obsidian and check the graph view.

Detailed flow → How to bootstrap below.

Why a template?

This repo is the scaffolding, not a usable instance. It contains:

Generic slash commands (/ingest, /ingest-video, /query, /save, /lint, /evolve-agent, optional /sync-repos).
Generic scripts (scan-raw.sh, transcribe.sh, sample-frames.sh, diff-frames.py, extract-frames.sh, backfill-summaries.py, enrich-hub.py, optional sync-repos.sh).
Templated files (*.tpl) with {{placeholder}} markers that need to be filled with your name, role, domains, and projects.
A unified domain-expert.md.tpl that gets instantiated once per domain you declare at bootstrap time, with domain-specific deliverables, observation triggers and frame-visual formats.

Do NOT push on this repo directly

This template is not meant to be cloned and edited by hand. Each placeholder has dependencies on others (e.g. domain count drives how many agent files exist; the hub-pivot domain gets a special marker; /sync-repos is included only if you declare GitHub repos to track).

A bootstrap prompt (BOOTSTRAP.md, shipped at the root of this repo) walks you through a structured interview, generates the resolved files, and writes them to your machine.

Among other questions, the bootstrap asks whether you want to track GitHub repos (i.e. snapshot their docs by SHA into raw/); if yes, /sync-repos is included. Otherwise it's removed.

At the end of the interview, the .git will be removed or replace by your own git remote.

How to bootstrap

See Quick start above for the commands. A few details on what happens during and after the bootstrap:

BOOTSTRAP.md and PLACEHOLDERS.md are moved into wiki/decisions/ as ADR traces — you can re-read them later to understand how your vault was generated.
The template's .git/ is replaced by a fresh history (clean start, no leftover template commits).
An optional GitHub remote is created via gh repo create if you confirm during the interview.

If you really want to clone manually and skip the bootstrap, read every *.tpl file, fill placeholders by hand, and rename to drop .tpl. You will probably forget the cross-file consistency checks the bootstrap prompt enforces.

Architecture in 60 seconds

.claude/
  agents/        # one <domain>-expert.md per domain you declared
  agent-memory/  # one MEMORY.md per agent (state of the domain)
  commands/      # slash commands (generic, no per-instance customization)
  rules/         # transverse conventions auto-loaded by Claude Code via `paths` frontmatter
  template-version  # which template version this vault is aligned with (used by /update-vault)

raw/             # IMMUTABLE source files. Never modified after first write.
cache/           # transient artifacts (downloaded videos, audio, frames). gitignored.

wiki/            # the LLM-maintained layer.
  index.md       # minimal portal: overview, radar, log, domains
  log.md         # chronological journal
  radar.md       # open questions and points of attention
  overview.md    # your portrait
  domains/       # one hub per domain
  sources/ entities/ concepts/ syntheses/ cheatsheets/ diagrams/ decisions/

scripts/         # utilities (transcription, frame extraction, image-diff, summary backfill, hub enrichment, ...)

CLAUDE.md        # project-wide LLM instructions (slim ≤200 lines, points to .claude/rules/ and .claude/commands/)
tracked-repos.config.json  # OPTIONAL: list of GitHub repos to snapshot via /sync-repos

`.claude/rules/` — transverse conventions

Conventions that apply across the wiki (frontmatter format, slug rules, raw immutability) live in .claude/rules/*.md. Each rule has a paths: field in its frontmatter — Claude Code auto-loads the rule when the current working file matches one of those paths. This mirrors the Anthropic-recommended pattern (Boris Cherny, "CLAUDE.md best practices", 24 March 2026).

Three rules ship by default:

frontmatter.md (paths: wiki/sources/**, wiki/concepts/**, wiki/syntheses/**, wiki/decisions/**, wiki/entities/**, wiki/cheatsheets/**, wiki/diagrams/**, wiki/domains/**) — frontmatter YAML rules, including the hard rule that source_sha256 must always be computed via shasum -a 256 <file> (never a placeholder).
pages-wiki.md (paths: wiki/**) — kebab-case slugs, [[wikilinks]], [!warning] / [!question] callouts, page sizing.
raw-vs-cache.md (paths: raw/**, cache/**) — strict immutability of raw/, transient nature of cache/.

.claude/rules/ is upstream-tracked — /update-vault propagates new and updated rules to existing vaults automatically.

Tiered loading

Every wiki page carries two extra frontmatter fields:

summary_l0 — single line, ≤140 chars. Telegraphic. Used as a TOC entry when an agent scans a list of pages.
summary_l1 — 2-5 sentences (~50-150 words). Used when the agent decides whether to load the full body.

This lets agents (and you, via /query) navigate the wiki without paying the full body cost on every page they consider. Starting in v1.1.0, the MCP server pushes this further by exposing a hierarchical orient → drill → preview → read pattern across 12 tools (~96% token reduction vs flat dumps on large domains). See docs/mcp-tiered-loading.md for the full pattern.

Scripts layout

The scripts/ directory is organised by feature, not by verb. The convention is:

scripts/video/ — video and frame extraction pipeline (extract-frames.sh, sample-frames.sh, diff-frames.py, transcribe.sh).
scripts/wiki-maint/ — wiki maintenance utilities (backfill-summaries.py, enrich-hub.py, scan-raw.sh, scan-domain-refs.sh).
scripts/mcp/ — MCP server and its installer (mcp-wiki.py, setup-mcp.sh).
scripts/hooks/ — Claude Code hooks (e.g. check-session-activity.sh).
scripts/migrations/ — versioned migration slash-commands invoked by /update-vault.
scripts/sync-repos.sh — standalone CLI tool, kept at the root.

New scripts should be placed in the existing feature directory that best fits their role, or at the root only if they are standalone tools with no family. Avoid adding flat scripts at the root.

CI for living vaults

The CI (.github/workflows/lint.yml) blocks only on repairable, meaningful defects — a remote vault is a read-only artifact (raw/ and the LLM are local-only), so CI does deterministic validation only:

format-check — Prettier, Obsidian-safe (via scripts/wiki-maint/format-md.py): markdown stays clean by construction without breaking [[wikilink|alias]] or code-span pipes in tables.
markdownlint — semantic rules only (MD056, MD042, MD051, MD024); cosmetic rules delegated to Prettier.
wiki-integrity — scripts/wiki-maint/validate-wiki.py: broken [[wikilinks]], internal links, frontmatter conformance, and leftover git conflict markers. Skips raw/.
link-check-report — weekly, non-blocking (lychee): external links surfaced as a report, never failing the push.

Run /format to normalise a pre-formatter vault; generation commands (/ingest, /save, /evolve-agent) format their output automatically.

Slash commands shipped

Command	Purpose
`/ingest [path]`	Batch idempotent ingestion of files from `raw/` via domain experts. Hash-based, no duplicates.
`/ingest-video <path-or-url>`	Pipeline: download → transcribe → ingest transcript → propose frame extraction (mode A / B / skip).
`/query <question>`	Answer from indexed pages with citations; optionally archive the synthesis.
`/save <slug>`	Archive the current synthesis into `wiki/syntheses/<slug>.md`.
`/lint [domain]`	Detect contradictions, orphans, missing cross-references, gaps.
`/evolve-agent <domain>`	Curated update to a domain expert's prompt, fed by accumulated `.suggestions.md`.
`/domain <add\|rename\|remove> <slug>`	Manage a domain's lifecycle post-bootstrap. Scans the vault, presents impact by bucket (canonical / frontmatter / wikilink / alias / composed / prose / log-tag / historical / drift), validates ambiguous cases case-by-case.
`/sync-repos [names]` (optional)	Snapshot GitHub repos by SHA into `raw/tracked-repos/` (or any `dest` declared per source).
`/update-vault`	Cherry-pick improvements from the upstream template into your vault instance (versioned migration machine).
`/create-issue [type]`	Sanitize a draft and open an issue on the upstream template repo (auto-strips wikilinks, vault-specific slugs, private paths). Always validated by you before `gh issue create`.
`/compress-bb`	Save the current session journal to `raw/notes/sessions/` for later ingestion.
`/format [path]`	Normalise markdown via Prettier (Obsidian-safe: preserves pipes in `[[wikilink\|alias]]` and code spans). On-demand or one-shot for a pre-formatter vault. Excludes `raw/`, `cache/`, `.claude/worktrees/`, `.venv*`.

MCP server (optional, v1.1+)

Run bash scripts/mcp/setup-mcp.sh once after bootstrap to register the boiling-brain-wiki MCP server (user-scope). Once active, Claude Code can query your wiki from any project — not just inside the vault directory.

12 tools are exposed, organised as a tiered-loading hierarchy (orient → drill → read). Full reference: docs/mcp-tiered-loading.md.

Tool	Purpose
`scan_domain(domain)`	Orient: hub `summary_l1` + counts per type + top 10 pages by centrality (backlinks). ~860 tokens. Use FIRST.
`scan_concepts(domain, query="", top=20)`	Drill: concepts in a domain, ranked by centrality. Optional query (case + separator insensitive).
`scan_entities`, `scan_decisions`, `scan_syntheses`, `scan_cheatsheets`, `scan_diagrams`	Same semantics, per type.
`scan_sources(domain, query, top=20)`	Same shape but query is required (sources too numerous without a target).
`preview_page(page_path)`	L1: frontmatter (whitelisted fields) + `summary_l1`.
`read_page(page_path)`	L2: full body.
`search_wiki(query, limit=10)`	Cross-type, cross-domain. Format enriched: path, type, summary_l0, outgoing wikilinks.
`drop_to_raw(subfolder, filename, content)`	Sanctioned write into `raw/` (server-side, bypasses the `protect-raw.sh` hook by design). Auto-signals via `cache/.pending-ingest`.

Recommended pattern: scan_domain → scan_<type>(query=...) → preview_page → read_page. Measured: ~96% token reduction vs a flat dump on a 388-page domain.

A standalone smoke test (scripts/mcp/smoke_test.py) asserts per-tool token budgets against any vault — run it after any non-trivial change to mcp-wiki.py.

Workflow loop

Drop a source into raw/ (note, transcript, PDF, repo doc snapshot).
Run /ingest. Main context proposes a domain expert; you confirm via AskUserQuestion. Agent writes wiki/sources/, wiki/concepts/, wiki/entities/, optionally cheatsheets / syntheses / diagrams. Open questions land in wiki/radar.md.
Tomorrow morning, ask "show me the radar" — Claude reads radar.md and the accumulated .suggestions.md of each agent, proposes the day's priorities.
After a few ingestions in a domain, run /evolve-agent <domain> to fold accumulated suggestions back into the expert's prompt.
Use /query whenever you need to answer something from the corpus. Substantial answers can be archived via /save.

L3 readiness

The template ships with opt-in scaffolding for the L3 layer of personal knowledge management (model revision via real-world feedback): verdict frontmatter on ADRs, revisit_after on decisions and concepts, ## Real feedback section in the ADR template. The template does not enforce L3 — only you can judge if a decision held up — but /lint surfaces stale ADRs (≥ 90 days without verdict) and overdue revisits to prompt confrontation with reality.

Who is this for?

For you if:

You're a solo dev, researcher, PM or writer building a personal knowledge base across 3-6 domains.
You already use Claude Code (CLI, desktop app, IDE extension, or claude.ai/code) and Obsidian — or want to.
You want a thinking partner that grounds its answers in your sources (with citations), not a chatbot guessing from training data.
You'd rather have an opinion than a blank page: hash-keyed raw/, one expert agent per domain, mandatory frontmatter — these constraints feel like a feature, not a friction.
You want a vault that scales past a few weeks without falling into the "200 unlinked notes" trap.
You enjoy owning your wiki layer (the LLM writes it, you curate the prompts and the diff).

Probably not for you if:

You want a team or shared wiki — BoilingBrain is single-tenant by design (one agent set per vault).
You want a one-click hosted product (Mem, NotebookLM, ChatGPT projects) where you upload notes and the storage is abstracted away — here the wiki is yours, plain markdown, you read and edit it directly.
You're expecting a vector embeddings / RAG retrieval layer — there's none, on purpose. For a personal knowledge base at this scale, hash-keyed sources + LLM-maintained wikilinks are simpler and effective; full RAG is overkill.
You don't use Claude Code (Codex, Cursor, Aider support is on the roadmap — contributions welcome, see open issues).
You want a no-code, no-config experience — every action goes through a slash command and you'll edit YAML frontmatter.

FAQ

Do I need to know which domains I want before bootstrapping?

Roughly. The bootstrap interview will ask you to list 3-6 domains with a one-line description each, mark one as "hub pivot" if you have a domain that feeds the others, and indicate per-domain deliverables (cheatsheets? diagrams? syntheses?). You can add domains later by writing a new agent file by hand — it's just a copy of an existing one with the placeholders filled differently.

Why one agent per domain instead of a single generic agent?

Because each domain has different observation triggers, deliverable signatures, and visual frame formats. A poker source needs a 13×13 range grid; an astrophysics source needs LaTeX equations and EXIF metadata; a management source has confidentiality concerns. A single prompt that tries to cover all of these dilutes the signal. The unified domain-expert.md.tpl solves this by being templated rather than generic at runtime.

What about the `/evolve-agent` loop — is it dangerous?

No. /evolve-agent <domain> reads the suggestions accumulated by an expert across ingestions, proposes a diff, and waits for your approval before writing. The previous prompt is preserved (suggestions integrated are archived in a separate file). It's an explicit, human-curated improvement loop, not silent self-modification.

Can I keep my `raw/` folder separate from this repo?

Recommended actually — raw/ is gitignored by default (see .gitignore). Your sources are personal; commit only the wiki/ layer if at all. Many users keep the whole vault in a private repo separate from the template clone.

How do I handle large videos?

The LLMWIKI_VIDEO_CACHE environment variable points to a directory where video files are stored. Default is cache/videos/ (in-vault). Override to an external SSD or a dedicated cache directory if you process many or large videos. The wiki itself never references cache/ — only raw/transcripts/ and raw/frames/ are referenced.

How do I add my own domain-specific tools?

Drop them in .claude/commands/ (slash-commands) or scripts/ (utilities). They live in your instance and never need to be merged back into the template.

Concrete example: in one of the early instances of this template, the user added /extract-range-grid — a poker-specific OCR pipeline for range matrices — alongside the core pipeline. The slash-command (.claude/commands/extract-range-grid.md) and the Python script (scripts/extract-range-grid.py) live in their vault, not here. Yours can do the same for whatever your domains need: a LaTeX renderer, a k8s manifest validator, a financial-statement parser — anything that's a natural extension of one of your domains.

How do I clip a web page into the vault?

The recommended flow uses the Obsidian Web Clipper browser extension (Chrome/Firefox/Safari):

Clip — click the extension on any article, Reddit post, or documentation page. It saves a markdown file (with YAML frontmatter: title, source, published, author, tags) into your vault's Clippings/ folder by default.
Move to raw/ — drag the file from Clippings/ into the right sub-folder of raw/:
- raw/notes/ for personal takes, Reddit threads, short reads
- raw/articles/ for long-form articles and documentation pages (create the folder if it doesn't exist yet)
- raw/pdfs/ if you clipped a PDF-backed page (keep the PDF there too)
Ingest — run /ingest raw/notes/your-clip.md (or just /ingest to sweep all new files). The main context scans the file, proposes a domain expert, and you confirm before the agent writes the wiki pages.

The Clippings/ folder is gitignored — it's a staging area, not part of the vault. raw/ is your immutable archive once a file lands there.

Tip: if you use multiple devices, keep Clippings/ synced via Obsidian Sync or iCloud, then move files to raw/ only when you're at your main workstation where Claude Code runs.

What about secrets / credentials?

Don't commit them. Use the .gitignore (already excludes cache/, raw/, Clippings/). For repo-syncing via /sync-repos, authentication relies on gh auth login — no tokens stored in the vault.

License

MIT. See LICENSE (added by GitHub when the repo was created).

Contributing

This template is opinionated. The opinions come from real usage. If you have a generic improvement (better hub structure, better tiered-loading default, a missing slash command), feel free to open an issue or PR. Domain-specific tooling — keep it in your own instance and document it in your README.

Generated as part of the Phase 2 scaffolding of the LLM Wiki bootstrap.