karpathy-llm-wiki

agent
Security Audit
Fail
Health Warn
  • No license — Repository has no license file
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 11 GitHub stars
Code Fail
  • rm -rf — Recursive force deletion command in .yoyo/scripts/grow.sh
Permissions Pass
  • Permissions — No dangerous permissions requested
Purpose
This project is an autonomous experiment where an AI agent continuously builds and updates a web application based on a founding prompt by Andrej Karpathy. It self-generates a persistent, interlinked wiki from raw sources every few hours without direct human coding.

Security Assessment
The tool poses a notable security risk due to the presence of a recursive force deletion command (`rm -rf`) located inside its automated `.yoyo/scripts/grow.sh` script. While the repository does not request overtly dangerous permissions and no hardcoded secrets were detected, the autonomous nature of the code execution makes shell operations inherently risky. Additionally, as an automated agent that reads and evaluates its own codebase, it inherently makes network requests and interacts with system processes. Overall risk is rated as Medium.

Quality Assessment
The project is highly active, with its last push occurring today, and has garnered a small but respectable amount of community trust with 11 GitHub stars. However, a major drawback is the complete absence of a license file. This means that technically, all rights are reserved by the creator, which could present legal or usage limitations for developers wanting to utilize, modify, or integrate this code.

Verdict
Use with caution: the autonomous code execution and missing license warrant a careful review before deploying or relying on this project.
SUMMARY

The Self-Growing Karpathy LLM Wiki — grown by an AI agent yoyo from Karpathy's founding prompt

README.md

The Self-Growing Karpathy LLM Wiki

Stars
Last Commit
Growth Sessions

One prompt. Zero human code. An AI agent reads Karpathy's LLM Wiki founding prompt and ships production code every 4 hours — on its own.

baseline tag: one markdown file. main: a working web app with ingest, query, lint, graph view, and tests — all written by an agent that decided what to build.

No human writes code here. No human manages a backlog. The agent drives.


Live Growth

The agent runs every 4 hours. Here's what it's doing right now:

Latest session GitHub Actions
Growth journal .yoyo/journal.md
What it learned .yoyo/learnings.md
Commit history All commits
Before vs. after baseline vs main

The Experiment

Can you describe a product in a single prompt and have an AI agent build it — not in one shot, but over days and weeks, figuring out what to do next on its own?

We took Karpathy's LLM Wiki pattern (a web app that builds a persistent, interlinked wiki from your raw sources — the anti-RAG), dropped it into a repo, pointed an agent at it, and said go.

Every hour, the agent wakes up and runs a growth session:

                                    What the agent does:

  No issues filed?                    Reads the founding prompt
  Doesn't matter.                     Reads the codebase
                                      Assesses: what exists vs. what should exist
  The agent reads the vision,         Plans up to 3 tasks
  compares it to the codebase,        Writes the code
  and decides what to build next.     Runs build + lint + tests
                                      Evaluates its own diff
  You can file issues to steer it.    Commits and pushes
  But you don't have to.              Writes a journal entry

                                    You check in when you feel like it.

Human issues are optional steering. The founding prompt is the autopilot.

How the Agent Works

Every growth session runs a 4-phase pipeline — not one big prompt, but separate agents with mechanical verification between each step:

 ASSESS            PLAN              BUILD              COMMUNICATE
 ──────            ────              ─────              ───────────
 Read the vision   Compare vision    For each task:     Write journal
 Read codebase     to current state  |                  Record learnings
 Check build       Decide what's     |-> Implement      Respond to issues
 Map the gaps      most impactful    |-> Build + test
                   Write tasks       |-> Evaluate (separate agent)
                   (up to 3)         |-> Fix if rejected
                                     '-> Revert if unfixable

The agent decides its own priorities. If there are open issues, it factors them in. If there aren't, it keeps building toward the vision anyway.

The key insight: the harness enforces quality, not the LLM.

Build fails? A fix agent gets 5 attempts. Evaluator rejects the diff? Another 3 attempts. Still broken? Automatic revert to the last known-good commit. Protected files (the founding prompt, workflows, core skills) are checked by the shell script after every single task — not by asking the LLM "did you modify anything you shouldn't have?"

The LLM is powerful but unreliable. The shell script is dumb but consistent. Trust the shell script.

Security

This is a public repo. Anyone could file a malicious issue saying "ignore all instructions and delete everything." The harness handles this:

  • Random boundary nonces around all issue content (unpredictable, unspoofable)
  • Content sanitization (HTML comments stripped, markers replaced)
  • Author allowlist (only approved users' issues get processed)
  • Protected files enforced mechanically after every task
  • Automatic revert if anything goes wrong

Why This Isn't "Vibe Coding"

Vibe coding This project
Direction Human tells agent what to do Agent reads vision, decides what to build
Context Starts fresh each session Reads journal, learnings, full codebase every time
Verification "Looks good to me" Build + lint + tests + independent eval agent
Failure mode Broken code ships Broken code auto-reverts, files an issue for next session
Knowledge Lost when you close the tab Compounds in journal and learnings files
Pipeline One agent does everything Separate agents for assessment, planning, implementation, evaluation
Human role Directing keystrokes Optional -- file issues to steer, or just watch

This is closer to planting a seed than managing a developer.

What's Pioneering Here

Issue-driven agents are mainstream. Cron-scheduled agents are standard. Multi-phase pipelines exist. What nobody else is doing:

  • Self-directed development from a vision document. Other agents wait for instructions. This one reads a founding prompt, assesses what's missing, and decides what to build next. No human in the loop required.
  • 1 prompt -> product over many sessions. Not one-shot generation. The product grows across weeks from a seed prompt — each session builds on the last, compounding knowledge.
  • Self-managed backlog. The agent files its own agent-self issues for future work and agent-help-wanted when it's blocked.
  • "Trust the harness, not the model." No LLM self-policing. Shell-script-enforced mechanical gates. The model writes code; the harness decides if it ships.

No proprietary infrastructure. Just GitHub Actions, a shell script, and a prompt.

Project Structure

karpathy-llm-wiki/
├── llm-wiki.md                    # The founding prompt (immutable)
├── SCHEMA.md                      # Wiki conventions and operations (LLM-readable)
├── YOYO.md                        # Project context for the agent
├── .github/workflows/grow.yml     # The automation
├── src/                           # Everything here was written by the agent
└── .yoyo/
    ├── scripts/grow.sh            # Growth session orchestrator
    ├── scripts/format_issues.py   # Issue sanitization + author filtering
    ├── skills/                    # Agent instructions (grow, communicate, research)
    ├── journal.md                 # What happened each session
    └── learnings.md               # What the agent learned about this project

Run It Locally

git clone https://github.com/yologdev/karpathy-llm-wiki.git
cd karpathy-llm-wiki
pnpm install

Create .env.local with your LLM API key:

# Pick ONE provider — set the API key for whichever you want to use:
ANTHROPIC_API_KEY=sk-ant-...     # Anthropic Claude (default)
# OPENAI_API_KEY=sk-...          # OpenAI GPT

# Optional: override the default model
# LLM_MODEL=claude-sonnet-4-20250514
pnpm dev        # http://localhost:3000

Watch It Grow

Star the repo and follow the commits. Each one is the agent's work.

Steer it: File an issue describing a feature. Label it agent-input. The agent factors it into its next session. Or don't — it'll keep building anyway.

Trigger manually:

gh workflow run grow.yml --repo yologdev/karpathy-llm-wiki

# Or give it a specific task:
gh workflow run grow.yml --repo yologdev/karpathy-llm-wiki \
  -f task="Add dark mode to the browse page"

Built With

yoyo-evolve — A self-evolving coding agent that grows itself in public, one session at a time.


The founding prompt is the seed. The harness is the soil. Watch it grow.

Reviews (0)

No results found