llm-wiki-agent

Security Audit
Pass
Health Pass
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push today
  • Community trust — 1193 GitHub stars
Code Pass
  • Code scan — Scanned 4 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is a personal knowledge base agent that automatically reads, extracts, and organizes information from your documents into a persistent, interlinked wiki structure. It acts as an interface for various coding agents (like Claude Code or Gemini CLI) or runs as a standalone Python script.

Security Assessment
Overall Risk: Low. The light code audit scanned four core files and found no dangerous patterns, hardcoded secrets, or requests for dangerous permissions. To function, the standalone Python scripts require a manually provided API key (via standard environment variables) to make external network requests to AI models. When used via coding agents, it operates locally within your file system, reading from a designated raw folder and writing standard Markdown files. No hidden shell executions or unauthorized data collection were identified.

Quality Assessment
Overall Quality: High. The project is actively maintained, with repository pushes occurring as recently as today. It carries a permissive MIT license, making it safe for both personal and commercial use. It has garnered nearly 1,200 GitHub stars, which indicates a strong level of community trust and validation. The repository includes clear documentation and setup instructions for multiple environments.

Verdict
Safe to use.
SUMMARY

A personal knowledge base that builds and maintains itself. Drop in sources — Claude (or Codex/Gemini) reads them, extracts knowledge, and maintains a persistent interlinked wiki. Works with Claude Code, Codex, OpenCode, Gemini CLI. No API key needed.

README.md

LLM Wiki Agent

A personal knowledge base that builds and maintains itself. Drop in source documents — articles, papers, notes — and the LLM reads them, extracts the knowledge, and integrates everything into a persistent, interlinked wiki. You never write the wiki. Claude does.

Unlike RAG systems that re-derive knowledge from scratch on every query, LLM Wiki Agent compiles knowledge once and keeps it current. Cross-references are pre-built. Contradictions are flagged at ingest time. Every new source makes the wiki richer.

How It Works

You drop a source → Claude reads it → wiki pages are created/updated → graph is rebuilt

You ask a question → Claude reads relevant wiki pages → synthesizes answer with citations

Three layers:

  • raw/ — your source documents (immutable, you own this)
  • wiki/ — Claude-maintained markdown pages (Claude writes, you read)
  • graph/ — auto-generated knowledge graph visualization
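The ingest workflow above can be sketched in plain Python. The function name, page template, and index format here are illustrative assumptions, not the repo's actual implementation; in the real tool the extraction step in the middle is performed by the LLM, not by code.

```python
# Hypothetical sketch of the ingest flow: raw source in, wiki page out,
# index updated. Directory names mirror the layout described above.
from pathlib import Path

WIKI = Path("wiki")

def ingest(source: Path) -> Path:
    """File a per-source wiki page and register it in the index.

    The LLM's knowledge extraction is stubbed out: this sketch only
    shows the bookkeeping side of the workflow.
    """
    text = source.read_text(encoding="utf-8")
    page = WIKI / "sources" / (source.stem + ".md")
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(
        f"# {source.stem}\n\nSource: [[{source.name}]]\n\n{text[:200]}\n",
        encoding="utf-8",
    )
    # Append to the index only if this page is not already listed.
    index = WIKI / "index.md"
    entry = f"- [[sources/{source.stem}]]\n"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# Index\n\n"
    if entry not in existing:
        index.write_text(existing + entry, encoding="utf-8")
    return page
```

Re-running ingest on the same source is a no-op for the index, which matches the "compiles once, keeps current" behavior described later.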

Quick Start — Any Coding Agent (no API key needed)

Works with Claude Code, Codex, OpenCode, Gemini CLI, and any agent that reads a config file from the repo root.

Agent — config file read automatically:

  • Claude Code — CLAUDE.md + .claude/commands/
  • OpenAI Codex — AGENTS.md
  • OpenCode / Pear AI — AGENTS.md
  • Gemini CLI — GEMINI.md
  • Any other agent — point it at AGENTS.md or README.md
git clone https://github.com/SamurAIGPT/GPT-Agent.git
cd GPT-Agent

claude          # Claude Code
codex           # OpenAI Codex
opencode        # OpenCode
gemini          # Gemini CLI

Each agent reads its config file and follows the same workflows. Then talk to it:

# Claude Code slash commands:
/wiki-ingest raw/articles/my-article.md
/wiki-query what are the main themes across all sources?
/wiki-lint
/wiki-graph

# Any agent (plain English):
"Ingest this paper: raw/papers/my-paper.md"
"What does the wiki say about X?"
"Check for contradictions"
"Build the knowledge graph"

Quick Start — Standalone Python (requires API key)

pip install -r requirements.txt
export ANTHROPIC_API_KEY=your_key_here

python tools/ingest.py raw/articles/my-article.md
python tools/query.py "What are the main themes?"
python tools/query.py "How does X relate to Y?" --save
python tools/build_graph.py --open
python tools/lint.py --save

Architecture

raw/                    ← your sources (never modified by LLM)
wiki/
  index.md              ← catalog of all pages (updated on every ingest)
  log.md                ← append-only operation log
  overview.md           ← living synthesis across all sources
  sources/              ← one page per source document
  entities/             ← people, companies, projects
  concepts/             ← ideas, frameworks, methods
  syntheses/            ← answers to queries, filed back as pages
graph/
  graph.json            ← node/edge data (SHA256-cached)
  graph.html            ← interactive vis.js visualization
tools/
  ingest.py             ← process a new source
  query.py              ← ask a question
  lint.py               ← health-check the wiki
  build_graph.py        ← rebuild the knowledge graph
CLAUDE.md               ← schema and workflow instructions for the LLM

Commands

Claude Code (primary — no API key)

Slash command — what it does:

  • /wiki-ingest <file> — Read a source, update wiki pages, append to log
  • /wiki-query <question> — Search wiki, synthesize answer with citations
  • /wiki-lint — Check for orphans, broken links, contradictions, gaps
  • /wiki-graph — Build knowledge graph (graph.json + graph.html)

Or describe what you want in plain English — Claude Code follows CLAUDE.md and does the right thing.

Standalone Python (optional — requires ANTHROPIC_API_KEY)

Command — what it does:

  • python tools/ingest.py <file> — Ingest a source
  • python tools/query.py "<question>" — Query the wiki
  • python tools/query.py "<question>" --save — Query and file answer back
  • python tools/lint.py — Lint the wiki
  • python tools/build_graph.py — Build graph
  • python tools/build_graph.py --no-infer — Build graph (skip inference, faster)
  • python tools/build_graph.py --open — Build and open in browser

The Graph

build_graph.py runs two passes:

  1. Deterministic — parse all [[wikilinks]] in every page → explicit edges tagged EXTRACTED
  2. Semantic — Claude infers implicit relationships not captured by wikilinks → edges tagged INFERRED (with confidence) or AMBIGUOUS

Community detection (Louvain) clusters nodes by topic. The output is a self-contained graph.html — open it in any browser. SHA256 caching means only changed pages are reprocessed.
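The deterministic pass and the SHA256 cache can be sketched as follows. The wikilink regex, cache-file location, and edge dictionary here are assumptions for illustration; the actual graph.json schema and inference pass are not reproduced.

```python
# Sketch of pass 1: parse [[wikilinks]] into explicit EXTRACTED edges,
# skipping any page whose SHA256 digest matches the cached value.
import hashlib
import json
import re
from pathlib import Path

# Captures the target of [[Target]] and [[Target|alias]] links.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def extract_edges(wiki_dir: str, cache_file: str = "graph/cache.json"):
    cache_path = Path(cache_file)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    edges = []
    for page in Path(wiki_dir).rglob("*.md"):
        text = page.read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if cache.get(str(page)) == digest:
            continue  # unchanged since the last build; skip reprocessing
        cache[str(page)] = digest
        for target in WIKILINK.findall(text):
            edges.append({
                "source": page.stem,
                "target": target.strip(),
                "tag": "EXTRACTED",  # pass 2 would add INFERRED/AMBIGUOUS
            })
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(cache, indent=2))
    return edges
```

On a second run over an unchanged wiki, every page hits the cache and the pass returns no new edges, which is the behavior the SHA256 caching buys you.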

CLAUDE.md

CLAUDE.md is the schema document — it tells the LLM how to maintain the wiki. It defines page formats, ingest/query/lint workflows, naming conventions, and log format. This is the key configuration file. Edit it to customize behavior for your domain.
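As a purely hypothetical excerpt (the real schema ships in the repo and will differ), a domain customization in CLAUDE.md might look like:

```markdown
## Page format — sources/
Every source page opens with title, ingest date, and a link back to the
raw file, then a "Key claims" section whose claims use [[wikilinks]].

## Naming
Kebab-case filenames. People and organizations go in entities/,
ideas and methods in concepts/.

## Log
Append one line per operation to wiki/log.md: date, command, pages touched.
```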

What Makes This Different from RAG

RAG → LLM Wiki Agent:

  • Re-derives knowledge on every query → compiles once, keeps current
  • Raw chunks as the retrieval unit → structured wiki pages
  • No cross-references → cross-references pre-built
  • Contradictions surface at query time (maybe) → flagged at ingest time
  • No accumulation → every source makes the wiki richer

Use Cases

  • Research — go deep on a topic over weeks; every paper/article updates the same wiki
  • Reading — build a companion wiki as you read a book; by the end you have a rich reference
  • Personal knowledge — file journal entries, health notes, goals; build a structured picture over time
  • Business — feed in meeting transcripts, Slack threads, docs; LLM does the maintenance no one wants to do

Tips

  • Use Obsidian to read/browse the wiki — follow links, check graph view
  • Use Obsidian Web Clipper to clip web articles directly to raw/
  • The wiki is a git repo — you get version history for free
  • File good query answers back with --save — your explorations compound just like ingested sources

License

MIT License — see LICENSE for details.
