Vault-for-LLM
Health Pass
- License – License: MIT
- Description – Repository has a description
- Active repo – Last push 0 days ago
- Community trust – 33 GitHub stars
Code Pass
- Code scan – Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions – No dangerous permissions requested
This tool is a local-first knowledge management system and MCP server that gives LLM agents persistent, searchable memory using a four-layer architecture with SQLite and ONNX embeddings.
Security Assessment
The tool is designed to run entirely locally with zero cloud dependency. A scan of 12 files found no dangerous patterns, hardcoded secrets, or requests for dangerous permissions. It does not automatically execute unsafe shell commands or make unauthorized network requests. Overall risk is rated as Low.
Quality Assessment
The project is under active development, with its last push occurring today. It uses the permissive MIT license and includes clear, multi-language documentation. Although the codebase is relatively new, it has already garnered 33 GitHub stars, indicating a positive early response and moderate community trust.
Verdict
Safe to use.
Local-first knowledge system for LLM agents – sqlite-vec + ONNX embeddings, no cloud/Docker/PyTorch dependency
Vault-for-LLM
繁體中文 | 简体中文 | English
A local-first, open-source knowledge management system for LLM agents.
Zero cloud dependency. Zero Docker. Zero PyTorch. Just `pip install` and go.
What is Vault-for-LLM?
Vault-for-LLM is a four-layer hierarchical knowledge base designed to give any LLM agent persistent, searchable memory. It runs entirely locally using SQLite + sqlite-vec + ONNX embeddings.
Key Features
- Four-layer architecture (L0–L3) for structured knowledge injection
- Hybrid search: keyword + semantic vector search (ONNX, no GPU needed)
- Knowledge graph: auto-inferred entities and edges with 2-hop BFS expansion
- Atomic claims with source citations: sub-chunk granularity, every claim traceable to original text
- Self-questioning convergence: system judges if it "knows enough" to explain a topic (KAL-inspired)
- Cross-family LLM validation: extract with one model, verify with another to catch hallucinations
- Freshness tracking + FSRS spaced repetition: automated staleness detection and review scheduling
- AAAK compression: 6x compression for compiled knowledge
- Trust scoring: every knowledge entry has a confidence score (0.0–1.0)
- Lint & contradiction detection: automatic quality checks
- MCP server: expose your vault to any MCP-compatible AI agent mid-conversation
- CLI-first: 20+ commands for full lifecycle management
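The 2-hop graph expansion listed above is, per the README, implemented as a recursive SQL CTE; the traversal itself is a plain bounded breadth-first search. A minimal Python sketch of that idea (the function name and adjacency data are illustrative, not the project's API):

```python
from collections import deque

def expand_graph(edges, start, max_hops=2):
    """Bounded BFS over an adjacency map: collect every node
    reachable from `start` within `max_hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not walk past the hop limit
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen
```

With a 2-hop limit, retrieval can pull in entities related to a search hit without flooding the context with the whole graph.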
Architecture
```
L0 Identity       → Who the user is                     (injected every conversation)
L1 Core Facts     → Environment & active projects       (injected every conversation)
L2 Context        → Recent decisions & troubleshooting  (auto-updated daily)
L3 Deep Knowledge → Architecture, techniques, lessons   (searched on demand)
```
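A minimal sketch of how an agent could assemble its context from these layers: L0–L2 are always injected, while L3 is consulted only when there is a query. The `build_context` helper and the in-memory vault dict are hypothetical stand-ins for the real database-backed search:

```python
def build_context(vault, query=None):
    """Assemble prompt context from the four layers.
    L0/L1/L2 are always included; L3 entries are added only when a
    query matches (a toy substring match standing in for hybrid search)."""
    parts = [vault["L0"], vault["L1"], vault["L2"]]
    if query:
        parts += [entry for entry in vault["L3"] if query.lower() in entry.lower()]
    return "\n\n".join(parts)
```

This keeps the always-injected footprint small while still letting deep knowledge surface on demand.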
What's New in v0.4.0
| Feature | Description |
|---|---|
| Convergence Check | KAL-inspired self-questioning loop – system asks "Can I explain this?" and keeps learning until it can |
| Cross Validation | Asymmetric LLM verification – extract claims with Model A, verify with Model B |
| Freshness Tracking | Automatic staleness detection + FSRS interval scheduling for knowledge review |
| Atomic Claims | Claims at sub-chunk granularity with source_span citations for precision retrieval |
| Graph Expansion | 2-hop recursive CTE walk through knowledge graph for contextual retrieval |
| MCP Server | Model Context Protocol server – let any chat AI query and inject knowledge mid-conversation |
| Updated CLI | New commands: vault converge, vault cross-validate, vault freshness |
See CHANGELOG.md for full details.
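Cross-family validation reduces to a simple filter once the two models are treated as callables. A hedged sketch (the function and the callable signatures are illustrative, not the project's internals):

```python
def cross_validate(source_text, extract, verify):
    """Asymmetric verification: `extract` (Model A) turns text into
    claim strings; `verify` (Model B, a different model family) checks
    each claim against the source. Only confirmed claims survive."""
    return [claim for claim in extract(source_text) if verify(source_text, claim)]
```

Because the two models come from different families, a hallucination produced by one is unlikely to be independently reproduced by the other.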
Quick Start
```bash
# Install
pip install -e .

# Initialize a project
vault init

# Add knowledge
vault add "My First Entry" --content "Something I learned today"

# Compile (raw → database + compiled)
vault compile

# Search
vault search "my query"

# Health check
vault doctor
```
See INSTALL.md for detailed installation options.
Directory Structure
```
your-project/
├── vault.yaml            ← Project config (auto-generated by `vault init`)
├── L0-identity/          ← Who the user is (injected every conversation)
│   └── identity.md
├── L1-core-facts/        ← Core facts (injected every conversation)
│   └── current-projects.md
├── L2-context/           ← Dynamic context (auto-updated daily)
│   ├── recent-sessions/
│   └── current.md
└── L3-knowledge/         ← Deep knowledge (searched on demand)
    ├── raw/              ← Raw knowledge input (your .md files go here)
    ├── compiled/         ← AAAK compressed backup (auto-generated)
    └── templates/        ← Clean templates for L0/L1/L2
```
AI Integration Guide
Any LLM Agent (Universal)
- Read this README to understand the architecture
- Read `L0-identity/identity.md` to know the user
- Read `L1-core-facts/current-projects.md` for current state
- Use `vault search "query"` for semantic search

Claude Code / Cursor / Any AI IDE
- Copy `CLAUDE.md` (included) into your project root
- For deep knowledge, search `compiled/` or `raw/`
- Use `rg "keyword" raw/ compiled/` for fast lookup
MCP Integration (Chat with your vault)
Connect your vault to any MCP-compatible AI agent:
```bash
# Install MCP dependencies
pip install "vault-for-llm[mcp]"

# Start the server
vault-mcp --project-dir /path/to/your/project
```
Now your AI can search, add, and query knowledge mid-conversation – no manual copy-paste needed.
CLI Reference
| Command | Description |
|---|---|
| `vault init` | Initialize a new project |
| `vault doctor` | Health check |
| `vault add "Title" --content "..."` | Add knowledge entry |
| `vault add "Title" --file notes.md` | Add from file |
| `vault import doc.md` | Import long document (auto-chunked) |
| `vault compile` | Compile `raw/` → database + `compiled/` |
| `vault search "query"` | Search (auto: keyword + semantic) |
| `vault search "query" --graph-expand 2` | Search + 2-hop graph expansion |
| `vault list` | List all entries |
| `vault stats` | Show database statistics |
| `vault lint` | Run quality checks |
| `vault converge` | Self-questioning convergence check |
| `vault cross-validate` | Cross-family LLM validation |
| `vault freshness` | Freshness + review scheduling |
| `vault dedup` | Detect semantic duplicates |
| `vault dedup --dry-run` | Preview merge plan (no changes) |
| `vault dedup --merge` | Auto-merge duplicates (keeps higher trust) |
| `vault graph build` | Build knowledge graph |
| `vault graph show` | Show graph summary |
| `vault graph export --format mermaid` | Export graph as Mermaid diagram |
| `vault graph expand <id>` | Expand from a specific node |
| `vault config set <key> <value>` | Set config (e.g. embedding provider) |
MCP Server (Claude Code / Cursor / OpenClaw)
Expose your vault directly to any MCP-compatible AI agent:
```bash
# Install MCP dependencies
pip install "vault-for-llm[mcp]"

# Start the server (run from your project directory)
vault-mcp

# Or specify the path explicitly
vault-mcp --project-dir /path/to/your/project
```
Add to your Claude Code config (`~/.claude/claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "vault": {
      "command": "vault-mcp",
      "args": ["--project-dir", "/path/to/your/project"]
    }
  }
}
```
Available MCP tools: `vault_search`, `vault_add`, `vault_get`, `vault_list`, `vault_stats`
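Under the hood, these tools are invoked via standard MCP `tools/call` JSON-RPC requests. An MCP-compatible client would send something like the following (the argument shape for `vault_search` is illustrative, not taken from the project's schema):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "vault_search",
    "arguments": { "query": "sqlite-vec indexing" }
  }
}
```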
Knowledge File Format
All .md files use YAML frontmatter:
```yaml
---
title: "Knowledge Title"
category: "concept|technique|workflow|lesson|error|comparison"
layer: "L0|L1|L2|L3"
tags: ["tag1", "tag2"]
trust: 0.0-1.0
source: "source-description"
created: "YYYY-MM-DD"
---
```
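A complete entry following this schema might look like the file below. The content is hypothetical, shown only to illustrate how the frontmatter and body fit together:

```markdown
---
title: "SQLite WAL Mode Gotchas"
category: "lesson"
layer: "L3"
tags: ["sqlite", "concurrency"]
trust: 0.9
source: "debugging session"
created: "2025-01-15"
---

WAL mode allows concurrent readers during a write, but the -wal file
must live on the same filesystem as the database.
```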
Trust Score Guide
| Range | Meaning |
|---|---|
| 0.9+ | Verified by real experience |
| 0.7–0.8 | High confidence from documentation |
| 0.5–0.6 | General knowledge, not yet verified |
| < 0.3 | Unverified, needs review |
Compiler
```bash
vault compile
```

What it does:
- `raw/` → database (upsert by content hash)
- `raw/` → `compiled/` (AAAK 6x compression)
- Extract atomic claims with `source_span` citations
- Auto L2 update + lint health check + git commit
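Upsert-by-content-hash means re-compiling an unchanged file is a no-op, while an edited file updates its existing row. A minimal sketch with an assumed `entries` schema (not the project's actual table layout):

```python
import hashlib
import sqlite3

def upsert_entry(db, path, body):
    """Insert or update a knowledge entry, skipping unchanged content.
    The SHA-256 of the body serves as the change detector."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    row = db.execute(
        "SELECT content_hash FROM entries WHERE path = ?", (path,)
    ).fetchone()
    if row and row[0] == digest:
        return "unchanged"  # same content hash: nothing to do
    db.execute(
        "INSERT INTO entries (path, body, content_hash) VALUES (?, ?, ?) "
        "ON CONFLICT(path) DO UPDATE SET "
        "body = excluded.body, content_hash = excluded.content_hash",
        (path, body, digest),
    )
    return "upserted"

# Hypothetical schema for the sketch above
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE entries (path TEXT PRIMARY KEY, body TEXT, content_hash TEXT)"
)
```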
Tech Stack
| Component | Technology | Why |
|---|---|---|
| Database | SQLite + sqlite-vec | Zero-config, portable, vector search |
| Embeddings | ONNX Runtime (~150MB) | No PyTorch/GPU needed |
| Search | Hybrid (keyword + vector + graph expansion) | Best of both worlds |
| Graph | SQLite (entities + edges + 2-hop CTE) | Lightweight relationship tracking |
| Compression | AAAK format | 6x size reduction |
| Validation | Cross-family LLM + Convergence check | Catch what single models miss |
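Hybrid search generally blends the two score sources after normalizing each to a common scale. An illustrative sketch of score fusion (the max-normalization and the `alpha` weighting are assumptions, not the project's exact ranking method):

```python
def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend max-normalized keyword and vector-similarity scores.
    alpha=1.0 means pure keyword ranking, alpha=0.0 pure vector."""
    def norm(scores):
        top = max(scores.values(), default=0.0) or 1.0
        return {doc: s / top for doc, s in scores.items()}
    kw, vec = norm(keyword_scores), norm(vector_scores)
    docs = set(kw) | set(vec)
    blended = {
        d: alpha * kw.get(d, 0.0) + (1 - alpha) * vec.get(d, 0.0)
        for d in docs
    }
    return sorted(blended, key=blended.get, reverse=True)
```

A document that scores moderately on both channels can outrank one that dominates only a single channel, which is the "best of both worlds" the table refers to.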
Requirements
- Python 3.10+
- ~150MB for ONNX embedding model (optional)
- No GPU, no Docker, no cloud account needed
License
MIT License – see LICENSE.
Built for developers who want their AI agents to actually remember things.