codedb
Health: Passed
- License — License: BSD-3-Clause
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 239 GitHub stars
Code: Warning
- fs module — File system access in .github/workflows/bench-regression.yml
- network request — Outbound network request in install/worker.js
- network request — Outbound network request in website/worker/worker.js
Permissions: Passed
- Permissions — No dangerous permissions requested
This tool provides a code intelligence server and MCP toolset built in Zig. It gives AI agents fast structural indexing, search, and editing capabilities across a codebase.
Security Assessment
Overall risk: Low. The server ships with sensible security defaults: its HTTP server has no authentication, but it binds to localhost only, which keeps it off external networks. It does not request dangerous permissions, and no hardcoded secrets were found. There are some outbound network requests, but the automated scan traced these to website and Cloudflare Worker installation files rather than the core application. File system access is limited to a GitHub Actions benchmarking workflow. Because this is an alpha-stage project, the snapshot format and APIs are still stabilizing, while MCP communication over standard input/output is described as stable.
Quality Assessment
This is a highly active and promising project. It uses the permissive BSD-3-Clause license and is under very active development, with the most recent push occurring today. It has garnered 239 GitHub stars, indicating a solid level of community trust and developer interest for a niche utility. The README is thorough, transparent about its current alpha status, and clearly outlines what features are production-ready versus still in progress.
Verdict
Safe to use.
Zig code intelligence server and MCP toolset for AI agents. Fast tree, outline, symbol, search, read, edit, deps, snapshot, and remote GitHub repo queries.
codedb
Code intelligence server for AI agents. Zig core. MCP native. Zero dependencies.
Structural indexing · Trigram search · Word index · Dependency graph · File watching · MCP + HTTP
Status · Install · Quick Start · MCP Tools · Benchmarks · Architecture · Data & Privacy · Building
Status
Alpha software — API is stabilizing but may change
codedb works and is used daily in production AI workflows, but:
- Language support — Zig, Python, TypeScript/JavaScript (more planned)
- No auth — HTTP server binds to localhost only
- Snapshot format may change between versions
- MCP protocol is JSON-RPC 2.0 over stdio (stable)
| What works today | What's in progress |
|---|---|
| 12 MCP tools for full codebase intelligence | Additional language parsers |
| Trigram-accelerated full-text search | WASM target for Cloudflare Workers |
| O(1) inverted word index for identifier lookup | Incremental snapshot updates |
| Structural outlines (functions, structs, imports) | Multi-project support |
| Reverse dependency graph | Remote indexing over SSH |
| Atomic line-range edits with version tracking | |
| Auto-registration in Claude, Codex, Gemini, Cursor | |
| Polling file watcher with filtered directory walker | |
| Portable snapshot for instant MCP startup | |
| Multi-agent support with file locking + heartbeats | |
| Codesigned + notarized macOS binaries | |
| Cross-platform: macOS (ARM/x86), Linux (ARM/x86) | |
⚡ Install
curl -fsSL https://codedb.codegraff.com/install.sh | sh
Downloads the binary for your platform and auto-registers codedb as an MCP server in Claude Code, Codex, Gemini CLI, and Cursor.
| Platform | Binary | Signed |
|---|---|---|
| macOS ARM64 (Apple Silicon) | codedb-darwin-arm64 | ✅ codesigned + notarized |
| macOS x86_64 (Intel) | codedb-darwin-x86_64 | ✅ codesigned + notarized |
| Linux ARM64 | codedb-linux-arm64 | — |
| Linux x86_64 | codedb-linux-x86_64 | — |
Or install manually from GitHub Releases.
⚡ Quick Start
As an MCP server (recommended)
After installing, codedb is automatically registered. Just open a project and the 12 MCP tools are available to your AI agent.
# Manual MCP start (auto-configured by install script)
codedb mcp /path/to/your/project
As an HTTP server
codedb serve /path/to/your/project
# listening on localhost:7719
CLI
codedb tree /path/to/project # file tree with symbol counts
codedb outline src/main.zig # symbols in a file
codedb find AgentRegistry # find symbol definitions
codedb search "handleAuth" # full-text search (trigram-accelerated)
codedb word Store # exact word lookup (inverted index, O(1))
codedb hot # recently modified files
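The "exact word lookup (inverted index, O(1))" behaviour can be pictured as a hash map from identifier to the places it occurs. The sketch below is an illustration in Python, not codedb's Zig implementation; the `WordIndex` class, its method names, and the identifier regex are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed tokenizer: C-style identifiers. codedb's real rules may differ.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class WordIndex:
    """Hypothetical inverted index: word -> list of (path, line_number)."""

    def __init__(self):
        self.hits = defaultdict(list)

    def add(self, path, text):
        # Record every identifier occurrence once, at index-build time.
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in WORD.finditer(line):
                self.hits[match.group()].append((path, line_number))

    def lookup(self, word):
        # A single hash lookup, independent of how many files are indexed.
        return self.hits.get(word, [])


word_index = WordIndex()
word_index.add("src/store.zig", "const Store = struct {};\nfn init() Store {}")
print(word_index.lookup("Store"))
print(word_index.lookup("Sto"))  # prefixes do not match: lookup is exact
```

The lookup cost is paid when the file is indexed, which is why an exact-word query does not need to rescan file contents.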
🔧 MCP Tools
12 tools over the Model Context Protocol (JSON-RPC 2.0 over stdio):
| Tool | Description |
|---|---|
| `codedb_tree` | Full file tree with language, line counts, symbol counts |
| `codedb_outline` | Symbols in a file: functions, structs, imports, with line numbers |
| `codedb_symbol` | Find where a symbol is defined across the codebase |
| `codedb_search` | Trigram-accelerated full-text search |
| `codedb_word` | O(1) inverted index word lookup |
| `codedb_hot` | Most recently modified files |
| `codedb_deps` | Reverse dependency graph (which files import this file) |
| `codedb_read` | Read file content |
| `codedb_edit` | Apply line-range edits (atomic writes) |
| `codedb_changes` | Changed files since a sequence number |
| `codedb_status` | Index status (file count, current sequence) |
| `codedb_snapshot` | Full pre-rendered JSON snapshot of the codebase |
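To make "JSON-RPC 2.0 over stdio" concrete, here is a minimal Python sketch of how a client might frame one tool call. The `tools/call` method and newline-delimited framing come from the MCP specification; the `path` argument name is an assumption that mirrors the `path` query parameter of the HTTP `/outline` endpoint, and the helper functions are hypothetical. A real session also starts with an `initialize` handshake, which is omitted here.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build one JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def frame(message):
    """Serialize a message as one newline-terminated line for the stdio transport."""
    # Compact separators keep the JSON on a single line with no stray spaces.
    return json.dumps(message, separators=(",", ":")) + "\n"


# Assumed argument name: "path" is a guess, not a documented codedb schema.
request = make_tool_call(1, "codedb_outline", {"path": "src/store.zig"})
line = frame(request)
print(line, end="")
```

An agent host would write `line` to the stdin of the `codedb mcp` process and read one JSON response line back from its stdout.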
Example: agent explores a codebase
# 1. Get the file tree
curl localhost:7719/tree
# → src/main.zig (zig, 55L, 4 symbols)
#   src/store.zig (zig, 156L, 12 symbols)
#   src/agent.zig (zig, 135L, 8 symbols)
# 2. Drill into a file
curl "localhost:7719/outline?path=src/store.zig"
# → L20: struct_def Store
#   L30: function init
#   L55: function recordSnapshot
# 3. Find a symbol across the codebase
curl "localhost:7719/symbol?name=AgentRegistry"
# → {"path":"src/agent.zig","line":30,"kind":"struct_def"}
# 4. Full-text search
curl "localhost:7719/search?q=handleAuth&max=10"
# 5. Check what changed
curl "localhost:7719/changes?since=42"
📊 Benchmarks
Measured on Apple M4 Pro, 48GB RAM. MCP = pre-indexed warm queries (20 iterations avg). CLI/external tools include process startup (3 iterations avg). Ground truth verified against Python reference implementation.
Latency — codedb MCP vs codedb CLI vs ast-grep vs ripgrep vs grep
codedb repo (20 files, 12.6k lines):
| Query | codedb MCP | codedb CLI | ast-grep | ripgrep | grep | MCP speedup |
|---|---|---|---|---|---|---|
| File tree | 0.04 ms | 52.9 ms | — | — | — | 1,253x vs CLI |
| Symbol search (`init`) | 0.10 ms | 54.1 ms | 3.2 ms | 6.3 ms | 6.5 ms | 549x vs CLI |
| Full-text search (`allocator`) | 0.05 ms | 60.7 ms | 3.2 ms | 5.3 ms | 6.6 ms | 1,340x vs CLI |
| Word index (`self`) | 0.04 ms | 59.7 ms | n/a | 7.2 ms | 6.5 ms | 1,404x vs CLI |
| Structural outline | 0.05 ms | 53.5 ms | 3.1 ms | — | 2.4 ms | 1,143x vs CLI |
| Dependency graph | 0.05 ms | 2.2 ms | n/a | n/a | n/a | 45x vs CLI |
merjs repo (100 files, 17.3k lines):
| Query | codedb MCP | codedb CLI | ast-grep | ripgrep | grep | MCP speedup |
|---|---|---|---|---|---|---|
| File tree | 0.05 ms | 54.0 ms | — | — | — | 1,173x vs CLI |
| Symbol search (`init`) | 0.07 ms | 54.4 ms | 3.4 ms | 6.3 ms | 3.6 ms | 758x vs CLI |
| Full-text search (`allocator`) | 0.03 ms | 54.1 ms | 2.9 ms | 5.1 ms | 3.7 ms | 1,554x vs CLI |
| Word index (`self`) | 0.04 ms | 54.7 ms | n/a | 6.3 ms | 4.2 ms | 1,518x vs CLI |
| Structural outline | 0.04 ms | 54.9 ms | 3.4 ms | — | 2.5 ms | 1,243x vs CLI |
| Dependency graph | 0.05 ms | 1.9 ms | n/a | n/a | n/a | 41x vs CLI |
Token Efficiency
codedb returns structured, relevant results — not raw line dumps. For AI agents, this means dramatically fewer tokens per query:
| Repo | codedb MCP | ripgrep / grep | Reduction |
|---|---|---|---|
| codedb (search `allocator`) | ~20 tokens | ~32,564 tokens | 1,628x fewer |
| merjs (search `allocator`) | ~20 tokens | ~4,007 tokens | 200x fewer |
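The reduction column is the ratio of the two token counts. A short Python check, using only the approximate figures from the table (the `reduction` helper is a hypothetical name):

```python
# Approximate token counts copied from the table above.
rows = {
    "codedb": {"codedb_mcp": 20, "grep": 32_564},
    "merjs": {"codedb_mcp": 20, "grep": 4_007},
}


def reduction(codedb_tokens, grep_tokens):
    """How many times fewer tokens the structured result uses."""
    return grep_tokens / codedb_tokens


for repo, counts in rows.items():
    factor = reduction(counts["codedb_mcp"], counts["grep"])
    print(f"{repo}: {round(factor):,}x fewer")
```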
Indexing Speed
codedb builds all indexes on startup (outlines, trigram, word, dependency graph) — not just a parse tree:
| Repo | Files | Lines | Cold start | Per file |
|---|---|---|---|---|
| codedb | 20 | 12.6k | 17 ms | 0.85 ms |
| merjs | 100 | 17.3k | 16 ms | 0.16 ms |
| openclaw/openclaw | 11,281 | 2.29M | 2.9 s | 6.66 ms |
| vitessio/vitess | 5,028 | 2.18M | ~2 s | 0.40 ms |
Indexes are built once on startup. After that, the file watcher keeps them updated incrementally (single-file re-index: <2ms). Queries never re-scan the filesystem.
Why codedb is fast
- MCP server indexes once on startup → all queries hit in-memory data structures (O(1) hash lookups)
- CLI pays ~55ms process startup + full filesystem scan on every invocation
- ast-grep re-parses all files through tree-sitter on every call (~3ms)
- ripgrep/grep brute-force scan every file on every call (~5-7ms)
- The MCP advantage: index once, query thousands of times at sub-millisecond latency
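The "index once, query many times" idea behind trigram search can be sketched in a few lines of Python. This is a generic illustration of the technique, not codedb's Zig code; the `TrigramIndex` class and its methods are hypothetical. Each file is indexed by its 3-character substrings, a query intersects the posting sets for its own trigrams, and only the surviving candidates are checked with a real substring test.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file paths
        self.contents = {}                # file path -> full text

    def add(self, path, text):
        self.contents[path] = text
        for gram in trigrams(text):
            self.postings[gram].add(path)

    def search(self, query):
        """Return sorted paths whose content contains query."""
        grams = trigrams(query)
        if not grams:
            # Queries shorter than 3 characters cannot use the index.
            candidates = set(self.contents)
        else:
            # A file can match only if it contains every trigram of the query.
            candidates = set.intersection(
                *(self.postings.get(gram, set()) for gram in grams)
            )
        # Sharing all trigrams is necessary but not sufficient, so verify.
        return sorted(p for p in candidates if query in self.contents[p])


index = TrigramIndex()
index.add("src/store.zig", "pub fn init(allocator: Allocator) Store {}")
index.add("src/main.zig", "pub fn main() void {}")
print(index.search("allocator"))
print(index.search("pub fn"))
```

The verification step matters: two files can share every trigram of a query without containing the query itself, so the index narrows the search rather than answering it outright.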
Feature Matrix
| Feature | codedb MCP | codedb CLI | ast-grep | ripgrep | grep | ctags |
|---|---|---|---|---|---|---|
| Structural parsing | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Trigram search index | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Inverted word index | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Dependency graph | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Version tracking | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Multi-agent locking | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Pre-indexed (warm) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| No process startup | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| MCP protocol | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Full-text search | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Atomic file edits | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| File watcher | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
codedb = tree-sitter + search index + dependency graph + agent runtime. Zero external dependencies. Pure Zig. Single binary.
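The reverse dependency graph in the matrix answers "which files import this file?". A minimal Python sketch of that inversion follows; the `reverse_deps` function and the import map are made up for illustration, and how codedb extracts imports from source is not shown.

```python
from collections import defaultdict


def reverse_deps(imports):
    """Invert {file: [files it imports]} into {file: [files that import it]}."""
    importers = defaultdict(set)
    for source, targets in imports.items():
        for target in targets:
            importers[target].add(source)
    return {target: sorted(sources) for target, sources in importers.items()}


# Hypothetical import map for a three-file project.
imports = {
    "src/main.zig": ["src/store.zig", "src/agent.zig"],
    "src/agent.zig": ["src/store.zig"],
}
reverse = reverse_deps(imports)
print(reverse["src/store.zig"])
```

With the inverted map in memory, finding the files affected by a change to `src/store.zig` is a single dictionary lookup.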
🏗️ Architecture
┌─────────────┐     ┌─────────────┐
│ HTTP :7719  │     │  MCP stdio  │
│ server.zig  │     │   mcp.zig   │
└──────┬──────┘     └──────┬──────┘
       │                   │
       └─────────┬─────────┘
                 │
      ┌──────────▼──────────┐
      │      Explorer       │
      │     explore.zig     │
      │  ┌───────────────┐  │
      │  │ WordIndex     │  │
      │  │ TrigramIndex  │  │
      │  │ Outlines      │  │
      │  │ Contents      │  │
      │  │ DepGraph      │  │
      │  └───────────────┘  │
      └──────────┬──────────┘
                 │
      ┌──────────▼──────────┐
      │        Store        │──── data.log
      │      store.zig      │
      └──────────┬──────────┘
                 │
      ┌──────────▼──────────┐
      │       Watcher       │ ← polls every 2s
      │     watcher.zig     │
      │  (FilteredWalker)   │
      └─────────────────────┘
No SQLite. No dependencies. Purpose-built data model:
- Explorer — structural index engine. Parses Zig, Python, TypeScript/JavaScript. Maintains outlines, trigram index, inverted word index, content cache, and dependency graph behind a single mutex.
- Store — append-only version log. Every mutation (snapshot, edit, delete) gets a monotonically increasing sequence number. Version history capped at 100 per file.
- Watcher — polling file watcher (2s interval). FilteredWalker prunes .git, node_modules, zig-cache, __pycache__, etc. before descending.
- Agents — first-class structs with cursors, heartbeats, and exclusive file locks. Stale agents reaped after 30s.
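The Store's contract (one monotonically increasing sequence number per mutation, at most 100 versions kept per file) can be sketched as follows. This is a simplified Python model under stated assumptions: the class, the method names, and the in-memory list standing in for `data.log` are hypothetical, and the real on-disk format is not described here.

```python
from collections import defaultdict, deque

MAX_VERSIONS_PER_FILE = 100  # cap stated in the list above


class Store:
    def __init__(self):
        self.seq = 0
        self.log = []  # append-only: (seq, path, op); stands in for data.log
        # Per-file history; deque(maxlen=...) silently drops the oldest entry.
        self.history = defaultdict(lambda: deque(maxlen=MAX_VERSIONS_PER_FILE))

    def record(self, path, op):
        """Append one mutation and return its sequence number."""
        self.seq += 1
        self.log.append((self.seq, path, op))
        self.history[path].append(self.seq)
        return self.seq

    def changes_since(self, since):
        """Paths mutated after sequence number `since`, in first-seen order."""
        seen = {}
        for seq, path, _op in self.log:
            if seq > since:
                seen.setdefault(path, seq)
        return list(seen)


store = Store()
store.record("a.zig", "snapshot")
store.record("b.zig", "snapshot")
store.record("a.zig", "edit")
print(store.changes_since(1))
```

A client that remembers the last sequence number it saw can poll `changes_since` and receive only what it missed, which is the shape of the `/changes?since=42` query.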
Threading Model
| Thread | Role |
|---|---|
| Main | HTTP accept loop or MCP read loop |
| Watcher | Polls filesystem every 2s via FilteredWalker |
| ISR | Rebuilds snapshot when stale flag is set |
| Reap | Cleans up stale agents every 5s |
| Per-connection | HTTP server spawns a thread per connection |
All threads share a shutdown: atomic.Value(bool) for graceful termination.
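The Reap thread's job, dropping agents whose heartbeat is older than 30 seconds and releasing their file locks, might look like the Python sketch below. The class and method names are hypothetical and the sketch is single-threaded; a real implementation would guard this state with a lock and run `reap` on its own timer. The clock is injectable so the behaviour can be exercised without waiting.

```python
import time

STALE_AFTER_SECONDS = 30.0  # staleness threshold from the Architecture section


class AgentRegistry:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.last_seen = {}  # agent id -> time of last heartbeat
        self.locks = {}      # file path -> agent id holding the exclusive lock

    def heartbeat(self, agent_id):
        self.last_seen[agent_id] = self.clock()

    def try_lock(self, agent_id, path):
        """Take the lock if it is free; report whether agent_id now holds it."""
        holder = self.locks.setdefault(path, agent_id)
        return holder == agent_id

    def reap(self):
        """Remove stale agents, release their locks, and return their ids."""
        now = self.clock()
        stale = [
            agent_id
            for agent_id, seen in self.last_seen.items()
            if now - seen > STALE_AFTER_SECONDS
        ]
        for agent_id in stale:
            del self.last_seen[agent_id]
        self.locks = {p: a for p, a in self.locks.items() if a not in stale}
        return stale


# Drive the registry with a fake clock instead of sleeping.
now = [0.0]
registry = AgentRegistry(clock=lambda: now[0])
registry.heartbeat("agent-a")
first = registry.try_lock("agent-a", "src/store.zig")
blocked = registry.try_lock("agent-b", "src/store.zig")
now[0] = 31.0
registry.heartbeat("agent-b")
reaped = registry.reap()
retry = registry.try_lock("agent-b", "src/store.zig")
print(first, blocked, reaped, retry)
```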
🔒 Data & Privacy
codedb collects anonymous usage telemetry to improve the tool. Telemetry is written to ~/.codedb/telemetry.ndjson and synced to the codedb analytics endpoint on session close. No source code, file contents, file paths, or search queries are collected — only aggregate tool call counts, latency, and startup stats.
| Location | Contents | Purpose |
|---|---|---|
| `~/.codedb/projects/<hash>/` | Trigram index, frequency table, data log | Persistent index cache |
| `~/.codedb/telemetry.ndjson` | Aggregate tool calls and startup stats | Local telemetry log |
| `./codedb.snapshot` | File tree, outlines, content, frequency table | Portable snapshot for instant MCP startup |
Not stored: No source code is sent anywhere. No file contents, file paths, or search queries are collected in telemetry. Sensitive files auto-excluded (.env*, credentials.json, secrets.*, .pem, .key, SSH keys, AWS configs).
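A filename filter of the kind described above could be written as follows. This is a guess at the shape, not codedb's actual rule set: the glob patterns are derived from the names listed in the paragraph (with `.pem` and `.key` read as file extensions), and the SSH key and AWS config rules are left out because their exact patterns are not given.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumed patterns, reconstructed from the list in the paragraph above.
SENSITIVE_PATTERNS = [".env*", "credentials.json", "secrets.*", "*.pem", "*.key"]


def is_sensitive(path):
    """Return True if the file name matches any sensitive pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS)


for candidate in ["config/.env.local", "certs/server.pem", "src/main.zig"]:
    print(candidate, is_sensitive(candidate))
```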
To disable the local telemetry log entirely, set CODEDB_NO_TELEMETRY=1.
To sync the local NDJSON file into Postgres for analysis or dashboards, use scripts/sync-telemetry.py with the schema in docs/telemetry/postgres-schema.sql. The data flow is documented in docs/telemetry.md.
rm -rf ~/.codedb/ # clear all cached indexes
rm -f codedb.snapshot # remove snapshot from project
🔨 Building from Source
Requirements: Zig 0.15+
git clone https://github.com/justrach/codedb.git
cd codedb
zig build # debug build
zig build -Doptimize=ReleaseFast # release build
zig build test # run tests
zig build bench # run benchmarks
Binary: zig-out/bin/codedb
Cross-compilation
zig build -Doptimize=ReleaseFast -Dtarget=x86_64-linux
zig build -Doptimize=ReleaseFast -Dtarget=aarch64-linux
zig build -Doptimize=ReleaseFast -Dtarget=x86_64-macos
Releasing
./release.sh 0.2.0 # build, codesign, notarize, upload to GitHub Releases
./release.sh 0.2.0 --dry-run # preview without executing
License
See LICENSE for details.