codedb

mcp
Security Audit
Warning
Health: Passed
  • License — License: BSD-3-Clause
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 239 GitHub stars
Code: Warning
  • fs module — File system access in .github/workflows/bench-regression.yml
  • network request — Outbound network request in install/worker.js
  • network request — Outbound network request in website/worker/worker.js
Permissions: Passed
  • Permissions — No dangerous permissions requested
Purpose
This tool provides a code intelligence server and MCP toolset built in Zig. It gives AI agents fast structural indexing, search, and editing capabilities across a codebase.

Security Assessment
Overall risk: Low. The server has sensible security defaults: its HTTP server has no authentication but binds to localhost only, which prevents external exposure. It does not request dangerous permissions, and no hardcoded secrets were found. The automated scan traced the outbound network requests to the website and Cloudflare Worker installation files rather than the core application, and the flagged file system access is limited to a GitHub Actions benchmarking workflow. This is an alpha-stage project: the snapshot format and APIs are still stabilizing, although MCP communication (JSON-RPC 2.0 over standard input/output) is described as stable.

Quality Assessment
This is a highly active and promising project. It uses the permissive BSD-3-Clause license and is under very active development, with the most recent push occurring today. It has garnered 239 GitHub stars, indicating a solid level of community trust and developer interest for a niche utility. The README is thorough, transparent about its current alpha status, and clearly outlines what features are production-ready versus still in progress.

Verdict
Safe to use.
SUMMARY

Zig code intelligence server and MCP toolset for AI agents. Fast tree, outline, symbol, search, read, edit, deps, snapshot, and remote GitHub repo queries.

README.md

codedb

Code intelligence server for AI agents. Zig core. MCP native. Zero dependencies.

Structural indexing · Trigram search · Word index · Dependency graph · File watching · MCP + HTTP

Status · Install · Quick Start · MCP Tools · Benchmarks · Architecture · Data & Privacy · Building


Status

Alpha software — API is stabilizing but may change

codedb works and is used daily in production AI workflows, but:

  • Language support — Zig, Python, TypeScript/JavaScript (more planned)
  • No auth — HTTP server binds to localhost only
  • Snapshot format may change between versions
  • MCP protocol is JSON-RPC 2.0 over stdio (stable)

| What works today | What's in progress |
|---|---|
| 12 MCP tools for full codebase intelligence | Additional language parsers |
| Trigram-accelerated full-text search | WASM target for Cloudflare Workers |
| O(1) inverted word index for identifier lookup | Incremental snapshot updates |
| Structural outlines (functions, structs, imports) | Multi-project support |
| Reverse dependency graph | Remote indexing over SSH |
| Atomic line-range edits with version tracking | |
| Auto-registration in Claude, Codex, Gemini, Cursor | |
| Polling file watcher with filtered directory walker | |
| Portable snapshot for instant MCP startup | |
| Multi-agent support with file locking + heartbeats | |
| Codesigned + notarized macOS binaries | |
| Cross-platform: macOS (ARM/x86), Linux (ARM/x86) | |

⚡ Install

curl -fsSL https://codedb.codegraff.com/install.sh | sh

Downloads the binary for your platform and auto-registers codedb as an MCP server in Claude Code, Codex, Gemini CLI, and Cursor.

| Platform | Binary | Signed |
|---|---|---|
| macOS ARM64 (Apple Silicon) | codedb-darwin-arm64 | ✅ codesigned + notarized |
| macOS x86_64 (Intel) | codedb-darwin-x86_64 | ✅ codesigned + notarized |
| Linux ARM64 | codedb-linux-arm64 | |
| Linux x86_64 | codedb-linux-x86_64 | |

Or install manually from GitHub Releases.


⚡ Quick Start

As an MCP server (recommended)

After installing, codedb is automatically registered. Just open a project and the 12 MCP tools are available to your AI agent.

# Manual MCP start (auto-configured by install script)
codedb mcp /path/to/your/project

As an HTTP server

codedb serve /path/to/your/project
# listening on localhost:7719

CLI

codedb tree /path/to/project          # file tree with symbol counts
codedb outline src/main.zig           # symbols in a file
codedb find AgentRegistry             # find symbol definitions
codedb search "handleAuth"            # full-text search (trigram-accelerated)
codedb word Store                     # exact word lookup (inverted index, O(1))
codedb hot                            # recently modified files
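
The `word` command is described as an exact lookup in an inverted index. As a rough illustration of that idea (not codedb's Zig implementation), the sketch below maps each identifier-like word to the places it occurs, so a lookup is a single hash-table access. The function name `build_word_index`, the word regex, and the sample file contents are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a "word" is an identifier-like token.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_word_index(files):
    """Map each word to the (path, line number) pairs where it occurs."""
    index = defaultdict(list)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            # set() records a word once per line even if it repeats.
            for word in set(WORD.findall(line)):
                index[word].append((path, lineno))
    return index

# Hypothetical file contents, for illustration only.
files = {
    "src/store.zig": "pub const Store = struct {\n    pub fn init() Store {}\n};",
    "src/main.zig": "const store = Store.init();",
}
index = build_word_index(files)

# Exact, case-sensitive lookup: one dictionary access, no file scan.
hits = index.get("Store", [])
for path, lineno in hits:
    print(f"{path}:{lineno}")
```

Because the match is on whole words, a partial string such as `Stor` finds nothing here; substring queries are the job of the full-text search instead.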

🔧 MCP Tools

12 tools over the Model Context Protocol (JSON-RPC 2.0 over stdio):

| Tool | Description |
|---|---|
| codedb_tree | Full file tree with language, line counts, symbol counts |
| codedb_outline | Symbols in a file: functions, structs, imports, with line numbers |
| codedb_symbol | Find where a symbol is defined across the codebase |
| codedb_search | Trigram-accelerated full-text search |
| codedb_word | O(1) inverted index word lookup |
| codedb_hot | Most recently modified files |
| codedb_deps | Reverse dependency graph (which files import this file) |
| codedb_read | Read file content |
| codedb_edit | Apply line-range edits (atomic writes) |
| codedb_changes | Changed files since a sequence number |
| codedb_status | Index status (file count, current sequence) |
| codedb_snapshot | Full pre-rendered JSON snapshot of the codebase |
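
For readers unfamiliar with the wire format, here is a minimal sketch of what a tool invocation looks like as a JSON-RPC 2.0 message. The `tools/call` method and the `name`/`arguments` parameter shape come from the Model Context Protocol; the argument key `query` is an assumption, since this README does not list each tool's input schema. A real session also begins with MCP's `initialize` handshake, which is omitted here.

```python
import json

def tool_call(request_id, tool, arguments):
    """Serialize one MCP tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the JSON itself
    # must not contain raw newlines; compact separators keep it on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"

# "query" is a hypothetical argument name for codedb_search.
line = tool_call(1, "codedb_search", {"query": "handleAuth"})
print(line, end="")
```

An MCP client writes lines like this to the server's standard input and reads responses carrying the same `id` from its standard output.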

Example: agent explores a codebase

# 1. Get the file tree
curl localhost:7719/tree
# → src/main.zig      (zig, 55L, 4 symbols)
#   src/store.zig     (zig, 156L, 12 symbols)
#   src/agent.zig     (zig, 135L, 8 symbols)

# 2. Drill into a file
curl "localhost:7719/outline?path=src/store.zig"
# → L20: struct_def Store
#   L30: function init
#   L55: function recordSnapshot

# 3. Find a symbol across the codebase
curl "localhost:7719/symbol?name=AgentRegistry"
# → {"path":"src/agent.zig","line":30,"kind":"struct_def"}

# 4. Full-text search
curl "localhost:7719/search?q=handleAuth&max=10"

# 5. Check what changed
curl "localhost:7719/changes?since=42"
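
The same walkthrough can be scripted. The sketch below is a small Python client for the endpoints used in the curl calls above; it assumes the server was started with `codedb serve` on the default port and that responses are UTF-8 text. `build_url` and `fetch` are hypothetical helper names, not part of codedb.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:7719"  # default port shown in the Quick Start

def build_url(endpoint, **params):
    """Build a query URL; '/' is left unescaped to match the curl examples."""
    query = urlencode(params, safe="/")
    return f"{BASE}/{endpoint}" + (f"?{query}" if query else "")

def fetch(endpoint, **params):
    """Perform the GET request and return the response body as text."""
    with urlopen(build_url(endpoint, **params), timeout=5) as resp:
        return resp.read().decode("utf-8")

# The five steps of the walkthrough, as URLs.
urls = [
    build_url("tree"),
    build_url("outline", path="src/store.zig"),
    build_url("symbol", name="AgentRegistry"),
    build_url("search", q="handleAuth", max=10),
    build_url("changes", since=42),
]
for url in urls:
    print(url)
```

Calling `fetch("tree")` requires a running server, so the example only prints the URLs it would request.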

📊 Benchmarks

Measured on Apple M4 Pro, 48GB RAM. MCP = pre-indexed warm queries (20 iterations avg). CLI/external tools include process startup (3 iterations avg). Ground truth verified against Python reference implementation.

Latency — codedb MCP vs codedb CLI vs ast-grep vs ripgrep vs grep

codedb repo (20 files, 12.6k lines):

Query codedb MCP codedb CLI ast-grep ripgrep grep MCP speedup
File tree 0.04 ms 52.9 ms 1,253x vs CLI
Symbol search (init) 0.10 ms 54.1 ms 3.2 ms 6.3 ms 6.5 ms 549x vs CLI
Full-text search (allocator) 0.05 ms 60.7 ms 3.2 ms 5.3 ms 6.6 ms 1,340x vs CLI
Word index (self) 0.04 ms 59.7 ms n/a 7.2 ms 6.5 ms 1,404x vs CLI
Structural outline 0.05 ms 53.5 ms 3.1 ms 2.4 ms 1,143x vs CLI
Dependency graph 0.05 ms 2.2 ms n/a n/a n/a 45x vs CLI

merjs repo (100 files, 17.3k lines):

Query codedb MCP codedb CLI ast-grep ripgrep grep MCP speedup
File tree 0.05 ms 54.0 ms 1,173x vs CLI
Symbol search (init) 0.07 ms 54.4 ms 3.4 ms 6.3 ms 3.6 ms 758x vs CLI
Full-text search (allocator) 0.03 ms 54.1 ms 2.9 ms 5.1 ms 3.7 ms 1,554x vs CLI
Word index (self) 0.04 ms 54.7 ms n/a 6.3 ms 4.2 ms 1,518x vs CLI
Structural outline 0.04 ms 54.9 ms 3.4 ms 2.5 ms 1,243x vs CLI
Dependency graph 0.05 ms 1.9 ms n/a n/a n/a 41x vs CLI

Token Efficiency

codedb returns structured, relevant results — not raw line dumps. For AI agents, this means dramatically fewer tokens per query:

| Repo | codedb MCP | ripgrep / grep | Reduction |
|---|---|---|---|
| codedb (search allocator) | ~20 tokens | ~32,564 tokens | 1,628x fewer |
| merjs (search allocator) | ~20 tokens | ~4,007 tokens | 200x fewer |

Indexing Speed

codedb builds all indexes on startup (outlines, trigram, word, dependency graph) — not just a parse tree:

| Repo | Files | Lines | Cold start | Per file |
|---|---|---|---|---|
| codedb | 20 | 12.6k | 17 ms | 0.85 ms |
| merjs | 100 | 17.3k | 16 ms | 0.16 ms |
| openclaw/openclaw | 11,281 | 2.29M | 2.9 s | 6.66 ms |
| vitessio/vitess | 5,028 | 2.18M | ~2 s | 0.40 ms |

Indexes are built once on startup. After that, the file watcher keeps them updated incrementally (single-file re-index: <2ms). Queries never re-scan the filesystem.
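
One way a polling watcher can decide which files need re-indexing is to compare modification times between passes. The sketch below is an assumption about the general technique, not a transcription of `watcher.zig`: `scan`, `diff`, and the `SKIP_DIRS` set are illustrative names, and the skipped directories are the ones the Architecture section lists.

```python
import os
import tempfile

# Directories pruned before descending (a subset, for illustration).
SKIP_DIRS = {".git", "node_modules", "zig-cache", "__pycache__"}

def scan(root):
    """Return {relative path: mtime in ns} for files under root."""
    seen = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from entering skipped dirs.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            seen[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return seen

def diff(old, new):
    """Compare two scans: files to re-index, and files to drop from the index."""
    changed = sorted(p for p in new if old.get(p) != new[p])
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted

# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "node_modules"))
    for rel in ("a.zig", os.path.join("node_modules", "x.js")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("x")
    snapshot = scan(root)

changed, deleted = diff({}, snapshot)
print(changed, deleted)
```

Only the paths returned by `diff` need to be re-parsed, which is what keeps each polling pass cheap compared with a full rebuild.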

Why codedb is fast

  • MCP server indexes once on startup → all queries hit in-memory data structures (O(1) hash lookups)
  • CLI pays ~55ms process startup + full filesystem scan on every invocation
  • ast-grep re-parses all files through tree-sitter on every call (~3ms)
  • ripgrep/grep brute-force scan every file on every call (~5-7ms)
  • The MCP advantage: index once, query thousands of times at sub-millisecond latency
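
To make the "index once, query many times" point concrete, here is a minimal sketch of trigram-accelerated search. It illustrates the general technique only and makes no claim about how codedb stores or ranks its index; the function names and sample files are assumptions.

```python
from collections import defaultdict

def trigrams(text):
    """All 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_trigram_index(files):
    """Map each trigram to the set of files containing it (built once)."""
    index = defaultdict(set)
    for path, text in files.items():
        for gram in trigrams(text):
            index[gram].add(path)
    return index

def search(index, files, query):
    """Intersect posting sets to get candidates, then verify each one."""
    grams = trigrams(query)
    if not grams:
        # Queries shorter than 3 characters have no trigrams: scan everything.
        candidates = set(files)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing all trigrams is necessary but not sufficient, so confirm the match.
    return sorted(p for p in candidates if query in files[p])

# Hypothetical file contents.
files = {
    "a.zig": "fn handleAuth() void {}",
    "b.zig": "fn handle() void {} // Auth later",
    "c.zig": "const x = 1;",
}
index = build_trigram_index(files)
print(search(index, files, "handleAuth"))
```

The index rules out `b.zig` and `c.zig` without reading their contents, so the final substring check runs on a single candidate file.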

Feature Matrix

Feature codedb MCP codedb CLI ast-grep ripgrep grep ctags
Structural parsing
Trigram search index
Inverted word index
Dependency graph
Version tracking
Multi-agent locking
Pre-indexed (warm)
No process startup
MCP protocol
Full-text search
Atomic file edits
File watcher

codedb = tree-sitter + search index + dependency graph + agent runtime. Zero external dependencies. Pure Zig. Single binary.
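
The dependency graph is described as reverse: given a file, it answers which files import it. A minimal sketch of that inversion follows, assuming the forward import lists have already been extracted by a parser. The function name and the sample import data are hypothetical.

```python
from collections import defaultdict

def reverse_deps(imports):
    """Invert {file: [files it imports]} into {file: {files that import it}}."""
    importers = defaultdict(set)
    for path, targets in imports.items():
        for target in targets:
            importers[target].add(path)
    return importers

# Hypothetical forward edges.
imports = {
    "src/main.zig": ["src/store.zig", "src/agent.zig"],
    "src/agent.zig": ["src/store.zig"],
}
rev = reverse_deps(imports)
print(sorted(rev["src/store.zig"]))
```

Building the inverted map once is what turns "who depends on this file?" into a single lookup instead of a scan over every file's imports.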


🏗️ Architecture

┌─────────────┐     ┌─────────────┐
│  HTTP :7719 │     │  MCP stdio  │
│  server.zig │     │  mcp.zig    │
└──────┬──────┘     └──────┬──────┘
       │                   │
       └───────┬───────────┘
               │
    ┌──────────▼──────────┐
    │     Explorer        │
    │   explore.zig       │
    │  ┌───────────────┐  │
    │  │ WordIndex      │  │
    │  │ TrigramIndex   │  │
    │  │ Outlines       │  │
    │  │ Contents       │  │
    │  │ DepGraph       │  │
    │  └───────────────┘  │
    └──────────┬──────────┘
               │
    ┌──────────▼──────────┐
    │      Store          │──── data.log
    │    store.zig        │
    └──────────┬──────────┘
               │
    ┌──────────▼──────────┐
    │     Watcher         │ ← polls every 2s
    │   watcher.zig       │
    │  (FilteredWalker)   │
    └─────────────────────┘

No SQLite. No dependencies. Purpose-built data model:

  • Explorer — structural index engine. Parses Zig, Python, TypeScript/JavaScript. Maintains outlines, trigram index, inverted word index, content cache, and dependency graph behind a single mutex.
  • Store — append-only version log. Every mutation (snapshot, edit, delete) gets a monotonically increasing sequence number. Version history capped at 100 per file.
  • Watcher — polling file watcher (2s interval). FilteredWalker prunes .git, node_modules, zig-cache, __pycache__, etc. before descending.
  • Agents — first-class structs with cursors, heartbeats, and exclusive file locks. Stale agents reaped after 30s.
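
The Store's behavior can be sketched as follows: every mutation receives the next sequence number, and per-file history is bounded. This is an illustration under stated assumptions rather than the contents of `store.zig`; the class layout and method names are invented for the example, and whether the real log is ever compacted is not stated in this README.

```python
from collections import defaultdict, deque

class Store:
    """Append-only mutation log with monotonically increasing sequence numbers."""

    def __init__(self, max_versions=100):
        self.seq = 0
        self.log = []  # (seq, path, op) tuples, only ever appended to
        # Per-file history; deque(maxlen=...) silently drops the oldest entry.
        self.versions = defaultdict(lambda: deque(maxlen=max_versions))

    def record(self, path, op):
        """Record a mutation ("snapshot", "edit", or "delete") and return its seq."""
        self.seq += 1
        self.log.append((self.seq, path, op))
        self.versions[path].append(self.seq)
        return self.seq

    def changes_since(self, since):
        """Paths touched by any mutation with a sequence number above `since`."""
        return sorted({path for seq, path, _ in self.log if seq > since})

# A small cap makes the eviction visible.
store = Store(max_versions=3)
for _ in range(5):
    store.record("a.zig", "edit")
last = store.record("b.zig", "snapshot")

print(list(store.versions["a.zig"]), store.changes_since(5))
```

A client that remembers the last sequence number it saw can poll `changes_since` cheaply, which is the pattern the `changes?since=42` endpoint suggests.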

Threading Model

| Thread | Role |
|---|---|
| Main | HTTP accept loop or MCP read loop |
| Watcher | Polls filesystem every 2s via FilteredWalker |
| ISR | Rebuilds snapshot when stale flag is set |
| Reap | Cleans up stale agents every 5s |
| Per-connection | HTTP server spawns a thread per connection |

All threads share a shutdown: atomic.Value(bool) for graceful termination.
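
The Reap thread's job can be sketched as a periodic pass over agent heartbeats. The example below is a simplified model, not the real agent code: the `Registry` class and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and the 30-second timeout is the figure given in the Architecture section.

```python
class Registry:
    """Tracks agent heartbeats and exclusive file locks."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout
        self.last_seen = {}  # agent id -> time of last heartbeat
        self.locks = {}      # file path -> agent id holding the lock

    def heartbeat(self, agent, now):
        self.last_seen[agent] = now

    def try_lock(self, agent, path):
        """Take the lock if it is free; succeed again if the agent already holds it."""
        owner = self.locks.setdefault(path, agent)
        return owner == agent

    def reap(self, now):
        """Drop agents whose last heartbeat is too old and release their locks."""
        stale = [a for a, t in self.last_seen.items() if now - t > self.timeout]
        for agent in stale:
            del self.last_seen[agent]
        self.locks = {p: a for p, a in self.locks.items() if a not in stale}
        return stale

reg = Registry()
reg.heartbeat("a", 0.0)
reg.heartbeat("b", 0.0)
first = reg.try_lock("a", "src/store.zig")     # "a" takes the lock
blocked = reg.try_lock("b", "src/store.zig")   # "b" is refused
reg.heartbeat("b", 20.0)                       # only "b" stays alive
reaped = reg.reap(31.0)                        # "a" is 31s stale
retry = reg.try_lock("b", "src/store.zig")     # lock is free again
print(first, blocked, reaped, retry)
```

Releasing locks during the reap pass is what prevents a crashed agent from blocking edits to a file indefinitely.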


🔒 Data & Privacy

codedb collects anonymous usage telemetry to improve the tool. Telemetry is written to ~/.codedb/telemetry.ndjson and synced to the codedb analytics endpoint on session close. No source code, file contents, file paths, or search queries are collected — only aggregate tool call counts, latency, and startup stats.

| Location | Contents | Purpose |
|---|---|---|
| ~/.codedb/projects/<hash>/ | Trigram index, frequency table, data log | Persistent index cache |
| ~/.codedb/telemetry.ndjson | Aggregate tool calls and startup stats | Local telemetry log |
| ./codedb.snapshot | File tree, outlines, content, frequency table | Portable snapshot for instant MCP startup |

Not stored: No source code is sent anywhere. No file contents, file paths, or search queries are collected in telemetry. Sensitive files auto-excluded (.env*, credentials.json, secrets.*, .pem, .key, SSH keys, AWS configs).
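
A glob-based filter is one plausible way to implement this kind of exclusion. The sketch below is an assumption about the approach, not codedb's actual rule set: the pattern list covers only the entries spelled out above (`.pem` and `.key` are read as file extensions), and the SSH key and AWS config rules are left out because their exact patterns are not given here.

```python
import os
from fnmatch import fnmatch

# Patterns taken from the list above; matching is on the file name only.
SENSITIVE = [".env*", "credentials.json", "secrets.*", "*.pem", "*.key"]

def is_sensitive(path):
    """True if the file name matches any sensitive pattern."""
    name = os.path.basename(path)
    return any(fnmatch(name, pattern) for pattern in SENSITIVE)

paths = ["config/.env.local", "deploy/server.pem", "secrets.yaml", "src/main.zig"]
indexable = [p for p in paths if not is_sensitive(p)]
print(indexable)
```

Applying the check before a file is read means excluded content never enters the index or the snapshot in the first place.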

To disable the local telemetry log entirely, set CODEDB_NO_TELEMETRY=1.

To sync the local NDJSON file into Postgres for analysis or dashboards, use scripts/sync-telemetry.py with the schema in docs/telemetry/postgres-schema.sql. The data flow is documented in docs/telemetry.md.

rm -rf ~/.codedb/          # clear all cached indexes
rm -f codedb.snapshot      # remove snapshot from project

🔨 Building from Source

Requirements: Zig 0.15+

git clone https://github.com/justrach/codedb.git
cd codedb
zig build                              # debug build
zig build -Doptimize=ReleaseFast       # release build
zig build test                         # run tests
zig build bench                        # run benchmarks

Binary: zig-out/bin/codedb

Cross-compilation

zig build -Doptimize=ReleaseFast -Dtarget=x86_64-linux
zig build -Doptimize=ReleaseFast -Dtarget=aarch64-linux
zig build -Doptimize=ReleaseFast -Dtarget=x86_64-macos

Releasing

./release.sh 0.2.0              # build, codesign, notarize, upload to GitHub Releases
./release.sh 0.2.0 --dry-run    # preview without executing

License

See LICENSE for details.
