Name: ivygrep
Author: bvolpato

Semantic code search that never uploads your code.
Ask questions in English. Get answers in code. Local inference.


⚡ Quick Start

Install via Homebrew (recommended):

brew tap bvolpato/tap
brew install bvolpato/tap/ivygrep

Install a pre-built binary:

tag=$(curl -fsSL https://api.github.com/repos/bvolpato/ivygrep/releases/latest | sed -n 's/.*"tag_name": "\(v[^"]*\)".*/\1/p')
case "$(uname -s)-$(uname -m)" in
  Linux-x86_64) target=linux-x86_64-musl ;;
  Linux-aarch64|Linux-arm64) target=linux-aarch64-musl ;;
  Darwin-x86_64) target=macos-x86_64 ;;
  Darwin-arm64) target=macos-aarch64 ;;
  *) echo "unsupported platform: $(uname -s)-$(uname -m)" >&2; exit 1 ;;
esac
curl -fsSL "https://github.com/bvolpato/ivygrep/releases/download/${tag}/ivygrep-${tag}-${target}.tar.gz" \
  | tar xz --strip-components=1 "ivygrep-${tag}-${target}/ig"
mkdir -p ~/.local/bin
install -m 0755 ig ~/.local/bin/ig

Windows x86_64 releases include ig.exe in
ivygrep-${tag}-windows-x86_64.zip. Extract it into a directory on PATH.
Windows uses the dependency-light portable vector backend; Linux and macOS use
the faster USearch backend.

Build from source:

git clone https://github.com/bvolpato/ivygrep.git && cd ivygrep
./build.sh
install -m 0755 ./target/release/ig ~/.local/bin/ig

Developer targets:

./build.sh --help
./test.sh --help
./bench.sh --help

./build.sh          # release binary
./build.sh --features accelerate,metal  # opt-in macOS Metal neural inference
./build.sh --features cuda  # opt-in Linux CUDA neural inference
./test.sh --quick   # fast local check
./test.sh           # fmt, clippy, unit/integration tests
./bench.sh          # critical Criterion benchmark, no stale local baseline comparison

Your first search:

ig "authentication flow"            # auto-indexes on first run, then searches
ig "error handling" src/api/         # scope to a directory
ig --all "database migrations"      # search across all indexed projects

That's it. No config files, no setup wizards, no prompts, no API keys. On first run, ig auto-indexes the workspace and spawns a background daemon for incremental updates. Neural mode downloads its model artifacts into the Hugging Face cache on first use; --hash and hash-only builds require no model download.

[Demo: ivygrep searching the opencode repo]

🤖 MCP Server — Supercharge Your AI Agent

ivygrep is the retrieval layer your coding agent is missing. Instead of stuffing entire files into context, your agent pulls only the relevant code chunks natively.

ig --mcp    # starts MCP server on stdio

One-line setup for agents:

Claude Code

claude mcp add -s user ig -- ig --mcp

Or add to ~/.claude.json:

{
  "mcpServers": {
    "ig": { "type": "stdio", "command": "ig", "args": ["--mcp"] }
  }
}

Cursor

Add to .cursor/mcp.json or ~/.cursor/mcp.json:

{
  "mcpServers": {
    "ig": { "command": "ig", "args": ["--mcp"] }
  }
}

Then refresh MCP servers in Cursor settings.

Gemini

gemini mcp add --scope user --transport stdio ig ig --mcp

Or add to ~/.gemini/settings.json:

{
  "mcpServers": {
    "ig": { "command": "ig", "args": ["--mcp"] }
  }
}

OpenCode & Codex

OpenCode: Run opencode mcp add, choose Local, and set the command to ig --mcp.

Codex: Run codex mcp add ig -- ig --mcp or add to ~/.codex/config.toml.
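
For clients without a built-in MCP command, it can help to see what travels over stdio. The sketch below builds the three newline-delimited JSON-RPC 2.0 messages an MCP client typically sends: an initialize request, the initialized notification, and one tools/call. The tool name ig_search is the one mentioned under Security & Privacy; the argument names query and path, the client name, and the protocol version string are assumptions for illustration, so check the schema returned by tools/list before relying on them.

```python
import json


def jsonrpc_request(msg_id, method, params):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


def build_session(query, workspace):
    """Return the messages a minimal MCP client would write to the server's stdin."""
    initialize = jsonrpc_request(1, "initialize", {
        "protocolVersion": "2024-11-05",  # assumed; use the version your client speaks
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},  # hypothetical
    })
    # Notifications carry no id and expect no response.
    initialized = json.dumps(
        {"jsonrpc": "2.0", "method": "notifications/initialized"}
    ) + "\n"
    search = jsonrpc_request(2, "tools/call", {
        "name": "ig_search",
        # Argument names are assumptions, not the documented schema.
        "arguments": {"query": query, "path": workspace},
    })
    return [initialize, initialized, search]


if __name__ == "__main__":
    for line in build_session("authentication flow", "/path/to/project"):
        print(line, end="")
```

To try it against a real server, these lines could be written to the stdin of a process started with ig --mcp, reading one JSON response line per request from its stdout.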

🤔 What is ivygrep?

ivygrep (ig) is a local-first code search tool that understands natural language. It combines lexical search (like grep/rg) with semantic vector search, so you can search your code the way you think about it.

Traditional tools require you to know exactly what you're looking for. ivygrep lets you search with intent.

Feature	grep / rg	GitHub Search	ivygrep
Works offline	✅	❌	✅
Natural language queries	❌	⚠️	✅
Semantic understanding	❌	❌	✅
Warm indexed query latency	✅	❌	✅
Privacy-first (no upload)	✅	❌	✅
Git-native (worktrees, branches)	❌	❌	✅
Structural code chunking	❌	❌	✅
Incremental indexing	❌	❌	✅
MCP server for AI agents	❌	❌	✅
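
To make "lexical plus semantic" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a BM25 ranking with a vector ranking. It is an illustration of the general idea only; this README does not say which fusion method ivygrep uses, and the file names and the constant k=60 are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list.

    Each id scores 1 / (k + rank) per list it appears in, so items that
    rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    lexical = ["auth.rs", "login.rs", "db.rs"]        # hypothetical BM25 order
    semantic = ["session.rs", "auth.rs", "login.rs"]  # hypothetical vector order
    print(reciprocal_rank_fusion([lexical, semantic]))
```

A file that only one retriever finds still appears in the merged list, but below files that both retrievers agree on.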

🌍 45 Language/File Types Supported

ivygrep indexes and structurally chunks 45 language/file types today:

Tree-sitter AST chunking (21 languages): Rust, Python, Go, JavaScript, TypeScript/TSX, Java, C, C++, C#, Scala, PHP, Ruby, Swift, Bash, Haskell, OCaml, Lua, Dart, Objective-C, Perl, and Starlark (macros and targets in very large BUILD-like sources)
Heuristic structural chunking: the remaining supported languages below
Systems: Rust, C, C++, Zig, Nim
Backend: Python, Go, Java, Kotlin, Scala, C#, Ruby, PHP, Perl, Groovy
Web & Mobile: JavaScript, TypeScript, HTML, CSS, GraphQL, Swift, Dart, Objective-C
Functional: Haskell, OCaml, Elixir, Erlang, Clojure
Data, Scripting & Config: R, Julia, Bash/Shell, PowerShell, Lua, SQL, Protobuf, Thrift, Terraform, Starlark/Bazel, Dockerfile, Makefile, Markdown, XML, TOML/YAML/INI/env config, JSON, plain text

Unknown extensions are auto-detected and indexed as text.
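
For readers unfamiliar with heuristic structural chunking, the sketch below shows the general idea: without a parser, start a new chunk wherever a line at column 0 looks like a definition. The keyword list and the function name are assumptions made for this example; ivygrep's actual heuristics are not described in this README and are likely more careful.

```python
import re

# Hypothetical keyword list; real heuristics would be per-language.
DEFINITION_START = re.compile(r"^(def|class|fn|func|function|struct|impl)\b")


def heuristic_chunks(source):
    """Split source text into chunks that each begin at a top-level definition."""
    chunks, current = [], []
    for line in source.splitlines():
        # A definition keyword at column 0 closes the chunk in progress.
        if DEFINITION_START.match(line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


if __name__ == "__main__":
    sample = "import os\n\ndef a():\n    return 1\n\ndef b():\n    return 2\n"
    for index, chunk in enumerate(heuristic_chunks(sample)):
        print(f"--- chunk {index} ---\n{chunk}")
```

Indented definitions (methods, nested functions) stay inside their parent chunk because the pattern only matches at column 0.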

🚀 Performance & Speed

Fresh release-readiness validation used a Linux kernel checkout with 93,502 indexed files and 4,419,660 chunks:

Scenario	Metric	Result
Fresh lexical-first Linux kernel index	full rebuild	~270 sec
Large-repo natural query	process-cold p95	~137 ms
Warm daemon identical-query replay	end-to-end p95	~79 ms
Warm daemon distinct queries	end-to-end p95	~116 ms
Portable Linux intent relevance	13 labeled queries	41.20
Best retained dedicated-host daemon run	identical-query p95	~4.9 ms
Historical eager-vector Linux kernel index	full rebuild	~27.3 min
Lexical-first scoped stress probe	10,501 files	~3 sec
Warm daemon correctness guard	daemon/local hits	20 / 20

The daemon benchmark now reports warmed distinct-query latency separately from
identical-query cache replay. Latency depends on CPU, storage, and virtualization;
the ~4.9 ms result is a retained dedicated-host benchmark, not a universal
claim. Benchmark writeups and charts live under
docs/benchmarks/.
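
As a reading aid for the p95 rows above, this sketch computes a nearest-rank 95th percentile and keeps identical-query replay separate from distinct queries, as the daemon benchmark does. The percentile method and the sample latencies are assumptions for illustration; they are not taken from the benchmark harness or its results.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(groups, pct=95):
    """Report one percentile per scenario so cached and uncached runs never mix."""
    return {name: percentile(latencies, pct) for name, latencies in groups.items()}


if __name__ == "__main__":
    # Made-up latencies in milliseconds, for illustration only.
    groups = {
        "identical-query replay": [4, 5, 5, 6, 7, 5, 4, 6, 5, 30],
        "distinct queries": [80, 95, 100, 110, 90, 120, 85, 105, 98, 140],
    }
    for name, value in summarize(groups).items():
        print(f"{name}: p95 = {value} ms")
```

Mixing the two groups into one sample would let cache hits pull the percentile down, which is why they are reported separately.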

Indexing commits BM25/literal search first. A load-aware background subprocess
builds hash ANN vectors, then upgrades to local neural vectors with Candle
(AllMiniLML6V2). Set IVYGREP_MODEL_PROFILE=code to opt into the pinned,
CodeSearchNet-trained code-minilm-l6-v1 profile. Model identity is persisted
with the index so incompatible vectors are rebuilt rather than silently reused.
macOS release builds use Accelerate-backed CPU math; Metal is available as an
opt-in local build while its background throughput is tuned.
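
The rule that incompatible vectors are rebuilt rather than reused can be sketched as a comparison between the identity stored with the index and the identity of the model about to be used. The field names, the JSON storage format, and the revision strings below are hypothetical; only the model names come from this README.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelIdentity:
    """Hypothetical record of which model produced the stored vectors."""
    name: str
    dimensions: int
    revision: str


def save_identity(identity: ModelIdentity) -> str:
    """Serialize the identity so it can be persisted next to the index."""
    return json.dumps(asdict(identity), sort_keys=True)


def load_identity(raw: Optional[str]) -> Optional[ModelIdentity]:
    """Parse a stored identity; None means no vectors were recorded yet."""
    return ModelIdentity(**json.loads(raw)) if raw else None


def needs_rebuild(stored_raw: Optional[str], current: ModelIdentity) -> bool:
    """Rebuild when nothing is stored or any identity field differs."""
    return load_identity(stored_raw) != current


if __name__ == "__main__":
    default = ModelIdentity("AllMiniLML6V2", 384, "rev-a")    # placeholder revision
    code = ModelIdentity("code-minilm-l6-v1", 384, "rev-b")   # placeholder values
    stored = save_identity(default)
    print(needs_rebuild(stored, default))  # same model: reuse vectors
    print(needs_rebuild(stored, code))     # profile changed: rebuild
```

Comparing the whole record, not just the name, also catches a changed revision or vector width under an unchanged model name.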

Relevance evaluation separates foreground readiness from post-background hash
quality:

uv run scripts/eval_relevance.py
uv run scripts/eval_relevance.py --enhance-hash
python3 scripts/eval_code_retrieval.py \
  --dataset tests/fixtures/retrieval \
  --binary target/release/ig \
  --mode hash

🏗️ Architecture & Git-Native Intelligence

ivygrep deeply understands git. This is a core design decision, not an afterthought:

Worktree overlays: Worktrees do not get duplicate indexes. Each one gets a thin overlay that maps only the chunks that diverge.
Branch-switch deltas: On a branch switch, a Merkle diff limits re-indexing to the files that changed.
Content-based deduplication: Byte-identical files are never re-indexed across branches.
.gitignore native: Respects rules automatically at every level.
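
The branch-switch and deduplication behavior above can be sketched with a small Merkle tree over file contents: identical subtrees are skipped by comparing one hash, and a changed file whose bytes were already indexed is not indexed again. This is a simplified illustration, not ivygrep's implementation. The function names are hypothetical, blake2b stands in for xxh3 (which is not in the Python standard library), and deleted files are ignored for brevity.

```python
import hashlib


def digest(data: bytes) -> str:
    # Stand-in for xxh3; any stable content hash works for the sketch.
    return hashlib.blake2b(data, digest_size=16).hexdigest()


def build_tree(files):
    """Turn {path: bytes} into nested dicts whose leaves are content digests."""
    root = {}
    for path, data in files.items():
        *dirs, name = path.split("/")
        node = root
        for part in dirs:
            node = node.setdefault(part, {})
        node[name] = digest(data)
    return root


def node_hash(node):
    """A file's hash is its digest; a directory's hash covers its sorted children."""
    if isinstance(node, str):
        return node
    body = "".join(f"{name}:{node_hash(child)};" for name, child in sorted(node.items()))
    return digest(body.encode())


def changed_files(old, new, prefix=""):
    """Return (path, digest) for files in new that differ from old.

    Hashes are recomputed on every call to keep the sketch short; a real
    implementation would cache them per node.
    """
    if old is not None and node_hash(old) == node_hash(new):
        return []  # identical subtree: nothing below needs a look
    if isinstance(new, str):
        return [(prefix, new)]
    old_children = old if isinstance(old, dict) else {}
    changed = []
    for name, child in sorted(new.items()):
        path = f"{prefix}/{name}" if prefix else name
        changed.extend(changed_files(old_children.get(name), child, path))
    return changed


def plan_reindex(old_files, new_files, indexed_digests):
    """Changed paths whose content has not been indexed on any branch yet."""
    changed = changed_files(build_tree(old_files), build_tree(new_files))
    return [path for path, d in changed if d not in indexed_digests]
```

For example, if a feature branch edits src/b.rs and adds src/c.rs as a byte-identical copy of src/a.rs, changed_files reports both paths, while plan_reindex returns only src/b.rs because the content of src/c.rs is already known.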

Tech stack: tantivy (BM25), usearch plus a pure-Rust portable vector
backend, tree-sitter (AST), SQLite symbol/call graph storage,
candle_embed / candle-core (local neural embeddings), and xxh3 hashes.

🔒 Security & Privacy

ivygrep runs search and embedding inference locally and never sends your code, queries, or index data to an external service. A few things worth knowing:

Where data lives: the index stores compressed source chunks under ~/.local/share/ivygrep (or $XDG_DATA_HOME/$IVYGREP_HOME). On Unix, the daemon listens on an owner-only (0600) socket and verifies the peer uid. On Windows, it uses loopback TCP with a per-daemon authentication token stored beside the user-owned index. Keep a custom IVYGREP_HOME private to your account.
Model download: neural mode uses hf-hub to download pinned model assets on first use and caches them under $HF_HOME or ~/.cache/huggingface. The default is AllMiniLM-L6-v2; IVYGREP_MODEL_PROFILE=code selects the compact CodeSearchNet-trained profile. Use --hash or a --no-default-features build when no model-network access is permitted.
Inference backend: macOS release binaries execute locally with Accelerate-backed CPU math; portable Linux release binaries execute locally on CPU. Source builds can opt into local Metal with --features accelerate,metal or CUDA with --features cuda on a compatible installation. The CUDA build does not require cuDNN. If nvidia-smi cannot report compute capability, build.sh and test.sh infer CUDA_COMPUTE_CAP=120 for RTX 50/Blackwell hosts; set CUDA_COMPUTE_CAP explicitly for other affected GPUs. ig --status reports the recorded backend that last generated neural vectors.
Secrets in your repo: ivygrep indexes file contents, including config/dotfiles (e.g. .env) unless they're gitignored. Those contents are stored in the local index and can appear in search snippets. Keep secrets out of the workspace or in .gitignore.
MCP scope: the ig_search MCP tool only searches the workspace at the provided path; it cannot search across other indexed projects.

🔧 CLI Reference

# Core workflow
ig "your query"                    # search current workspace
ig "query" ~/other/project         # search a different workspace
ig --add .                         # register & index a workspace
ig --rm .                          # unregister a workspace
ig --status                        # show workspace health & embedding status
ig --doctor                        # inspect index health for the current workspace
ig --doctor --deep                 # run full cross-store integrity scans
ig --doctor --fix                  # rebuild a broken or stale index

# Search modes
ig --interactive "query"             # interactive TUI with file/snippet browsing
ig --literal "fn_name"               # fast exact-match search (index-backed)
ig --lexical-only "query"          # BM25/path/signature retrieval only
ig --hash "query"                  # force hash embeddings (skip neural)
ig --symbol calculate_tax          # exact definitions
ig --refs calculate_tax            # indexed references/calls
ig --callers calculate_tax         # caller chunks

# Output control
ig -n 5 "query"                    # limit to 5 files
ig -C 4 "query"                    # 4 lines of context
ig --type rust "query"             # filter by language
ig --include "*.rs,*.go" "query"   # include globs
ig --exclude "vendor/**" "query"   # exclude globs
ig --json "query"                  # machine-readable JSON
ig --first-line-only "query"       # compact grep-style output
ig --file-name-only "query"        # file paths only

# Daemon & server
ig --daemon                        # start background watcher
ig --mcp                           # start MCP server (stdio)

🧪 Development

./test.sh           # fmt, ShellCheck, clippy, Rust and Python harness tests
./build.sh --locked # release binary, Cargo.lock unchanged
./build.sh --locked --features accelerate,metal  # opt-in macOS Metal neural binary
./build.sh --locked --features cuda  # opt-in Linux CUDA neural binary
./bench.sh          # critical Criterion benchmark, no stale local baseline comparison

The test suite covers unit tests, CLI snapshots, concurrency, golden queries,
public-layout retrieval metrics, symbol/caller indexing, incremental CRUD, MCP,
daemon recovery, git/worktree behavior, property-based Merkle invariants, and
benchmark guards.
Benchmark output reports per-operation latency; fast operations are repeated inside each Criterion sample so that the timed samples stay long enough to be stable.

End-to-end procedures

./build.sh
./scripts/e2e_procedures.sh --binary ./target/release/ig
python3 scripts/check_daemon_equivalence.py \
  --skip-build \
  --binary ./target/release/ig \
  --bench-home /tmp/ivygrep-daemon-equivalence

# Opt-in macOS Metal backend validation (downloads local model artifacts on first run)
./build.sh --locked --features accelerate,metal
./scripts/e2e_neural_backend.sh --binary ./target/release/ig --expect-backend "Candle Metal"

# Opt-in Linux CUDA backend validation (downloads local model artifacts on first run)
./build.sh --locked --features cuda
./scripts/e2e_neural_backend.sh --binary ./target/release/ig --expect-backend "Candle CUDA"

These smoke tests run against throwaway projects and isolated IVYGREP_HOME directories; the neural backend check embeds fixture text locally and verifies recorded backend reporting.

Stress testing

./scripts/bootstrap_stress_fixtures.sh
./test.sh --stress

Roadmap

More Tree-sitter languages: expand the AST pipeline to Kotlin, SQL, and additional grammars as high-quality tree-sitter parsers mature.
Public benchmark expansion: run CoIR and CodeSearchNet baselines regularly and publish comparable quality/latency history.
Learned reranking: evaluate compact local cross-encoders against the bounded deterministic reranker without weakening offline portability.
Editor integrations: VS Code extension and Neovim telescope plugin for in-editor semantic search.
Background job resilience: richer queue diagnostics and resumable worker state across daemon restarts.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Built by @bvolpato · Released under the MIT License