xgrep

mcp
Security Audit
Fail
Health Pass
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 11 GitHub stars
Code Fail
  • rm -rf — Recursive force deletion command in bench/docker-bench.sh
  • rm -rf — Recursive force deletion command in bench/fair-bench.sh
  • rm -rf — Recursive force deletion command in benchmarks/bench_find.sh
Permissions Pass
  • Permissions — No dangerous permissions requested

SUMMARY

Ultra-fast indexed code search engine — faster than ripgrep on repeated searches, lighter than zoekt. Built-in MCP server for AI agents. Also does fast file finding (fd/find alternative).

README.md

xgrep

Ultra-fast indexed code search engine with MCP server for AI coding tools.

Pre-builds a trigram inverted index, then searches in milliseconds. Designed for repeated searches on large codebases — by humans and AI agents alike.

Features

  • Indexed search — trigram inverted index makes repeated searches 2-46x faster than ripgrep
  • File discovery — --find mode locates files 2-15x faster than fd
  • MCP server — built-in Model Context Protocol server for AI coding tools (Claude Code, Cursor, etc.)
  • LLM-optimized output — --format llm produces Markdown with language tags, context lines, and token-aware truncation
  • Git-aware — search only changed files (--changed), recent commits (--since 1h), respects .gitignore
  • Zero config — cargo install xgrep-search, then xg "pattern". The index builds automatically on first search
  • Hybrid search — serves results from index instantly while rebuilding in the background

Why xgrep?

| | ripgrep | zoekt | xgrep |
|---|---|---|---|
| Setup | None | Server required | None (cargo install) |
| First search | Instant | After server start | Auto-builds index |
| Repeated search (Linux kernel) | 1,687ms | 170ms (server) | 37ms |
| File discovery (next.js, 26K files) | N/A | N/A | 9ms (fd: 191ms) |
| Index size | N/A | 155% of source | 8% of source |
| AI agent integration | None | None | MCP server built-in |
| Memory (search) | 11MB | 288MB | 208MB |

xgrep is not a ripgrep replacement. Use ripgrep for one-off searches. Use xgrep when you search the same codebase repeatedly — the index pays for itself after ~2 searches.

Quick Start

cargo install xgrep-search    # Installs the `xg` command
xg "pattern"                  # Search (auto-builds index on first run)

Requires Rust 1.85+. Works on macOS, Linux, and Windows.

Build from source
git clone https://github.com/momokun7/xgrep.git
cd xgrep/rust
cargo build --release
cp target/release/xg ~/.local/bin/

Usage

xg "pattern"                  # Smart-case search (all-lowercase = case-insensitive)
xg "Pattern"                  # Mixed/upper case in pattern = case-sensitive
xg "pattern" -i               # Force case-insensitive
xg "pattern" -s               # Force case-sensitive (disable smart-case)
xg "pattern" /path/to/repo    # Search a specific directory
xg -e "handle_\w+"            # Regex search
xg "pattern" -w               # Match whole words only
xg "pattern" -t rs            # Filter by file type
xg "pattern" -C 3             # Context lines (symmetric)
xg "pattern" -A 2 -B 1        # 2 lines after, 1 line before
xg "pattern" -g "*.rs"        # Include only paths matching glob (repeatable)
xg "pattern" -g "!*_test.rs"  # Exclude paths matching glob (! prefix)
xg "pattern" --format llm     # Markdown output for LLMs
xg "pattern" --changed        # Only git changed files
xg "pattern" --exclude vendor  # Exclude paths containing "vendor"
xg "pattern" --absolute-paths # Show absolute paths
xg "pattern" --no-hints       # Suppress regex pattern hints
xg --find "*.rs"              # Find files by glob pattern
xg --list-types               # Show supported file types
xg status                     # Show index status
xg init                       # Explicitly rebuild index

Search is smart-case by default: an all-lowercase pattern matches case-insensitively, while any uppercase letter makes the search case-sensitive. Use -i or -s to override (priority: -i > -s > smart-case).
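
The precedence rule above can be sketched as a small decision function. This is an illustration only, not xgrep's actual code: the function name and parameters are hypothetical, and the sketch does not special-case regex escapes such as \W, which a real implementation may handle differently.

```python
def is_case_insensitive(pattern: str,
                        force_insensitive: bool = False,
                        force_sensitive: bool = False) -> bool:
    """Return True when the search should ignore case.

    Hypothetical helper mirroring the documented priority:
    -i > -s > smart-case.
    """
    if force_insensitive:   # -i wins over everything else
        return True
    if force_sensitive:     # -s disables smart-case
        return False
    # Smart-case: any uppercase letter makes the search case-sensitive.
    return not any(ch.isupper() for ch in pattern)
```

With this sketch, "pattern" searches case-insensitively, "Pattern" searches case-sensitively, and passing both flags behaves like -i alone.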

Environment Variables

| Variable | Description | Default |
|---|---|---|
| XGREP_LLM_CONTEXT | Default context lines for --format llm | 3 |
| XGREP_ABSOLUTE_PATHS | Set to 1 to always use absolute paths | unset |
| XGREP_NO_HINTS | Set to 1 to suppress regex pattern hints | unset |

Run xg --help for all options.
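
When xg is launched from a script, these variables can be set per invocation rather than globally. The helper below is a minimal sketch: xgrep_env and its parameters are hypothetical names, and only the three variable names and the "set to 1" convention come from the table above.

```python
import os

def xgrep_env(llm_context=None, absolute_paths=False, no_hints=False, base=None):
    """Build an environment mapping for running xg as a subprocess.

    `base` defaults to a copy of the current process environment.
    """
    env = dict(os.environ if base is None else base)
    if llm_context is not None:
        env["XGREP_LLM_CONTEXT"] = str(llm_context)   # default is 3 when unset
    if absolute_paths:
        env["XGREP_ABSOLUTE_PATHS"] = "1"
    if no_hints:
        env["XGREP_NO_HINTS"] = "1"
    return env

# Example use (requires xg on PATH):
#   import subprocess
#   subprocess.run(["xg", "pattern", "--format", "llm"],
#                  env=xgrep_env(llm_context=5))
```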

MCP Server

xgrep runs as an MCP server, giving AI coding tools fast indexed search.

xg serve                        # Start MCP server
xg serve --root /path/to/repo   # Specific directory

Claude Code

{
  "mcpServers": {
    "xgrep": {
      "command": "xg",
      "args": ["serve"]
    }
  }
}

Available tools: search, find_definitions, read_file, index_status, build_index

See docs/agents.md for agent-oriented usage patterns.
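
For readers writing their own client, the sketch below shows the shape of the JSON-RPC 2.0 messages that MCP clients send. The tools/list and tools/call method names come from the MCP specification; the "pattern" argument key is an assumption, since the real input schema for xgrep's search tool should be read from the tools/list response. A real client must also complete the MCP initialize handshake before calling tools, which is omitted here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per connection

def jsonrpc_request(method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

def list_tools():
    """Ask the server which tools it exposes and their input schemas."""
    return jsonrpc_request("tools/list")

def call_tool(name, arguments):
    """Invoke a tool by name, e.g. "search" or "index_status"."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})

# "pattern" is a hypothetical argument name used for illustration.
request_line = call_tool("search", {"pattern": "handle_request"})
```

With the stdio transport, each such line is written to the stdin of the xg serve process and responses are read from its stdout.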

Performance

Benchmarked with hyperfine on Apple M4, 32GB RAM, macOS. All numbers are warm cache, after index build.

Search: Linux kernel (92,947 files, 2.0GB)

| Query | xg | ripgrep | vs ripgrep |
|---|---|---|---|
| struct file_operations | 37ms | 1,687ms | 46x faster |
| printk | 52ms | 1,756ms | 34x faster |
| EXPORT_SYMBOL | 66ms | 1,773ms | 27x faster |

File discovery: next.js (27,332 files)

| Query | xg --find | fd | vs fd |
|---|---|---|---|
| *.ts (4,838 files) | 20.8ms | 187.3ms | 9x faster |
| config (substring) | 12.7ms | 188.1ms | 15x faster |
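
The "vs ripgrep" and "vs fd" columns appear to be the ratio of the two reported timings, rounded to the nearest integer. The short check below reproduces them from the numbers in the tables; the variable and function names are illustrative only.

```python
# (xg time in ms, baseline time in ms), copied from the tables above
timings_ms = {
    "struct file_operations": (37, 1687),
    "printk": (52, 1756),
    "EXPORT_SYMBOL": (66, 1773),
    "*.ts": (20.8, 187.3),
    "config": (12.7, 188.1),
}

def speedup(xg_ms, baseline_ms):
    """Ratio of baseline time to xg time, rounded to a whole multiple."""
    return round(baseline_ms / xg_ms)

speedups = {query: speedup(*pair) for query, pair in timings_ms.items()}
```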

Index cost

| Metric | xgrep | zoekt |
|---|---|---|
| Build time (Linux kernel) | 6s | 46s |
| Index size | 175MB (8% of source) | 3.0GB (155%) |
| Breakeven | ~2 searches | |

First run includes a one-time index build. See docs/benchmarks.md for full results including medium/small repos.

Limitations

  • Short queries (< 3 chars) bypass the index — no speed advantage over ripgrep
  • Tiny files (< 3 bytes) hold no trigrams and are invisible to indexed content search — a deliberate trade-off of the trigram index
  • Index staleness — background rebuild runs every ~30s. Use --fresh for up-to-date results
  • find_definitions uses regex heuristics, not AST analysis — false positives expected

When to use ripgrep instead: one-off searches, very small codebases (< 100 files), or queries shorter than 3 characters.
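
The first two limitations follow directly from how a trigram index works. The toy index below is a sketch under simplifying assumptions (in-memory sets, case-sensitive byte matching, hypothetical function names) and is not xgrep's implementation. It shows that a query shorter than 3 bytes yields no trigrams, so the only option is a full scan, and that a file shorter than 3 bytes never enters any posting list.

```python
from collections import defaultdict

def trigrams(data: bytes) -> set[bytes]:
    """All distinct 3-byte windows; empty when len(data) < 3."""
    return {data[i:i + 3] for i in range(len(data) - 2)}

def build_index(files: dict[str, bytes]) -> dict[bytes, set[str]]:
    """Map each trigram to the set of paths that contain it."""
    index = defaultdict(set)
    for path, content in files.items():
        for tri in trigrams(content):
            index[tri].add(path)
    return index

def search(index, files, query: bytes) -> list[str]:
    tris = trigrams(query)
    if not tris:
        # Short query: the index cannot narrow anything, scan every file.
        candidates = set(files)
    else:
        # A match must contain every trigram of the query.
        candidates = set.intersection(*(index.get(t, set()) for t in tris))
    # Candidates can be false positives, so verify against file contents.
    return sorted(p for p in candidates if query in files[p])

files = {
    "a.rs": b"fn handle_request() {}",
    "b.rs": b"fn main() {}",
    "c.txt": b"ok",
}
index = build_index(files)
```

Here search(index, files, b"handle") only verifies a.rs, while search(index, files, b"fn") has to look at all three files.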

How It Works

  1. Index Build: Walks the codebase, extracts 3-byte trigrams from each file, builds an inverted index with delta+varint compression
  2. Search: Extracts trigrams from query, intersects posting lists to find candidate files, verifies matches
  3. Hybrid Mode: Combines index results with direct scanning of changed files when index is stale
  4. MCP Server: Exposes search via JSON-RPC over stdio, with token-aware truncation
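
Step 1 mentions delta+varint compression of the inverted index. The sketch below illustrates the general technique on a sorted list of file ids, using LEB128-style varints (7 payload bits per byte, high bit set when more bytes follow). xgrep's actual on-disk layout is not described here, so the function names and byte format are assumptions.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_postings(ids: list[int]) -> bytes:
    """Store gaps between ascending ids instead of the ids themselves."""
    out = bytearray()
    prev = 0
    for doc_id in ids:  # ids must be sorted ascending
        out += encode_varint(doc_id - prev)
        prev = doc_id
    return bytes(out)

def decode_postings(data: bytes) -> list[int]:
    """Inverse of encode_postings."""
    ids, current, value, shift = [], 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value  # undo the delta step
            ids.append(current)
            value, shift = 0, 0
    return ids

postings = [3, 7, 300, 100000]
encoded = encode_postings(postings)
```

Because the gaps between neighboring ids are small, most of them fit in one or two bytes. The four ids above take 7 bytes, compared with 16 bytes as fixed 32-bit integers.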

Contributing

See CONTRIBUTING.md for development setup and guidelines.

License

MIT
