fast_file_search


SUMMARY

Ultra-fast file search toolkit for AI agents, Neovim, and code editors. MCP server for Claude Code/Cursor/Codex, frecency-ranked search, symbol indexing via tree-sitter, and a Rust core that outperforms ripgrep and fzf in long-running processes.

README.md

A code-aware file search CLI for humans and AI agents. Really fast.

ffs replaces find, fd, grep, rg, glob, and cat with a single
typo-resistant, frecency-ranked binary, and adds tree-sitter powered
code-navigation (symbol, callers, callees, refs, flow, impact)
and a token-budget aware file reader for AI agents.


Install

curl -fsSL "https://raw.githubusercontent.com/quangdang46/fast_file_search/main/install.sh?$(date +%s)" | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/quangdang46/fast_file_search/main/install.ps1 | iex

The scripts detect your platform, fetch the matching binary from GitHub
Releases, verify the SHA-256 sidecar, and atomically install to
~/.local/bin/ffs (Unix) or %LOCALAPPDATA%\ffs\bin (Windows).

Supported platforms

  • Linux x86_64 / aarch64 (musl-linked, portable across glibc versions)
  • macOS x86_64 / aarch64
  • Windows x86_64 / aarch64

Quick start

ffs --help
ffs index                                  # one-time warm-up (~200ms on a 10k-file repo)
ffs find UnifiedScanner
ffs find grep --scored                     # role-aware file search (+20 impl, -15 test)
ffs grep '\bTODO\b' --root crates/
ffs grep 'fn main' --group                # symbol-grouped grep output
ffs symbol FilePicker
ffs callers UnifiedScanner
ffs read crates/ffs-engine/src/lib.rs --budget 5000 --filter minimal
ffs map --depth 3
ffs mcp                                    # run as MCP server over stdio

ffs index writes a tree-sitter symbol-index cache to <repo>/.ffs/. Subsequent
ffs symbol/callers/refs/flow/siblings/impact invocations skip the
re-parse and load the cache directly — sub-200 ms on a Linux-kernel-sized repo.
The cache is invalidated automatically on schema bumps, git HEAD changes, or
significant file-count drift. Add .ffs/ to your .gitignore (the
repository's own .gitignore already does this).

Subcommands

ffs find        Find files by name. --scored for role-aware ranking (impl +20, test -15)
ffs glob        Match files by glob (replaces glob, shell **)
ffs grep        Search file contents. --group for symbol-grouped output
ffs read        Read a file with token-budget aware truncation (replaces cat)
ffs outline     Render a file's structural outline
ffs symbol      Look up symbol definitions (tree-sitter powered)
ffs callers     List call sites of a symbol
ffs callees     List symbols referenced inside a symbol body
ffs refs        Definitions + single-hop usages of a symbol in one shot
ffs flow        Drill-down envelope per definition (def + body + callees + callers)
ffs siblings    Sibling symbols (peers in the same parent scope)
ffs deps        File imports + the workspace files that depend on it
ffs impact      Rank files by how much they'd be affected if a symbol changed
ffs index       Build / refresh the on-disk indexes (Bigram, Bloom, Symbol, Outline)
ffs map         Render the workspace as a tree annotated with file count and tokens
ffs overview    High-signal summary of the workspace (languages, top symbols, ...)
ffs mcp         Run as an MCP server over stdio
ffs guide       Print the embedded agent guide

Pass --format json for machine-readable output. Use --root <path> to override
the working directory globally.


MCP server

ffs mcp (or the standalone ffs-mcp binary) speaks JSON-RPC over stdio
and registers 15 tools that any MCP-capable agent (Claude Code, Codex,
OpenCode, Cursor, Cline, …) can call:

Tools registered

Tool            What it answers
ffs_find        Fuzzy file-name search. Smart-case, frecency-ranked, glob constraints, git-aware.
ffs_grep        Content search. Plain / regex / fuzzy auto-detect, pagination cursor, definition-first hinting.
ffs_multi_grep  OR-logic multi-pattern content search via SIMD Aho-Corasick.
ffs_symbol      Exact + prefix lookup over the tree-sitter symbol index (16 languages).
ffs_callers     Find call sites of a symbol. Bloom-filter narrowed candidates → literal-text confirm pass.
ffs_callees     Symbols referenced inside the body of a definition.
ffs_refs        Definitions + single-hop usages of a symbol in one shot. Mirrors ffs refs from the CLI.
ffs_flow        Drill-down envelope per definition: def metadata + body excerpt + top-N callees + top-N callers.
ffs_impact      Rank workspace files by how much they'd be affected if name changed.
ffs_outline     Structural outline of a file (functions, classes, top-level decls). Agent-friendly default view.
ffs_siblings    Peers of a symbol in its parent scope; surfaces the rest of the impl/class around a method.
ffs_deps        A file's imports plus the workspace files that depend on it. Blast-radius estimate for changes.
ffs_map         Workspace tree annotated with file count and per-directory token estimate.
ffs_overview    High-signal repo summary: languages, top-defined symbols, entry-point candidates.
ffs_read        Token-budget aware file read. Maps maxTokens to ~85% body × 4 bytes/token, applies filters.
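
The ffs_read row compresses its budget rule into one phrase. The sketch
below shows one plausible reading of it: 85% of the token budget goes to
the file body, at roughly 4 bytes per token, and clipped output ends with a
[truncated to budget] footer. The function names and the line-based
clipping are assumptions made for illustration; this is not the ffs-budget
implementation.

```rust
// Hypothetical sketch of the budget rule; names are not from ffs-budget.
const BYTES_PER_TOKEN: usize = 4; // rough average, as stated in the tool table
const BODY_PERCENT: usize = 85; // share of the budget assumed to go to the body

/// Convert a token budget into a byte budget for the file body.
fn body_byte_budget(max_tokens: usize) -> usize {
    max_tokens * BYTES_PER_TOKEN * BODY_PERCENT / 100
}

/// Keep whole lines until the byte budget is used up, then add a footer
/// so the caller can tell that the output was clipped.
fn truncate_to_budget(text: &str, max_tokens: usize) -> String {
    let budget = body_byte_budget(max_tokens);
    if text.len() <= budget {
        return text.to_string();
    }
    let mut out = String::new();
    for line in text.lines() {
        // +1 accounts for the newline pushed below.
        if out.len() + line.len() + 1 > budget {
            break;
        }
        out.push_str(line);
        out.push('\n');
    }
    out.push_str("[truncated to budget]\n");
    out
}
```

Under this reading, a budget of 5000 tokens leaves 17000 bytes for the body.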

Recommended agent prompt — drop into CLAUDE.md (or equivalent):

For any file search, grep, or symbol lookup in the current git-indexed
directory, use ffs tools.

Why ffs

  • Typo-resistant fuzzy matching for both paths and content. *.rs !test/ shcema
    is a valid query; even with a typo in shcema it still finds matches.
  • Frecency-ranked. Files you open often rank higher next time. Warm-up uses
    git touch history.
  • Tree-sitter symbol index across 16 languages — answer code questions, not
    just file-name questions.
  • Bigram + Bloom pre-filter stack. ffs callers SomeSymbol typically inspects
    fewer than 30 files on a 10k-file repo.
  • Token-budget aware reader for AI agents. ffs read path --budget 5000 clips
    the body but always preserves the file header and a [truncated to budget]
    footer so the agent knows the output was clipped.
  • One long-lived process when used via library/MCP. No per-call subprocess
    spawn, no re-reading .gitignore, no rebuilding state. After the first call
    every subsequent search hits warm memory.

On a 500k-file Chromium checkout, that is the difference between 3-9 seconds
per ripgrep spawn and sub-10 ms per ffs query.
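
Frecency combines how often a file is opened with how recently it was
opened. The README does not give the formula, so the sketch below assumes a
common one: the open count damped by an exponential decay. The Access
struct, the half-life parameter, and both function names are hypothetical.

```rust
/// One file's access history (hypothetical shape, not the ffs frecency DB).
struct Access {
    opens: u32,
    seconds_since_last_open: f64,
}

/// Assumed frecency formula: open count halved for every `half_life_secs`
/// that passed since the last open.
fn frecency(a: &Access, half_life_secs: f64) -> f64 {
    let decay = 0.5_f64.powf(a.seconds_since_last_open / half_life_secs);
    f64::from(a.opens) * decay
}

/// Return paths ordered from highest to lowest frecency.
fn rank<'a>(files: &'a [(&'a str, Access)], half_life_secs: f64) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f64)> = files
        .iter()
        .map(|(path, a)| (*path, frecency(a, half_life_secs)))
        .collect();
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.into_iter().map(|(path, _)| path).collect()
}
```

With a one-day half-life, a file opened three times just now outranks a
file opened ten times a week ago.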


Architecture

ffs is a single Rust workspace organised as a layered core with multiple
thin frontends. Every surface (CLI, MCP, Neovim, Node, Bun, C ABI) calls
into the same engine — there is no duplicated search logic.

Layered design

┌──────────────────────────────────────────────────────────────────────┐
│  Frontends                                                           │
│  ─────────                                                           │
│  ffs-cli    ffs-mcp    ffs-c                                         │
│  (binary)   (stdio     (C ABI                                        │
│             JSON-RPC)  .so/.dylib/.dll)                              │
└────────────────────────────────┬─────────────────────────────────────┘
                                 │ all surfaces share one core
                                 ▼
┌──────────────────────────────────────────────────────────────────────┐
│  Engine layer                                                        │
│  ────────────                                                        │
│  ffs-engine          unified scanner · dispatch · ranking · memory   │
│  ffs-query-parser    `*.rs !test/ shcema` → constraints + modes      │
└────────────────────────────────┬─────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────────┐
│  Capability layer                                                    │
│  ────────────────                                                    │
│  ffs-symbol     tree-sitter index · bloom · bigram pre-filter        │
│  ffs-grep       SIMD literal / regex grep                            │
│  ffs-budget     token-aware reader · comment + whitespace filters    │
└────────────────────────────────┬─────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────────┐
│  Core layer (ffs-core)                                               │
│  ─────────────────────                                               │
│  scan · file_picker · score · bigram_filter · git · frecency         │
│  background_watcher · ignore · simd_path · constraints               │
└──────────────────────────────────────────────────────────────────────┘

Each layer only depends on the ones below it. Adding a new frontend
(e.g. a Python binding) means wrapping ffs-c; it never reaches into
ffs-core directly.

Crate responsibilities

Crate             Role
ffs-core          Filesystem scan, frecency, fuzzy scoring, bigram filter, git integration, watcher.
ffs-query-parser  Parses the query DSL (globs, negations, regex, fuzzy fallback).
ffs-symbol        Tree-sitter symbol index, bloom filter, outline cache, on-disk artifact format.
ffs-grep          SIMD literal & regex content search backend.
ffs-budget        Token-budget aware file reader and content filters for AI agents.
ffs-engine        Glue layer: dispatch, ranking, prefilter, in-memory state coordination.
ffs-cli           The ffs binary, subcommand routing, on-disk cache (.ffs/).
ffs-mcp           JSON-RPC MCP server exposing 15 tools over stdio.
ffs-c             Stable C ABI (libffs_c, header in crates/ffs-c/include/ffs.h).

Query path (e.g. ffs callers UnifiedScanner)

   user input                        on-disk artifacts in <repo>/.ffs/
   ───────────                       ─────────────────────────────────
        │                            ┌────────────────────────────┐
        ▼                            │ symbol_index.postcard.zst  │
   ┌─────────────┐                   │ bigram.postcard.zst        │
   │ ffs-cli     │                   │ meta.json (HEAD, schema)   │
   │ subcommand  │                   └─────────────┬──────────────┘
   │ dispatch    │                                 │ mmap + decode
   └──────┬──────┘                                 ▼
          │                                  ┌──────────────┐
          ▼                                  │ ffs-symbol   │
   ┌─────────────┐    parse query DSL        │ index + bloom│
   │ ffs-query-  │ ────────────────────►     └──────┬───────┘
   │ parser      │                                  │
   └──────┬──────┘                                  │ candidate
          │ Mode + Constraints                      │ file set
          ▼                                         │
   ┌─────────────────────────────────────────┐     │
   │ ffs-engine                              │ ◄───┘
   │  classify ▸ prefilter ▸ dispatch        │
   │     │          │           │            │
   │     ▼          ▼           ▼            │
   │  symbol     bigram      grep / scan     │
   │  lookup     filter      backends        │
   └────────────────────┬────────────────────┘
                        │ ranked hits
                        ▼
   ┌─────────────────────────────────────────┐
   │ ffs-engine::ranking                     │
   │   frecency · fuzzy score · git-touch    │
   └────────────────────┬────────────────────┘
                        ▼
                ┌───────────────┐
                │ formatter     │  text │ json │ MCP tool result
                └───────────────┘
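
The callers path narrows candidates with a Bloom filter and then confirms
them with a literal-text pass. A minimal sketch of that two-stage shape
follows. The 64-bit filter, the two hash probes, and all type and function
names are illustrative assumptions, not the ffs-symbol data structures.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Tiny Bloom filter over identifiers (64 bits, 2 probes).
/// Real filters are far larger; the sizes here are for illustration only.
#[derive(Default)]
struct Bloom {
    bits: u64,
}

impl Bloom {
    fn probes(ident: &str) -> [u32; 2] {
        let mut h = DefaultHasher::new();
        ident.hash(&mut h);
        let v = h.finish();
        [(v % 64) as u32, ((v >> 32) % 64) as u32]
    }

    fn insert(&mut self, ident: &str) {
        for p in Self::probes(ident) {
            self.bits |= 1u64 << p;
        }
    }

    /// false means "definitely absent"; true means "possibly present".
    fn may_contain(&self, ident: &str) -> bool {
        Self::probes(ident)
            .iter()
            .all(|p| (self.bits & (1u64 << *p)) != 0)
    }
}

struct IndexedFile {
    path: &'static str,
    text: &'static str,
    bloom: Bloom,
}

fn index_file(path: &'static str, text: &'static str) -> IndexedFile {
    let mut bloom = Bloom::default();
    // Split on non-identifier characters and record every identifier.
    for ident in text.split(|c: char| !(c.is_alphanumeric() || c == '_')) {
        if !ident.is_empty() {
            bloom.insert(ident);
        }
    }
    IndexedFile { path, text, bloom }
}

/// Stage 1: the Bloom filter discards files that cannot contain `symbol`.
/// Stage 2: a literal search removes the filter's false positives.
fn files_mentioning(files: &[IndexedFile], symbol: &str) -> Vec<&'static str> {
    files
        .iter()
        .filter(|f| f.bloom.may_contain(symbol))
        .filter(|f| f.text.contains(symbol))
        .map(|f| f.path)
        .collect()
}
```

The filter can report a file that does not contain the symbol, but never
misses one that does, so the confirm pass only has to remove false
positives.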

Indexing path (ffs index)

  walk repo (gitignore-aware, parallel)
  ────────────────────────────────────────►   ffs-core::scan
                                                    │
                                                    ▼
                                           ┌──────────────────┐
                                           │ ffs-symbol       │
                                           │  tree-sitter     │
                                           │  parse · extract │
                                           │  decls + scopes  │
                                           └────────┬─────────┘
                                                    ▼
                                           ┌──────────────────┐
                                           │ build artifacts  │
                                           │  • bigram        │
                                           │  • bloom         │
                                           │  • symbol index  │
                                           │  • outline cache │
                                           └────────┬─────────┘
                                                    ▼
                                           write `<repo>/.ffs/*.postcard.zst`
                                           + meta.json (schema · HEAD · count)

The cache invalidates automatically on schema bumps, git HEAD changes,
or significant file-count drift. Subsequent ffs symbol / callers /
refs / flow / siblings / impact invocations skip the re-parse
and load the cache directly — sub-200 ms on a Linux-kernel-sized repo.
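
The three invalidation triggers can be pictured as a comparison between the
stored meta.json and the current repository state. The sketch below assumes
field names, types, and a relative drift threshold; the README only says
that meta.json records the schema, HEAD, and file count, and it does not
say how much drift counts as significant.

```rust
/// Values the README says meta.json records (schema, HEAD, count).
/// Field names and types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct CacheMeta {
    schema_version: u32,
    git_head: String,
    file_count: usize,
}

/// Returns true when the on-disk cache should be rebuilt.
/// `max_drift` is the tolerated relative change in file count
/// (a hypothetical knob; 0.10 means 10%).
fn is_stale(cached: &CacheMeta, current: &CacheMeta, max_drift: f64) -> bool {
    if cached.schema_version != current.schema_version {
        return true; // schema bump
    }
    if cached.git_head != current.git_head {
        return true; // git HEAD changed
    }
    // Relative file-count drift, guarding against division by zero.
    let before = cached.file_count.max(1) as f64;
    let delta = (current.file_count as f64 - cached.file_count as f64).abs();
    delta / before > max_drift
}
```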

Background watcher (long-lived processes)

When ffs is embedded as a library (ffs-c, ffs-mcp) it
keeps a single process alive and runs a notify-based background thread
that incrementally updates the in-memory index on filesystem events.
That is why MCP hits stay sub-10 ms after the first call —
no .gitignore re-read, no cold scan, no subprocess spawn.

┌─────────────────┐       fs events        ┌──────────────────────┐
│ host process    │ ◄──────────────────────│ background_watcher   │
│ (mcp / cli /    │                        │  (notify crate)      │
│  external C)    │                        └──────────┬───────────┘
│                 │                                   │ patch
│ ┌─────────────┐ │                                   ▼
│ │ in-memory   │ ├──── query ────────────►   ffs-engine ──► result
│ │ index +     │ │
│ │ frecency DB │ │
│ └─────────────┘ │
└─────────────────┘
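
The idea of patching the index instead of rebuilding it can be sketched
with the standard library alone. The FsEvent enum below is a simplified
stand-in for the notify crate's event type, and PathIndex is a hypothetical
index holding only paths; the real watcher and index carry more state.

```rust
use std::collections::BTreeSet;
use std::sync::mpsc;

/// Simplified stand-in for watcher events (not the notify crate's type).
enum FsEvent {
    Created(String),
    Removed(String),
}

/// In-memory path index that is patched in place.
#[derive(Default)]
struct PathIndex {
    paths: BTreeSet<String>,
}

impl PathIndex {
    /// Apply one filesystem event to the index.
    fn apply(&mut self, event: FsEvent) {
        match event {
            FsEvent::Created(path) => {
                self.paths.insert(path);
            }
            FsEvent::Removed(path) => {
                self.paths.remove(&path);
            }
        }
    }

    /// Apply every event currently queued, without blocking.
    /// Returns how many events were applied.
    fn drain(&mut self, rx: &mpsc::Receiver<FsEvent>) -> usize {
        let mut applied = 0;
        while let Ok(event) = rx.try_recv() {
            self.apply(event);
            applied += 1;
        }
        applied
    }
}
```

A query handler could call drain before answering, so each query pays only
for the events that arrived since the previous one.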

Performance & Benchmarks

All numbers below are single-threaded medians measured with Criterion.rs on
a Linux x86-64 machine. Benchmarks run weekly in CI (bench-track.yml); the
raw Criterion HTML reports are published as GitHub Actions artifacts.

Engine dispatch (ffs-engine)

End-to-end latency from raw query string to ranked results over a 256-file
fixture repo (32 dirs × 8 files).

Query type      Example                                Median
Symbol lookup   worker_05_3                            202 ns
Concept / NL    how does the worker handle payloads    205 ns
File path       mod_03/file_2.rs                       2.2 µs
Glob            **/*.rs                                2.8 µs

Query classification alone: 107–1,640 ns depending on pattern complexity.
Scoring a single result (score_one): ~3 ns.

Bigram index (ffs-core)

Query latency across different index sizes:

Index size     2-char query   6-char query   14-char query
10 K files     46 ns          120 ns         314 ns
100 K files    368 ns         410 ns         443 ns
500 K files    761 ns         1.6 µs         1.6 µs

Index build (100 K files): 86 ms (short content) / 2.5 s (4 KB/file).
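
A bigram pre-filter rests on one observation: a file can contain a query
string only if it contains every pair of adjacent bytes in that query. The
sketch below stores one HashSet of bigrams per file, which is an assumption
chosen for readability; the real index is presumably far more compact. All
function names are hypothetical.

```rust
use std::collections::HashSet;

/// Collect every pair of adjacent bytes, ASCII-lowercased so that the
/// filter is case-insensitive.
fn bigrams(text: &str) -> HashSet<[u8; 2]> {
    let bytes = text.to_ascii_lowercase().into_bytes();
    bytes.windows(2).map(|w| [w[0], w[1]]).collect()
}

/// A file can contain `query` only if it has every bigram of `query`.
/// Passing the filter does not prove a match; a real search still has
/// to confirm it.
fn may_match(file_bigrams: &HashSet<[u8; 2]>, query: &str) -> bool {
    bigrams(query).iter().all(|b| file_bigrams.contains(b))
}

/// Return the indices of the files that survive the pre-filter.
fn candidates(index: &[HashSet<[u8; 2]>], query: &str) -> Vec<usize> {
    index
        .iter()
        .enumerate()
        .filter(|(_, set)| may_match(set, query))
        .map(|(i, _)| i)
        .collect()
}
```

A query shorter than two bytes has no bigrams, so in this sketch every file
passes the filter and the full search does all the work.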

Case-insensitive memmem (ffs-core, AVX2 packed-pair)

Searching real source files from this repo:

Haystack          Needle                          packed_pair   memchr2 baseline   Speedup
grep.rs (80 KB)   "fn" (hit)                      44 ns         204 ns             4.6×
grep.rs (80 KB)   "search_file" (hit)             1.6 µs        13.6 µs            8.6×
combined (1 MB)   "content_cache_budget" (hit)    37 µs         400 µs             10.8×
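
Packed-pair search usually means checking two bytes of the needle at each
candidate position before paying for a full comparison. The scalar sketch
below shows only that pair-then-verify structure, using the first and last
needle bytes as the pair. It assumes ffs follows this general approach and
does not reproduce the AVX2 code, which would test many positions per
instruction. The function name is hypothetical.

```rust
/// Case-insensitive (ASCII) substring search with a two-byte pre-check.
/// Returns the byte offset of the first match.
fn find_ci(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    let last_off = needle.len() - 1;
    let first = needle[0].to_ascii_lowercase();
    let last = needle[last_off].to_ascii_lowercase();
    for start in 0..=haystack.len() - needle.len() {
        // Cheap pair check: most positions are rejected here.
        if haystack[start].to_ascii_lowercase() != first
            || haystack[start + last_off].to_ascii_lowercase() != last
        {
            continue;
        }
        // Full verification only for positions that passed the pair check.
        if haystack[start..start + needle.len()].eq_ignore_ascii_case(needle) {
            return Some(start);
        }
    }
    None
}
```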

Query parser (ffs-query-parser)

Query                                                    Median
*.rs (extension)                                         60 ns
struct (simple text)                                     234 ns
src name *.rs !test /lib/ status:modified (complex)      523 ns
26-token worst case                                      3.7 µs
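
The queries in the table mix globs, negations, key:value filters, path
segments, and plain text. A minimal sketch of how such a query could be
split into typed tokens is shown below. The token kinds and the
classification rules are guesses based on the example queries in this
README; the real ffs-query-parser grammar may differ.

```rust
/// Hypothetical token kinds inferred from the README's example queries.
#[derive(Debug, PartialEq)]
enum Token {
    Glob(String),           // contains `*` or `?`, e.g. `*.rs`
    Exclude(String),        // leading `!`, e.g. `!test/`
    Filter(String, String), // `key:value`, e.g. `status:modified`
    PathSegment(String),    // contains `/`, e.g. `/lib/`
    Text(String),           // everything else, matched fuzzily
}

/// Classify one whitespace-separated word of the query.
fn classify(word: &str) -> Token {
    if let Some(rest) = word.strip_prefix('!') {
        return Token::Exclude(rest.to_string());
    }
    if word.contains('*') || word.contains('?') {
        return Token::Glob(word.to_string());
    }
    if let Some((key, value)) = word.split_once(':') {
        return Token::Filter(key.to_string(), value.to_string());
    }
    if word.contains('/') {
        return Token::PathSegment(word.to_string());
    }
    Token::Text(word.to_string())
}

/// Split a query on whitespace and classify each word.
fn parse(query: &str) -> Vec<Token> {
    query.split_whitespace().map(classify).collect()
}
```

In this sketch the misspelled shcema stays a Text token; the typo
tolerance would come from the fuzzy matcher that consumes it, not from the
parser.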

Symbol index (ffs-symbol)

Operation                                  Median
Index one Rust file (50 lines)             219 µs
Exact symbol lookup (1 K files indexed)    95 µs
Prefix lookup ("Wor")                      115 µs
Bloom insert (8 K identifiers)             35 µs

Token budget (ffs-budget)

Operation                      Size          Median
Smart truncate                 128 lines     5.6 µs
Smart truncate                 8 192 lines   374 µs
Comment filter (none)          64 KB         1.4 µs
Comment filter (aggressive)    64 KB         329 µs
Comment filter (aggressive)    512 KB        2.4 ms

Reproduce locally

# requires Zig for zlob feature
cargo bench --features zlob \
  -p ffs-search -p ffs-query-parser \
  -p ffs-symbol -p ffs-budget -p ffs-engine

# HTML reports land in target/criterion/

C ABI

The C library provides a stable ABI for binding from other languages:

make install  # installs libffs_c and ffs.h to /usr/local

See crates/ffs-c/include/ffs.h for the API surface.


Build from source

git clone https://github.com/quangdang46/fast_file_search.git
cd fast_file_search
cargo build --release -p ffs-cli --features zlob
./target/release/ffs --version

zlob enables a Zig-compiled glob matcher; requires Zig at build time.
Without it, ffs falls back to globset (pure Rust). Drop --features zlob
if you don't have Zig installed.

The full workspace (make build) also produces:

  • target/release/ffs-mcp — MCP server binary
  • target/release/libffs_c.{so,dylib,dll} — C FFI library

Repository layout

crates/
  ffs-core/         # Rust core SDK
  ffs-cli/          # the `ffs` binary
  ffs-mcp/          # MCP server (`ffs-mcp` binary)
  ffs-c/            # C FFI library (libffs_c, header in include/ffs.h)
  ffs-engine/       # unified scanner + dispatch + ranking
  ffs-symbol/       # tree-sitter symbol index, bloom + bigram filters
  ffs-budget/       # token-budget reader, comment/whitespace filters
  ffs-grep/         # SIMD literal/regex grep
  ffs-query-parser/ # query language parser (constraints, fuzzy, regex modes)
install.sh          # CLI installer (this README's curl|bash target)
install-mcp.sh      # MCP server installer
.github/workflows/
  release.yaml      # cross-compile + GitHub Releases on push to main and v* tags
  rust.yml          # fmt + clippy + test on every PR
  …


Rust library API

FFS crates can be used directly as Rust dependencies:

[dependencies]
ffs-engine = { git = "https://github.com/quangdang46/fast_file_search" }
ffs-search = { git = "https://github.com/quangdang46/fast_file_search" }
ffs-symbol = { git = "https://github.com/quangdang46/fast_file_search" }

High-level API (ffs_engine::api)

use std::path::Path;

use ffs_engine::api::{grep, find, outline, GrepOptions, FindOptions};

// Workspace root to search. Assumption: the API accepts a &Path here.
let root = Path::new(".");

// Grep with symbol grouping
let result = grep(root, "fn main", &GrepOptions::default());
// result.files[0].groups[0].matches: grouped by enclosing symbol

// Find with role-based scoring
let result = find(root, "auth", &FindOptions::default());
// result.files[0].score, .role, .score_breakdown

// File outline
let result = outline(Path::new("src/main.rs"));
// result.entries: tree-sitter OutlineEntry[]

Role detection (ffs_search::role)

use std::path::Path;

use ffs_search::role::detect_role;

let role = detect_role(Path::new("src/tests/mod.rs"));
assert_eq!(role.as_str(), "test");  // auto-penalized: -15
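
To make the +20 / -15 adjustment concrete, here is a self-contained sketch
of role-based scoring that does not depend on the ffs crates. The Role
enum, guess_role, adjusted_score, and the detection rules (a test or tests
path component, an .rs extension) are hypothetical; only the two score
offsets come from this README.

```rust
use std::path::Path;

/// Hypothetical roles; ffs may distinguish more.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Role {
    Impl,
    Test,
    Other,
}

/// Guess a file's role from its path (assumed rules, not ffs's own).
fn guess_role(path: &Path) -> Role {
    let in_test_dir = path
        .components()
        .any(|c| matches!(c.as_os_str().to_str(), Some("test") | Some("tests")));
    if in_test_dir {
        return Role::Test;
    }
    match path.extension().and_then(|e| e.to_str()) {
        Some("rs") => Role::Impl,
        _ => Role::Other,
    }
}

/// Apply the offsets quoted in the README: +20 for impl, -15 for test.
fn adjusted_score(base: i32, role: Role) -> i32 {
    match role {
        Role::Impl => base + 20,
        Role::Test => base - 15,
        Role::Other => base,
    }
}
```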

Contributing

PRs welcome. Run make check before submitting:

  • make format (rustfmt)
  • make lint (clippy)
  • make test

You are welcome to use agentic coding tools, but human review is mandatory.

License

MIT — open source forever.
