fast_file_search
Ultra-fast file search toolkit for AI agents, Neovim, and code editors. MCP server for Claude Code/Cursor/Codex, frecency-ranked search, symbol indexing via tree-sitter, and a Rust core that outperforms ripgrep and fzf in long-running processes.
A code-aware file search CLI for humans and AI agents. Really fast.
ffs replaces find, fd, grep, rg, glob, and cat with a single
typo-resistant, frecency-ranked binary, and adds tree-sitter powered
code-navigation (symbol, callers, callees, refs, flow, impact)
and a token-budget aware file reader for AI agents.
Install
curl -fsSL "https://raw.githubusercontent.com/quangdang46/fast_file_search/main/install.sh?$(date +%s)" | bash
Windows (PowerShell):
irm https://raw.githubusercontent.com/quangdang46/fast_file_search/main/install.ps1 | iex
The scripts detect your platform, fetch the matching binary from
GitHub Releases,
verify the SHA-256 sidecar, and atomically install to ~/.local/bin/ffs
(Unix) or %LOCALAPPDATA%\ffs\bin (Windows).
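The "atomically install" step above is usually a write-then-rename. The sketch below illustrates that pattern in Rust using only the standard library. It is not the installer's actual code (the real scripts are shell and PowerShell), the function name `atomic_install` is hypothetical, and the SHA-256 check is left out because it needs a hashing crate.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` to a temporary file next to `dest`,
/// flush it to disk, then rename it over `dest` in a single step.
pub fn atomic_install(bytes: &[u8], dest: &Path) -> io::Result<()> {
    let invalid = |msg: &'static str| io::Error::new(io::ErrorKind::InvalidInput, msg);
    let dir = dest.parent().ok_or_else(|| invalid("destination has no parent"))?;
    let name = dest.file_name().ok_or_else(|| invalid("destination has no file name"))?;
    fs::create_dir_all(dir)?;

    // The temp file lives in the same directory so the rename stays on one filesystem.
    let mut tmp_name = name.to_os_string();
    tmp_name.push(".tmp");
    let tmp = dir.join(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?;
    drop(file);

    // Readers see either the old binary or the new one, never a partial write.
    fs::rename(&tmp, dest)
}
```

Calling `atomic_install(&downloaded, Path::new("/home/me/.local/bin/ffs"))` twice in a row leaves only the latest content in place and no `ffs.tmp` behind.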
Supported platforms
- Linux x86_64 / aarch64 (musl-linked, portable across glibc versions)
- macOS x86_64 / aarch64
- Windows x86_64 / aarch64
Quick start
ffs --help
ffs index # one-time warm-up (~200ms on a 10k-file repo)
ffs find UnifiedScanner
ffs find grep --scored # role-aware file search (+20 impl, -15 test)
ffs grep '\bTODO\b' --root crates/
ffs grep 'fn main' --group # symbol-grouped grep output
ffs symbol FilePicker
ffs callers UnifiedScanner
ffs read crates/ffs-engine/src/lib.rs --budget 5000 --filter minimal
ffs map --depth 3
ffs mcp # run as MCP server over stdio
ffs index writes a tree-sitter symbol-index cache to <repo>/.ffs/. Subsequent
ffs symbol/callers/refs/flow/siblings/impact invocations skip the
re-parse and load the cache directly — sub-200 ms on a Linux-kernel-sized repo.
The cache is invalidated automatically on schema bumps, git HEAD changes, or
significant file-count drift. Add .ffs/ to your .gitignore (the
repository's own .gitignore already does this).
Subcommands
ffs find Find files by name. --scored for role-aware ranking (impl +20, test -15)
ffs glob Match files by glob (replaces glob, shell **)
ffs grep Search file contents. --group for symbol-grouped output
ffs read Read a file with token-budget aware truncation (replaces cat)
ffs outline Render a file's structural outline
ffs symbol Look up symbol definitions (tree-sitter powered)
ffs callers List call sites of a symbol
ffs callees List symbols referenced inside a symbol body
ffs refs Definitions + single-hop usages of a symbol in one shot
ffs flow Drill-down envelope per definition (def + body + callees + callers)
ffs siblings Sibling symbols (peers in the same parent scope)
ffs deps File imports + the workspace files that depend on it
ffs impact Rank files by how much they'd be affected if a symbol changed
ffs index Build / refresh the on-disk indexes (Bigram, Bloom, Symbol, Outline)
ffs map Render the workspace as a tree annotated with file count and tokens
ffs overview High-signal summary of the workspace (languages, top symbols, ...)
ffs mcp Run as an MCP server over stdio
ffs guide Print the embedded agent guide
Pass --format json for machine-readable output. Use --root <path> to override
the working directory globally.
MCP server
ffs mcp (or the standalone ffs-mcp binary) speaks JSON-RPC over stdio
and registers 15 tools that any MCP-capable agent (Claude Code, Codex,
OpenCode, Cursor, Cline, …) can call:
Tools registered
| Tool | What it answers |
|---|---|
| `ffs_find` | Fuzzy file-name search. Smart-case, frecency-ranked, glob constraints, git-aware. |
| `ffs_grep` | Content search. Plain / regex / fuzzy auto-detect, pagination cursor, definition-first hinting. |
| `ffs_multi_grep` | OR-logic multi-pattern content search via SIMD Aho-Corasick. |
| `ffs_symbol` | Exact + prefix lookup over the tree-sitter symbol index (16 languages). |
| `ffs_callers` | Find call sites of a symbol. Bloom-filter narrowed candidates → literal-text confirm pass. |
| `ffs_callees` | Symbols referenced inside the body of a definition. |
| `ffs_refs` | Definitions + single-hop usages of a symbol in one shot. Mirrors ffs refs from the CLI. |
| `ffs_flow` | Drill-down envelope per definition: def metadata + body excerpt + top-N callees + top-N callers. |
| `ffs_impact` | Rank workspace files by how much they'd be affected if name changed. |
| `ffs_outline` | Structural outline of a file (functions, classes, top-level decls). Agent-friendly default view. |
| `ffs_siblings` | Peers of a symbol in its parent scope; surfaces the rest of the impl/class around a method. |
| `ffs_deps` | A file's imports plus the workspace files that depend on it. Blast-radius estimate for changes. |
| `ffs_map` | Workspace tree annotated with file count and per-directory token estimate. |
| `ffs_overview` | High-signal repo summary: languages, top-defined symbols, entry-point candidates. |
| `ffs_read` | Token-budget aware file read. Maps maxTokens to ~85% body × 4 bytes/token, applies filters. |
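The `ffs_read` row compresses its budgeting rule into one phrase, so here is the arithmetic spelled out as a minimal sketch. It assumes the rule means "85% of the token budget goes to the body, at roughly 4 bytes per token". The function names are hypothetical, and the real reader also preserves the file header and applies filters, which this sketch does not model.

```rust
/// Assumed reading of "~85% body × 4 bytes/token":
/// 85% of the token budget, converted to bytes at 4 bytes per token.
pub fn body_byte_budget(max_tokens: usize) -> usize {
    max_tokens * 85 / 100 * 4
}

/// Clip `body` to the byte budget at a line boundary and append the
/// footer that tells the agent the output was cut.
pub fn clip_to_budget(body: &str, max_tokens: usize) -> String {
    let budget = body_byte_budget(max_tokens);
    if body.len() <= budget {
        return body.to_string();
    }
    // Never split a multi-byte UTF-8 character.
    let mut end = budget;
    while !body.is_char_boundary(end) {
        end -= 1;
    }
    // Prefer to stop after the last complete line inside the budget.
    let cut = body[..end].rfind('\n').map(|i| i + 1).unwrap_or(end);
    format!("{}[truncated to budget]\n", &body[..cut])
}
```

Under this reading, a budget of 5000 tokens leaves 17,000 bytes for the body (5000 × 0.85 × 4).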
Recommended agent prompt — drop into CLAUDE.md (or equivalent):
For any file search, grep, or symbol lookup in the current git-indexed
directory, use ffs tools.
Why ffs
- Typo-resistant fuzzy matching for both paths and content.
  *.rs !test/ shcema is a valid query; even with a typo in shcema it
  still finds matches.
- Frecency-ranked. Files you open often rank higher next time. Warm-up uses
  git touch history.
- Tree-sitter symbol index across 16 languages: answer code questions, not
  just file-name questions.
- Bigram + Bloom pre-filter stack. ffs callers SomeSymbol typically inspects
  fewer than 30 files on a 10k-file repo.
- Token-budget aware reader for AI agents. ffs read path --budget 5000 clips
  the body but always preserves the file header and a [truncated to budget]
  footer so the agent knows the output was clipped.
- One long-lived process when used via library/MCP. No per-call subprocess
  spawn, no re-reading .gitignore, no rebuilding state. After the first call
  every subsequent search hits warm memory.
On a 500k-file Chromium checkout, that is the difference between 3-9 seconds
per ripgrep spawn and sub-10 ms per ffs query.
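To make "typo-resistant" concrete: the typo in shcema is a swap of two adjacent letters, which an edit distance that counts transpositions scores as a single edit. The sketch below uses the optimal string alignment distance as a stand-in. This README does not say which fuzzy algorithm ffs actually uses, so treat the technique and the function name as assumptions.

```rust
/// Optimal string alignment distance: insertions, deletions, substitutions,
/// and swaps of two adjacent characters each cost 1.
pub fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1)
                .min(d[i - 1][j - 1] + cost);
            // Adjacent transposition, e.g. "hc" versus "ch".
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}
```

With this metric, `osa_distance("shcema", "schema")` is 1, so a matcher that tolerates one edit would still accept the query.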
Architecture
ffs is a single Rust workspace organised as a layered core with multiple
thin frontends. Every surface (CLI, MCP, Neovim, Node, Bun, C ABI) calls
into the same engine — there is no duplicated search logic.
Layered design
┌──────────────────────────────────────────────────────────────────────┐
│ Frontends │
│ ───────── │
│ ffs-cli ffs-mcp ffs-c │
│ (binary) (stdio (C ABI │
│ JSON-RPC) .so/.dylib/.dll) │
└────────────────────────────────┬─────────────────────────────────────┘
│ all surfaces share one core
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Engine layer │
│ ──────────── │
│ ffs-engine unified scanner · dispatch · ranking · memory │
│ ffs-query-parser `*.rs !test/ shcema` → constraints + modes │
└────────────────────────────────┬─────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Capability layer │
│ ──────────────── │
│ ffs-symbol tree-sitter index · bloom · bigram pre-filter │
│ ffs-grep SIMD literal / regex grep │
│ ffs-budget token-aware reader · comment + whitespace filters │
└────────────────────────────────┬─────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Core layer (ffs-core) │
│ ───────────────────── │
│ scan · file_picker · score · bigram_filter · git · frecency │
│ background_watcher · ignore · simd_path · constraints │
└──────────────────────────────────────────────────────────────────────┘
Each layer only depends on the ones below it. Adding a new frontend
(e.g. a Python binding) means wrapping ffs-c; it never reaches into
ffs-core directly.
Crate responsibilities
| Crate | Role |
|---|---|
| `ffs-core` | Filesystem scan, frecency, fuzzy scoring, bigram filter, git integration, watcher. |
| `ffs-query-parser` | Parses the query DSL (globs, negations, regex, fuzzy fallback). |
| `ffs-symbol` | Tree-sitter symbol index, bloom filter, outline cache, on-disk artifact format. |
| `ffs-grep` | SIMD literal & regex content search backend. |
| `ffs-budget` | Token-budget aware file reader and content filters for AI agents. |
| `ffs-engine` | Glue layer: dispatch, ranking, prefilter, in-memory state coordination. |
| `ffs-cli` | The ffs binary, subcommand routing, on-disk cache (.ffs/). |
| `ffs-mcp` | JSON-RPC MCP server exposing 15 tools over stdio. |
| `ffs-c` | Stable C ABI (libffs_c, header in crates/ffs-c/include/ffs.h). |
Query path (e.g. ffs callers UnifiedScanner)
user input on-disk artifacts in <repo>/.ffs/
─────────── ─────────────────────────────────
│ ┌────────────────────────────┐
▼ │ symbol_index.postcard.zst │
┌─────────────┐ │ bigram.postcard.zst │
│ ffs-cli │ │ meta.json (HEAD, schema) │
│ subcommand │ └─────────────┬──────────────┘
│ dispatch │ │ mmap + decode
└──────┬──────┘ ▼
│ ┌──────────────┐
▼ │ ffs-symbol │
┌─────────────┐ parse query DSL │ index + bloom│
│ ffs-query- │ ────────────────────► └──────┬───────┘
│ parser │ │
└──────┬──────┘ │ candidate
│ Mode + Constraints │ file set
▼ │
┌─────────────────────────────────────────┐ │
│ ffs-engine │ ◄───┘
│ classify ▸ prefilter ▸ dispatch │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ symbol bigram grep / scan │
│ lookup filter backends │
└────────────────────┬────────────────────┘
│ ranked hits
▼
┌─────────────────────────────────────────┐
│ ffs-engine::ranking │
│ frecency · fuzzy score · git-touch │
└────────────────────┬────────────────────┘
▼
┌───────────────┐
│ formatter │ text │ json │ MCP tool result
└───────────────┘
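The ranking stage lists frecency as one of its signals. A common way to combine frequency and recency is to let every past access contribute a weight that halves after a fixed interval. The sketch below shows that idea. The half-life parameter and the function name are assumptions, since this README does not give the formula ffs uses.

```rust
/// Hypothetical frecency score: each access contributes 1.0 when fresh and
/// half as much for every `half_life_days` that have passed since.
/// `half_life_days` is assumed to be positive.
pub fn frecency(access_ages_days: &[f64], half_life_days: f64) -> f64 {
    access_ages_days
        .iter()
        .map(|age| 0.5f64.powf(age / half_life_days))
        .sum()
}
```

With a 7-day half-life, a file opened today scores 1.0 and a file opened once a week ago scores 0.5. Three opens in the last few days outrank a single open a month ago.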
Indexing path (ffs index)
walk repo (gitignore-aware, parallel)
────────────────────────────────────────► ffs-core::scan
│
▼
┌──────────────────┐
│ ffs-symbol │
│ tree-sitter │
│ parse · extract │
│ decls + scopes │
└────────┬─────────┘
▼
┌──────────────────┐
│ build artifacts │
│ • bigram │
│ • bloom │
│ • symbol index │
│ • outline cache │
└────────┬─────────┘
▼
write `<repo>/.ffs/*.postcard.zst`
+ meta.json (schema · HEAD · count)
The cache invalidates automatically on schema bumps, git HEAD changes,
or significant file-count drift. Subsequent ffs symbol / callers / refs /
flow / siblings / impact invocations skip the re-parse
and load the cache directly — sub-200 ms on a Linux-kernel-sized repo.
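The three invalidation triggers can be read as one predicate over the values stored in meta.json. The sketch below is an illustration only: the struct fields mirror the (schema · HEAD · count) triple from the diagram, but their names, the `max_drift` threshold, and the exact meaning of "significant" drift are assumptions.

```rust
/// Hypothetical mirror of the values recorded in meta.json.
#[derive(Debug, Clone, PartialEq)]
pub struct CacheMeta {
    pub schema: u32,
    pub head: String,
    pub file_count: usize,
}

/// Returns true when the on-disk cache should be rebuilt.
/// `max_drift` is the tolerated relative change in file count (0.05 = 5%).
pub fn cache_is_stale(cached: &CacheMeta, current: &CacheMeta, max_drift: f64) -> bool {
    // A schema bump or a new git HEAD always invalidates.
    if cached.schema != current.schema || cached.head != current.head {
        return true;
    }
    // Otherwise compare the relative file-count drift against the threshold.
    let base = cached.file_count.max(1) as f64;
    let drift = (current.file_count as f64 - cached.file_count as f64).abs() / base;
    drift > max_drift
}
```

With a 5% threshold, going from 10,000 to 10,400 files keeps the cache, while going to 12,000 files or checking out a different commit triggers a rebuild.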
Background watcher (long-lived processes)
When ffs is embedded as a library (ffs-c, ffs-mcp) it
keeps a single process alive and runs a notify-based background thread
that incrementally updates the in-memory index on filesystem events.
That is why MCP hits stay sub-10 ms after the first call —
no .gitignore re-read, no cold scan, no subprocess spawn.
┌─────────────────┐ fs events ┌──────────────────────┐
│ host process │ ◄──────────────────────│ background_watcher │
│ (mcp / cli / │ │ (notify crate) │
│ external C) │ └──────────┬───────────┘
│ │ │ patch
│ ┌─────────────┐ │ ▼
│ │ in-memory │ ├──── query ────────────► ffs-engine ──► result
│ │ index + │ │
│ │ frecency DB │ │
│ └─────────────┘ │
└─────────────────┘
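The "patch" arrow in the diagram is the key idea: filesystem events mutate the in-memory index in place instead of triggering a rescan. The sketch below models that with a plain set of paths. The `FsEvent` enum and `PathIndex` type are hypothetical stand-ins; the real watcher receives its events from the notify crate and maintains richer state.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified filesystem event.
pub enum FsEvent {
    Created(PathBuf),
    Removed(PathBuf),
    Renamed { from: PathBuf, to: PathBuf },
}

/// Minimal in-memory index: just the set of known paths.
#[derive(Default)]
pub struct PathIndex {
    paths: BTreeSet<PathBuf>,
}

impl PathIndex {
    /// Apply one event incrementally; no directory walk is needed.
    pub fn apply(&mut self, event: FsEvent) {
        match event {
            FsEvent::Created(p) => {
                self.paths.insert(p);
            }
            FsEvent::Removed(p) => {
                self.paths.remove(&p);
            }
            FsEvent::Renamed { from, to } => {
                self.paths.remove(&from);
                self.paths.insert(to);
            }
        }
    }

    pub fn contains(&self, path: &str) -> bool {
        self.paths.contains(Path::new(path))
    }

    pub fn len(&self) -> usize {
        self.paths.len()
    }
}
```

Each event costs one set operation, which is why queries served by a long-lived process never pay for a cold scan after startup.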
Performance & Benchmarks
All numbers below are single-threaded medians measured with
Criterion.rs on a Linux x86-64
machine. Benchmarks run weekly in CI (bench-track.yml); the raw Criterion
HTML reports are published as GitHub Actions artifacts.
Engine dispatch (ffs-engine)
End-to-end latency from raw query string to ranked results over a 256-file
fixture repo (32 dirs × 8 files).
| Query type | Example | Median |
|---|---|---|
| Symbol lookup | `worker_05_3` | 202 ns |
| Concept / NL | `how does the worker handle payloads` | 205 ns |
| File path | `mod_03/file_2.rs` | 2.2 µs |
| Glob | `**/*.rs` | 2.8 µs |
Query classification alone: 107–1,640 ns depending on pattern complexity.
Scoring a single result (score_one): ~3 ns.
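Query classification is what routes each row of the table to a different backend. A rough sketch of such a classifier, built only from the four example queries above, could look like this. The rules and the `QueryKind` enum are assumptions made for illustration and are certainly simpler than the engine's real classifier.

```rust
#[derive(Debug, PartialEq)]
pub enum QueryKind {
    Glob,
    Concept,
    FilePath,
    Symbol,
}

/// Hypothetical heuristic, checked in order:
/// a wildcard means glob, whitespace means natural language,
/// a slash means file path, anything else is treated as a symbol.
pub fn classify(query: &str) -> QueryKind {
    let q = query.trim();
    if q.contains('*') {
        QueryKind::Glob
    } else if q.contains(char::is_whitespace) {
        QueryKind::Concept
    } else if q.contains('/') {
        QueryKind::FilePath
    } else {
        QueryKind::Symbol
    }
}
```

The order matters: `**/*.rs` contains a slash but must be caught by the glob rule first.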
Bigram index (ffs-core)
Query latency across different index sizes:
| Index size | 2-char query | 6-char query | 14-char query |
|---|---|---|---|
| 10 K files | 46 ns | 120 ns | 314 ns |
| 100 K files | 368 ns | 410 ns | 443 ns |
| 500 K files | 761 ns | 1.6 µs | 1.6 µs |
Index build (100 K files): 86 ms (short content) / 2.5 s (4 KB/file).
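For readers unfamiliar with bigram indexes, the pre-filter idea is: a file can only contain the needle if it contains every two-byte window of the needle. The sketch below shows that test with a per-file set of bigrams. The types are hypothetical and the layout is far less compact than a production index, but the filtering logic is the same in spirit.

```rust
use std::collections::HashSet;

/// All lowercase two-byte windows of `s`.
fn bigrams(s: &str) -> HashSet<[u8; 2]> {
    s.as_bytes()
        .windows(2)
        .map(|w| [w[0].to_ascii_lowercase(), w[1].to_ascii_lowercase()])
        .collect()
}

/// Hypothetical index: one bigram set per file.
pub struct BigramIndex {
    files: Vec<(String, HashSet<[u8; 2]>)>,
}

impl BigramIndex {
    pub fn build(files: &[(&str, &str)]) -> Self {
        let files = files
            .iter()
            .map(|(name, content)| (name.to_string(), bigrams(content)))
            .collect();
        BigramIndex { files }
    }

    /// Files that might contain `needle`. This is a superset of the true
    /// matches, so a real search still has to confirm each candidate.
    pub fn candidates(&self, needle: &str) -> Vec<&str> {
        let need = bigrams(needle);
        self.files
            .iter()
            .filter(|(_, have)| need.is_subset(have))
            .map(|(name, _)| name.as_str())
            .collect()
    }
}
```

A one-character needle has no bigrams, so every file stays a candidate; longer needles prune more aggressively.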
Case-insensitive memmem (ffs-core, AVX2 packed-pair)
Searching real source files from this repo:
| Haystack | Needle | packed_pair | memchr2 baseline | Speedup |
|---|---|---|---|---|
| grep.rs (80 KB) | `"fn"` (hit) | 44 ns | 204 ns | 4.6× |
| grep.rs (80 KB) | `"search_file"` (hit) | 1.6 µs | 13.6 µs | 8.6× |
| combined (1 MB) | `"content_cache_budget"` (hit) | 37 µs | 400 µs | 10.8× |
Query parser (ffs-query-parser)
| Query | Median |
|---|---|
| `*.rs` (extension) | 60 ns |
| `struct` (simple text) | 234 ns |
| `src name *.rs !test /lib/ status:modified` (complex) | 523 ns |
| 26-token worst case | 3.7 µs |
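As an illustration of what the parser produces, here is a minimal sketch that splits the example query `*.rs !test/ shcema` into constraints. It covers only three token shapes (extension glob, negation, free text). The struct and its field names are hypothetical, and the real DSL also handles path segments, `status:` filters, and regex modes.

```rust
/// Hypothetical parse result for a small subset of the query DSL.
#[derive(Debug, Default, PartialEq)]
pub struct ParsedQuery {
    pub extensions: Vec<String>,
    pub excluded: Vec<String>,
    pub text: Vec<String>,
}

pub fn parse_query(query: &str) -> ParsedQuery {
    let mut out = ParsedQuery::default();
    for tok in query.split_whitespace() {
        if let Some(ext) = tok.strip_prefix("*.") {
            // "*.rs" restricts results to one extension.
            out.extensions.push(ext.to_string());
        } else if let Some(neg) = tok.strip_prefix('!').filter(|n| !n.is_empty()) {
            // "!test/" excludes matching paths.
            out.excluded.push(neg.to_string());
        } else {
            // Everything else is text to be matched fuzzily.
            out.text.push(tok.to_string());
        }
    }
    out
}
```

For the example query this yields the extension `rs`, the exclusion `test/`, and the fuzzy text `shcema`.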
Symbol index (ffs-symbol)
| Operation | Median |
|---|---|
| Index one Rust file (50 lines) | 219 µs |
| Exact symbol lookup (1 K files indexed) | 95 µs |
| Prefix lookup (`"Wor"`) | 115 µs |
| Bloom insert (8 K identifiers) | 35 µs |
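The "Bloom insert" row refers to the per-file identifier filter that narrows candidates before a confirm pass. A minimal Bloom filter built on the standard library hasher is sketched below. The sizing, the double-hashing scheme, and the type name are assumptions; the ffs-symbol implementation may differ in all three.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Two independent-looking hashes of `item`; the second is forced odd.
fn hash_pair(item: &str) -> (u64, u64) {
    let mut h1 = DefaultHasher::new();
    item.hash(&mut h1);
    let mut h2 = DefaultHasher::new();
    (item, 0x9e37_79b9u32).hash(&mut h2);
    (h1.finish(), h2.finish() | 1)
}

/// Bit positions for `item`, derived by double hashing.
fn positions(item: &str, num_bits: u64, num_hashes: u64) -> impl Iterator<Item = u64> {
    let (a, b) = hash_pair(item);
    (0..num_hashes).map(move |i| a.wrapping_add(i.wrapping_mul(b)) % num_bits)
}

/// Hypothetical minimal Bloom filter over identifiers.
pub struct Bloom {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u64,
}

impl Bloom {
    pub fn new(num_bits: usize, num_hashes: u32) -> Self {
        let words = ((num_bits + 63) / 64).max(1);
        Bloom {
            bits: vec![0; words],
            num_bits: (words * 64) as u64,
            num_hashes: u64::from(num_hashes.max(1)),
        }
    }

    pub fn insert(&mut self, item: &str) {
        for pos in positions(item, self.num_bits, self.num_hashes) {
            self.bits[(pos / 64) as usize] |= 1u64 << (pos % 64);
        }
    }

    /// False means "definitely absent"; true means "possibly present".
    pub fn might_contain(&self, item: &str) -> bool {
        positions(item, self.num_bits, self.num_hashes)
            .all(|pos| self.bits[(pos / 64) as usize] & (1u64 << (pos % 64)) != 0)
    }
}
```

Because the filter never produces false negatives, a `false` answer lets a callers query skip a file outright, and only the "possibly present" files need the literal-text confirm pass.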
Token budget (ffs-budget)
| Operation | Size | Median |
|---|---|---|
| Smart truncate | 128 lines | 5.6 µs |
| Smart truncate | 8 192 lines | 374 µs |
| Comment filter (none) | 64 KB | 1.4 µs |
| Comment filter (aggressive) | 64 KB | 329 µs |
| Comment filter (aggressive) | 512 KB | 2.4 ms |
# requires Zig for zlob feature
cargo bench --features zlob \
-p ffs-search -p ffs-query-parser \
-p ffs-symbol -p ffs-budget -p ffs-engine
# HTML reports land in target/criterion/
C ABI
The C library provides a stable ABI for binding from other languages:
make install # installs libffs_c and ffs.h to /usr/local
See crates/ffs-c/include/ffs.h for the API surface.
Build from source
git clone https://github.com/quangdang46/fast_file_search.git
cd fast_file_search
cargo build --release -p ffs-cli --features zlob
./target/release/ffs --version
zlob enables a Zig-compiled glob matcher; requires Zig at build time.
Without it, ffs falls back to globset (pure Rust). Drop --features zlob
if you don't have Zig installed.
The full workspace (make build) also produces:
- target/release/ffs-mcp: MCP server binary
- target/release/libffs_c.{so,dylib,dll}: C FFI library
Repository layout
crates/
ffs-core/ # Rust core SDK
ffs-cli/ # the `ffs` binary
ffs-mcp/ # MCP server (`ffs-mcp` binary)
ffs-c/ # C FFI library (libffs_c, header in include/ffs.h)
ffs-engine/ # unified scanner + dispatch + ranking
ffs-symbol/ # tree-sitter symbol index, bloom + bigram filters
ffs-budget/ # token-budget reader, comment/whitespace filters
ffs-grep/ # SIMD literal/regex grep
ffs-query-parser/ # query language parser (constraints, fuzzy, regex modes)
install.sh # CLI installer (this README's curl|bash target)
install-mcp.sh # MCP server installer
.github/workflows/
release.yaml # cross-compile + GitHub Releases on push to main and v* tags
rust.yml # fmt + clippy + test on every PR
…
Rust library API
FFS crates can be used directly as Rust dependencies:
[dependencies]
ffs-engine = { git = "https://github.com/quangdang46/fast_file_search" }
ffs-search = { git = "https://github.com/quangdang46/fast_file_search" }
ffs-symbol = { git = "https://github.com/quangdang46/fast_file_search" }
High-level API (ffs_engine::api)
use std::path::Path;
use ffs_engine::api::{grep, find, outline, GrepOptions, FindOptions};
// Workspace root to search (assumed here to be passed as a &Path)
let root = Path::new(".");
// Grep with symbol grouping
let result = grep(root, "fn main", &GrepOptions::default());
// result.files[0].groups[0].matches: grouped by enclosing symbol
// Find with role-based scoring
let result = find(root, "auth", &FindOptions::default());
// result.files[0].score, .role, .score_breakdown
// File outline
let result = outline(Path::new("src/main.rs"));
// result.entries: tree-sitter OutlineEntry[]
Role detection (ffs_search::role)
use std::path::Path;
use ffs_search::role::detect_role;
let role = detect_role(Path::new("src/tests/mod.rs"));
assert_eq!(role.as_str(), "test"); // auto-penalized: -15
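To show how the +20 / -15 role adjustments might feed into ranking, here is a standalone sketch that does not depend on the ffs crates. The path heuristics, the list of source extensions, and the function name are all assumptions made for illustration; only the two score values come from this README.

```rust
use std::path::Path;

/// Hypothetical role adjustment: test files lose 15 points,
/// implementation source files gain 20, everything else is unchanged.
pub fn role_adjustment(path: &str) -> i32 {
    let lower = path.to_ascii_lowercase();
    let in_test_dir = lower.split('/').any(|seg| seg == "test" || seg == "tests");
    let is_test_file = lower.ends_with("_test.rs") || lower.ends_with("_test.go");
    if in_test_dir || is_test_file {
        return -15;
    }
    let ext = Path::new(&lower).extension().and_then(|e| e.to_str());
    if matches!(ext, Some("rs" | "ts" | "py" | "go" | "c")) {
        20
    } else {
        0
    }
}

/// Combine a base fuzzy score with the role adjustment.
pub fn scored(base: i32, path: &str) -> i32 {
    base + role_adjustment(path)
}
```

With equal base scores, `src/auth.rs` ends up 35 points ahead of `src/tests/auth.rs`, which is the effect the `--scored` flag describes.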
Contributing
PRs welcome. Run make check before submitting:
- make format (rustfmt)
- make lint (clippy)
- make test

Agentic coding tools are welcome; human review is mandatory.
License
MIT — open source forever.