quant
Health: Warning
- License — MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code: Fail
- rm -rf — Recursive force deletion command in scripts/coverage.sh
- rm -rf — Recursive force deletion command in scripts/install.sh
Permissions: Pass
- Permissions — No dangerous permissions requested
This is a local-first Retrieval-Augmented Generation (RAG) index that monitors your local filesystem, extracts and embeds file contents via Ollama or a compatible API, stores the data in a local SQLite database, and serves semantic search capabilities as an MCP server. It is primarily designed to provide coding agents with project-scoped context.
Security Assessment
The tool inherently accesses sensitive data by design—it watches, reads, and indexes your local files to build its database. It requires external network access to communicate with the Ollama instance or an OpenAI-compatible embedding API specified via the `--embed-url` flag. No hardcoded secrets were detected, and it does not request explicitly dangerous permissions. However, the scan flagged two FAIL results for recursive force deletions (`rm -rf`) inside helper scripts (`scripts/coverage.sh` and `scripts/install.sh`). While common in build and installation scripts for cleaning up directories, `rm -rf` commands always carry a risk of unintended data deletion if a variable is ever empty or malformed. Overall risk is rated as Medium due to the broad local filesystem read access and active network requests combined with the flagged script commands.
Quality Assessment
The project is licensed under the permissive MIT license and was actively updated recently (last push was today). However, it currently has very low community visibility, sitting at only 5 GitHub stars. Because of this low adoption, the codebase has not been extensively battle-tested or broadly reviewed by the open-source community. Caution is advised regarding its production readiness.
Verdict
Use with caution — the tool is actively maintained and serves a useful purpose, but its low community adoption, inherent broad filesystem read access, external API dependencies, and flagged shell cleanup commands warrant a thorough manual review of the installation scripts before deployment.
🧠 Local-first RAG index that watches your files and serves MCP semantic search
quant
A lightweight, developer-focused RAG index exposed as an MCP server. Point it at a folder and it watches the filesystem, extracts supported files, chunks them with structure awareness, embeds them via Ollama, stores them in SQLite, and serves semantic search over MCP.
The index is a projection of the filesystem. Files added, changed, or removed on disk are reflected in the index automatically.
In practice, quant is usually most useful as a project-scoped MCP - one server per repository, documentation set, or research workspace. See docs/mcp-clients.md for recommended deployment patterns and client configs.
Zero CGO. Pure Go.
Runtime requirements
- A quant binary for your platform, either downloaded from GitHub Releases or built from source
- A coding agent or other MCP-capable client of your choice, such as Claude, Codex, OpenCode, or GitHub Copilot
- Ollama installed locally, or an OpenAI-compatible embedding API at `--embed-url`

quant handles Ollama setup automatically on first run:

- If Ollama is installed but not running, quant starts it in the background (`ollama serve`)
- If the configured embedding model isn't pulled yet, quant pulls it automatically (`ollama pull <model>`)
- If the embedding backend is still unavailable after recovery attempts, quant starts in keyword-only mode so the MCP server remains usable

To set up manually instead:

```sh
ollama serve                  # start Ollama
ollama pull nomic-embed-text  # pull the default embedding model
```

Optional for scanned PDFs: ocrmypdf installed on your system PATH. If present, quant will automatically use it as a best-effort OCR sidecar for PDFs that contain no extractable text.
Install
macOS and Linux
The quickest install path on macOS and Linux is the release installer:
```sh
curl -fsSL https://raw.githubusercontent.com/koltyakov/quant/main/scripts/install.sh | sh
```
It installs quant to ~/.local/bin. The installer checks whether ollama is on PATH; if it is missing, it asks whether to install Ollama with the official shell installer and prints manual setup guidance if skipped.
Windows
On Windows, use the PowerShell installer:
```powershell
irm https://raw.githubusercontent.com/koltyakov/quant/main/scripts/install.ps1 | iex
```
It installs quant.exe to %LOCALAPPDATA%\Programs\quant and adds it to your user PATH. The installer also checks for Ollama and offers to install it via winget.
Alternative: Go install
If you already have Go installed, you can also install from source:
```sh
go install github.com/koltyakov/quant/cmd/quant@latest
```
After installing:
```sh
quant version
```
Build from source
You only need Go if you are building quant yourself instead of using a release binary.
- Go 1.26.0+
```sh
make install
```
Usage
```sh
quant mcp [--dir <path>] [options]
quant init [client] [options]
quant launch <client> [--dir <path>] [-- <client args...>]
quant update
quant version
```
Commands:
| Command | Description |
|---|---|
| `mcp` | Start the MCP server |
| `init` | Scaffold a project MCP config and research assistant instructions |
| `launch` | Start a supported agent with quant MCP injected for this session |
| `update` | Check for and apply the latest GitHub release |
| `version` | Print the quant version and exit |
| `help` | Show top-level CLI help |
Core MCP flags:
| Flag | Default | Description |
|---|---|---|
| `--dir` | current working directory | Directory to watch and index |
For the full flag reference, environment variables, YAML config, include/exclude patterns, and auto-update settings see docs/configuration.md.
Quick examples
```sh
# Create a research workspace for Codex
quant init codex --dir ./my-research-project

# Launch Codex with quant MCP over ./data
quant launch codex

# Index a folder over stdio
quant mcp --dir ./my-project

# Update to the latest release
quant update
```
For clients with narrow MCP permission controls, quant init and quant launch also pre-approve all quant MCP tools so they run without prompting.
MCP Tools
| Tool | Description |
|---|---|
| `search` | Semantic search over indexed chunks. Params: `query` (required), `limit`, `threshold`, `path`, `file_type`, `language` |
| `list_sources` | List indexed documents. Params: `limit` |
| `index_status` | Stats: total docs, chunks, DB size, watch dir, model, embedding status, lifecycle state |
| `find_similar` | Find chunks similar to a given chunk by its ID. Params: `chunk_id` (required), `limit` |
| `drill_down` | Explore a topic by finding diverse chunks related to a seed chunk from a previous search. Params: `chunk_id` (required), `limit` |
| `summarize_matches` | Summarize all matching documents for a query, returning an overview of what the index contains on a topic. Params: `query` (required), `limit` |
| `list_collections` | List all named collections with their document and chunk counts |
| `delete_collection` | Delete all documents and chunks in a named collection. Params: `collection` (required) |
search embeds the query with the configured embedding model, uses SQLite FTS5 to prefilter candidate chunks, then reranks those candidates with normalized vector similarity. All results use Reciprocal Rank Fusion (RRF) scoring on a common 0-1 scale. If the embedding backend is unavailable, search falls back to keyword-only results automatically. The embedding_status field in the response indicates whether results are hybrid or keyword-only.
find_similar takes a chunk ID from a previous search result and returns the nearest neighbors from the HNSW index. Useful for discovering related content without formulating a new query.
drill_down is like find_similar but prioritizes diversity across documents — it spreads results across different source files to help explore a topic broadly rather than staying within one file.
summarize_matches runs a search and returns a high-level overview of which documents matched and what they contain, without returning individual chunks. Useful when you want a quick map of what the index knows about a subject.
All MCP tools return structured payloads for clients that support structuredContent, while still including a readable text fallback. Tool concurrency is bounded by --max-concurrent-tools (default 4).
Supported File Types
quant indexes common plain-text inputs by default, including source code, markup, config, data, and filename-only project files such as Dockerfile, Makefile, and similar repo metadata.
For document-style content, current support includes:
- Jupyter notebooks, with cell markers and captured text outputs
- PDF, with page markers like `[Page N]`
- Scanned PDF OCR via optional ocrmypdf fallback when a PDF has no embedded text
- Rich text via RTF
- Modern Office/Open XML word-processing, presentation, and spreadsheet files
- OpenDocument text, spreadsheet, and presentation files
See docs/file-types.md for the full list of recognized extensions and special filenames.
Unsupported or binary files are skipped.
Architecture
```mermaid
flowchart TD
    WD([Watched directory]) --> INDEX[Initial scan and watch updates]
    INDEX --> PROC[Extract, chunk, and embed]
    PROC --> OLLAMA[/Ollama API/]
    PROC --> DB[(SQLite index)]
    CLIENT([MCP client]) --> MCP[MCP server]
    MCP --> QUERY[Embed query]
    QUERY --> OLLAMA
    MCP --> SEARCH[Hybrid search]
    DB --> SEARCH
    SEARCH --> MCP
```
- No CGO - uses `modernc.org/sqlite` (pure Go SQLite)
- Hybrid retrieval - SQLite FTS5 prefilter + normalized vector rerank via RRF
- Adaptive query weighting - identifier-like queries (camelCase, short tokens) upweight keyword signals; longer natural-language queries upweight vector signals. Weights are selected automatically per query.
- HNSW approximate nearest neighbors - in-memory HNSW graph (M=16, EfSearch=100) built from stored embeddings after initial sync; incremental add/delete during live indexing
- Int8 quantized embeddings - embeddings are L2-normalized and quantized to 1 byte/dimension (~4x storage savings, <1% recall loss)
- Bounded-memory rerank - top-k heap keeps vector reranking memory stable as candidate sets grow
- Lifecycle-aware readiness - startup indexing state (`starting` -> `indexing` -> `ready`/`degraded`) is surfaced through readiness checks and `index_status`
- SQLite tuned for concurrency - WAL + busy timeout + multi-connection pool allow reads during writes
- Transactional indexing - chunk replacement happens in a single SQLite transaction per document, with incremental HNSW updates deferred until after commit
- Incremental reindexing - unchanged chunks reuse their stored embeddings, so only new or modified content is sent to the embedding backend
- File watching via `fsnotify` with 500ms debounce and self-healing resync on overflow
- Embedding caching - LRU cache with in-flight deduplication and circuit breaker for query-time embedding calls
See docs/architecture.md for the internal package layout.
Further reading
- Configuration reference - all flags, environment variables, YAML config, include/exclude patterns, auto-update
- MCP client integration - Claude Code, GitHub Copilot, Codex, OpenCode, Cursor, Gemini
- Embedding models - model choice, quantization, and hardware guidance
- Search and ranking - hybrid search pipeline, RRF fusion, and signal weighting
- Supported file types - extensions, special filenames, and document extractors
- Architecture - internal package layout and data flow
- Troubleshooting - common issues and fixes
Contributing
Fork, branch, add tests, submit a pull request.
License
MIT - see LICENSE.