NeuronCite
Local, privacy-preserving semantic document search engine with citation verification
Rust / Single Binary / Local-First / MCP-Native / Ollama-Powered
Your Documents. Your Machine. Your Answers.
NeuronCite is an enterprise-grade semantic search engine that transforms your
document library into an instantly searchable knowledge base -- and autonomously
verifies every citation in your academic papers using local LLMs. No cloud. No
API keys. Your data stays on your machine.
Written in Rust, it ships as a single binary (CPU) or minimal bundle (GPU) for
Windows, macOS, and Linux. After a one-time download of models and runtime
dependencies, the system operates fully offline, including in air-gapped and
classified environments. A built-in MCP server with 43 tools lets AI assistants
like Claude control the full workflow -- indexing, searching, verifying
citations, annotating PDFs -- directly from the chat interface.
Why NeuronCite
Privacy by Design -- All document processing runs locally. No user data
leaves the machine, no telemetry, no cloud accounts. No internet connection
required after initial setup (embedding models, pdfium, and optional
OCR/Ollama binaries). Network-dependent features (DOI resolution, HTML crawling,
citation source fetching) are unavailable in air-gapped mode. Supports
air-gapped and classified environments once all dependencies are provisioned.
Autonomous Citation Verification -- Feed a LaTeX paper and its bibliography.
NeuronCite indexes the cited PDFs, runs a local LLM via Ollama, and verifies
every \cite{} command against actual source material. Each citation receives a
verdict, confidence score, and correction suggestions.
Enterprise-Grade Architecture -- 16 Rust crates with clear separation of
concerns. 43 MCP tools, 34 REST API endpoints, a browser-based GUI with 7
tabs, a Python client library, and 11 CLI commands. CPU-only builds compile into
a single executable that runs without Docker, Kubernetes, or external infrastructure.
Citation verification additionally requires a running Ollama instance.
Quick Start
1. Download the binary for your platform from the
Releases page.
2. Double-click the binary. The application opens in a native window
(WebView2 on Windows, WebKit on macOS/Linux) -- no terminal, no configuration
required. Linux GUI builds require libwebkit2gtk-4.1-0 at runtime (see
Installation for details).
3. Select a directory of PDFs in the Indexing tab, choose an embedding
model, and click Start. The first run downloads the selected model
(50 MB--1 GB depending on model size).
Terminal alternative: Run neuroncite from the command line, or index
directly with neuroncite index --directory ./papers.
Docker alternative:
docker run --gpus all -p 3030:3030 \
-v neuroncite-data:/data/Documents/NeuronCite \
ghcr.io/ff-tec/neuroncite:latest
Features
Hybrid Search
Combines dense-vector and keyword retrieval, merged by rank fusion, with
optional refinement stages for high-precision document search:
- HNSW vector similarity -- Approximate nearest neighbor search over dense
embeddings for semantic matching
- BM25 keyword matching -- SQLite FTS5 full-text index for exact term
matching
- Reciprocal Rank Fusion -- Merges and normalizes scores from both retrieval
methods
- Cross-encoder reranking -- Optional second-stage scoring with a
cross-encoder model for precision-critical queries
- Sub-chunk refinement -- Configurable divisors for finer-grained passage
retrieval within chunks
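To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It illustrates the general technique and is not NeuronCite's implementation. The function name, the chunk ids, and the constant k=60 (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant,
    not a documented NeuronCite setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["chunk_7", "chunk_2", "chunk_9"]   # hypothetical HNSW results
keyword_hits = ["chunk_2", "chunk_4", "chunk_7"]  # hypothetical BM25 results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print([doc_id for doc_id, _ in fused])
```

Chunks that rank well in both lists rise to the top, while a chunk found by only one method still stays in the fused ranking.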
Document Indexing
- PDF extraction with three backends: pdf-extract (pure Rust, default),
pdfium (multi-column layout support), and Tesseract OCR (image-heavy pages)
- HTML crawling with BFS traversal, depth limiting, rate limiting, domain
filtering, regex URL patterns, and sitemap parsing
- Four chunking strategies: page-based, word-window (fixed word count with
overlap), token-window (subword tokens), and sentence-based (respects citation
boundaries)
- Eight embedding models from 33M to 335M parameters (384 to 4096
dimensions), downloaded from HuggingFace on first use
- GPU acceleration via ONNX Runtime with CUDA 12.4, DirectML, CoreML, and
ROCm execution providers; CPU fallback on all platforms
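The crawling behaviour in the list above (BFS traversal, depth limiting, domain filtering) can be sketched roughly as follows. This is an illustrative sketch over an in-memory link graph, not NeuronCite's crawler: bfs_crawl, get_links, and the LINKS data are hypothetical, and rate limiting, regex URL patterns, and sitemap parsing are left out.

```python
from collections import deque
from urllib.parse import urlparse


def bfs_crawl(start_url, get_links, max_depth=2, allowed_domains=None):
    """Visit pages breadth-first up to max_depth, staying on allowed domains.

    get_links is a caller-supplied function (hypothetical here) returning the
    outgoing URLs of a page; a real crawler would fetch and parse HTML.
    """
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not expand this page
        for link in get_links(url):
            domain = urlparse(link).netloc
            if allowed_domains is not None and domain not in allowed_domains:
                continue  # domain filter
            if link not in seen:
                seen.add(link)
                visited.append(link)
                queue.append((link, depth + 1))
    return visited


# Tiny in-memory link graph standing in for real pages.
LINKS = {
    "https://example.org/": ["https://example.org/a", "https://other.test/x"],
    "https://example.org/a": ["https://example.org/b"],
    "https://example.org/b": ["https://example.org/c"],
}
order = bfs_crawl("https://example.org/", lambda u: LINKS.get(u, []),
                  max_depth=2, allowed_domains={"example.org"})
print(order)
```

The page at depth 2 is recorded but not expanded, and the off-domain link is never queued.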
Citation Verification
NeuronCite parses LaTeX papers for citation commands (\cite, \citep, \citet,
\autocite, and variants), resolves them against BibTeX entries, and
verifies each claim against indexed source PDFs using a local LLM via Ollama.
Five-stage verification pipeline:
- Parse LaTeX, extract citation commands, resolve cite-keys against BibTeX
- Match bibliographic entries to PDFs via Jaro-Winkler similarity (threshold
0.80)
- Group citations into batches for parallel processing, preserving co-citations
and section context
- Verify each batch through the LLM with a minimum of 2 search queries per
citation, including cross-corpus search for alternative sources
- Export 6 output files: annotation CSV, citation CSV/XLSX, corrections JSON,
citation report JSON, full detail JSON, and annotated PDFs
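The matching stage relies on Jaro-Winkler similarity with a 0.80 threshold. Below is a minimal, self-contained sketch of that metric and of picking the best candidate. It shows the standard algorithm and is not NeuronCite's code: the helper names, the lowercasing step, the prefix scale of 0.1 with a 4-character prefix cap, and the choice of ">=" at the threshold are assumptions.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def best_match(title, candidates, threshold=0.80):
    """Return (candidate, score) for the closest candidate at or above threshold, else None."""
    scored = [(c, jaro_winkler(title.lower(), c.lower())) for c in candidates]
    if not scored:
        return None
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

In practice, title would come from a BibTeX entry and candidates from PDF titles or filenames. Which fields NeuronCite actually compares is not stated here.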
Seven verdict types:
| Verdict | Meaning |
|---|---|
| supported | Claim explicitly found in cited source with page numbers and passages |
| partial | Source supports part of the claim; other parts absent or unsupported |
| unsupported | PDF found and read but no passage supports the claim |
| not_found | Source PDF not in indexed corpus |
| wrong_source | Claim verifiable but evidence found in a different source |
| unverifiable | Future projections, subjective statements, or insufficiently specific claims |
| peripheral_match | Text found only in non-substantive sections (TOC, bibliography, appendix) |
Each verdict includes a confidence score (0.00--1.00), critical/warning flags
for contradictions and temporal mismatches, and correction suggestions with
types (rephrase, add context, replace citation).
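For readers who post-process verification output, the verdicts and confidence range above could be modelled along these lines. This is a hypothetical sketch: the verdict strings come from the table, but the class names, field names, and the 0.7 review cutoff are illustrative assumptions and not the schema of NeuronCite's export files.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    SUPPORTED = "supported"
    PARTIAL = "partial"
    UNSUPPORTED = "unsupported"
    NOT_FOUND = "not_found"
    WRONG_SOURCE = "wrong_source"
    UNVERIFIABLE = "unverifiable"
    PERIPHERAL_MATCH = "peripheral_match"


@dataclass
class CitationResult:
    # Field names are hypothetical, chosen for this example only.
    cite_key: str
    verdict: Verdict
    confidence: float

    def __post_init__(self):
        self.verdict = Verdict(self.verdict)  # rejects unknown verdict strings
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.00 and 1.00")


def needs_review(result, min_confidence=0.7):
    """Flag results a human should check; the 0.7 cutoff is illustrative only."""
    return result.verdict is not Verdict.SUPPORTED or result.confidence < min_confidence


example = CitationResult("sharpe1964", "supported", 0.93)
print(example.verdict.value, needs_review(example))
```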
PDF Annotation
Highlights text passages in PDFs with color-coded annotations based on
verification verdicts. Uses a 5-stage text matching pipeline:
- Exact byte-level match
- Normalized match with whitespace collapsing
- Fuzzy character-level match (string distance)
- Fallback extraction via multi-backend PDF text pipeline
- OCR fallback for scanned pages
Accepts annotation input as CSV or JSON. Supports comment annotations with
popup text alongside highlights.
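The first three matching stages (exact, whitespace-normalized, fuzzy) can be illustrated with a short cascade over plain text. This is a simplified sketch and not the annotation code itself: locate_quote, the sliding-window approach, difflib as the distance measure, and the 0.9 threshold are all assumptions, and the extraction and OCR fallbacks are omitted.

```python
import difflib
import re


def normalize(text):
    """Collapse runs of whitespace to single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def locate_quote(quote, page_text, fuzzy_threshold=0.9):
    """Return (stage, matched_text) for the first stage that finds the quote, else None."""
    # Stage 1: exact match.
    if quote in page_text:
        return ("exact", quote)
    # Stage 2: match after whitespace normalization on both sides.
    norm_quote, norm_page = normalize(quote), normalize(page_text)
    if norm_quote in norm_page:
        return ("normalized", norm_quote)
    # Stage 3: fuzzy match over sliding windows of the quote's length.
    best_ratio, best_window = 0.0, None
    width = len(norm_quote)
    for start in range(0, max(len(norm_page) - width, 0) + 1):
        window = norm_page[start:start + width]
        ratio = difflib.SequenceMatcher(None, norm_quote, window).ratio()
        if ratio > best_ratio:
            best_ratio, best_window = ratio, window
    if best_ratio >= fuzzy_threshold:
        return ("fuzzy", best_window)
    return None


page = "Risk premia vary\nover  time in equity markets."
print(locate_quote("in equity markets", page))
print(locate_quote("vary over time", page))
print(locate_quote("Risk premla vary over time", page))
```

Each stage runs only when the cheaper one before it fails, so exact quotes are matched without any fuzzy scoring.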
Web Frontend (7 Tabs)
The SolidJS single-page application is compiled into the binary via rust-embed
and served at http://localhost:3030.
| Tab | Function |
|---|---|
| Sources | BibTeX management, web crawling (BFS with regex patterns, sitemap parsing), DOI resolution (Unpaywall, Semantic Scholar, OpenAlex), metadata extraction |
| Indexing | Directory selection, embedding model and chunking strategy configuration, real-time progress tracking, session management |
| Search | Multi-session hybrid search with vector/BM25/reranking toggles, sub-chunk refinement, grouped and flat result views, export as Markdown/BibTeX/CSL-JSON/RIS |
| Citations | LaTeX file selection with auto-detection of .bib files, Ollama model selection with connection test, verification mode presets (quick, balanced, thorough), live results with expandable verdicts |
| Annotations | CSV/JSON annotation input, source PDF directory selection, 5-stage text location pipeline, color configuration, per-quote progress tracking |
| Models | Embedding model catalog with download/activate controls, cross-encoder reranker management, Ollama LLM catalog, GPU/CUDA system info, model diagnostics |
| Settings | FTS5 index optimization, HNSW vector index rebuild, database reset, dependency detection (pdfium, Tesseract, Poppler), MCP server registration, real-time log streaming via SSE |
MCP Server (43 Tools)
The Model Context Protocol server exposes 43 tools for AI agent integration via
JSON-RPC 2.0 over stdio, organized in 8 categories:
| Category | Tools | Purpose |
|---|---|---|
| Session Management | 5 | List, delete, update, diff, discover index sessions |
| Indexing | 4 | Index directories, add files, reindex, preview chunks |
| Search & Retrieval | 8 | Search, batch search, multi-session search, compare, text search, content retrieval, export |
| File & Chunk Inspection | 4 | List files, inspect chunks, compare files, quality reports |
| Citation Verification | 8 | Create jobs, add claims, submit, check status, fetch rows, export, retry, fetch sources |
| Annotation | 4 | Annotate PDFs, check status, inspect annotations, remove annotations |
| Web Sources | 3 | Fetch HTML, crawl websites, check crawl status |
| System | 3 | List models, health check, log streaming |
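As a rough illustration of the wire format, the sketch below builds two JSON-RPC 2.0 requests as a client might write them to the server's stdin, one JSON object per line. The tools/list and tools/call methods are defined by the Model Context Protocol. The tool name and arguments are hypothetical placeholders, not documented NeuronCite tool names, and a real client must complete the protocol's initialize handshake first.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


# Ask the server which tools it offers.
list_tools = jsonrpc_request(1, "tools/list")

# Call one tool. The name and arguments are hypothetical placeholders.
call_tool = jsonrpc_request(2, "tools/call", {
    "name": "example_search_tool",
    "arguments": {"query": "capital asset pricing model"},
})
print(list_tools, call_tool, sep="", end="")
```

MCP-aware assistants such as Claude generate these messages themselves, so writing them by hand is mainly useful for debugging.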
REST API (34 Endpoints)
Axum-based HTTP server with OpenAPI specification (utoipa), bearer token
authentication with constant-time comparison, per-IP rate limiting, CORS, and
Server-Sent Events for real-time progress streaming. Endpoints cover all
functionality: sessions, indexing, search, citation, annotation, models, and
system health.
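Constant-time comparison matters because a naive string comparison can return sooner when the first characters differ, which leaks information through response timing. The sketch below shows the idea in Python using the standard library. The server itself is written in Rust, so this illustrates the principle rather than its code, and is_authorized is a hypothetical helper.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Check an 'Authorization: Bearer <token>' header value in constant time."""
    scheme, _, presented = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # hmac.compare_digest takes time independent of where the inputs differ,
    # which avoids leaking a matching token prefix through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


print(is_authorized("Bearer s3cret", "s3cret"))
print(is_authorized("Bearer guess", "s3cret"))
```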
Python Client
Typed access to all REST endpoints with subprocess server management.
from neuroncite import NeuronCiteClient
client = NeuronCiteClient() # default: http://127.0.0.1:3030
results = client.search(session_id="...", query="capital asset pricing model")
for r in results:
    print(f" [{r.score:.2f}] {r.citation}")
30+ typed methods covering search, indexing, citation, annotation, and model
management. See clients/python/README.md for the
full API reference.
Installation
Pre-built Binaries
Download the binary for your platform from the
Releases page.
Each release includes SHA-256 checksums for verification.
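A downloaded artifact can be checked against its published SHA-256 value with a few lines of standard-library Python. This is a generic sketch: the function names are illustrative, and the expected hex string must be copied from the release's checksum listing, whose exact file format is not described here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB blocks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True when the file's SHA-256 matches the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (the hex value is a placeholder to fill in from the release page):
# verify_download("neuroncite-linux-x64", "<published sha256 hex>")
```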
| Platform | Architecture | GUI | Artifact |
|---|---|---|---|
| Windows | x86_64 | Native window (WebView2) | neuroncite-windows-x64.exe |
| Linux | x86_64 | Native window (WebKit2GTK) | neuroncite-linux-x64 |
| Linux | x86_64 | Browser-only (headless) | neuroncite-linux-x64-server |
| Linux | ARM64 | Native window (WebKit2GTK) | neuroncite-linux-arm64 |
| Linux | ARM64 | Browser-only (headless) | neuroncite-linux-arm64-server |
| macOS | ARM64 (Apple Silicon) | Native window (WebKit) | neuroncite-macos-arm64 |
| macOS | x86_64 (Intel) | Native window (WebKit) | neuroncite-macos-x64 |
Linux GUI variants require libwebkit2gtk-4.1-0 at runtime.
Server variants have zero runtime dependencies beyond glibc.
Docker
Four image variants are published to the GitHub Container Registry.
See docker/README.md for compose profiles, environment
variables, and local build instructions.
# NVIDIA GPU (default)
docker run --gpus all -p 3030:3030 \
-v neuroncite-data:/data/Documents/NeuronCite \
ghcr.io/ff-tec/neuroncite:latest
# AMD ROCm
docker run --device=/dev/kfd --device=/dev/dri -p 3030:3030 \
-v neuroncite-data:/data/Documents/NeuronCite \
ghcr.io/ff-tec/neuroncite:latest-rocm
# CPU only
docker run -p 3030:3030 \
-v neuroncite-data:/data/Documents/NeuronCite \
ghcr.io/ff-tec/neuroncite:latest-cpu
| Variant | Base Image | GPU | Tag |
|---|---|---|---|
| NVIDIA | CUDA 12.4 + cuDNN, Ubuntu 22.04 | CUDA | latest / <version>-nvidia |
| ROCm | ROCm 6.4, Ubuntu 22.04 | ROCm | latest-rocm / <version>-rocm |
| CPU | Ubuntu 22.04, x86_64 | None | latest-cpu / <version>-cpu |
| CPU ARM64 | Ubuntu 22.04, ARM64 | None | latest-cpu-arm64 / <version>-cpu-arm64 |
From Source
Prerequisites: Rust 1.88+ (stable), Node 20+, npm.
git clone https://github.com/FF-TEC/NeuronCite.git
cd NeuronCite
# Build the SolidJS frontend
cd crates/neuroncite-web/frontend && npm ci && npx vite build && cd ../../..
# Build the Rust binary (all features enabled by default)
cargo build --release -p neuroncite
The binary is at target/release/neuroncite (Linux/macOS) or
target/release/neuroncite.exe (Windows).
For a server-only build without native GUI:
cargo build --release -p neuroncite \
--no-default-features \
--features backend-ort,web,mcp,pdfium,ocr
Python Client
# From PyPI (once published):
pip install neuroncite
# From source:
pip install ./clients/python
Requires Python 3.10+. See clients/python/README.md
for the full API reference.
Usage
Web UI
neuroncite
# or explicitly:
neuroncite web --port 3030
Starts the API server and opens the SolidJS web frontend in a native window
(WebView2 on Windows, WebKit on macOS/Linux). Falls back to the default browser
if the native window cannot be created. The frontend is served at
http://localhost:3030.
Headless Server
neuroncite serve --port 3030 --bind 0.0.0.0
Runs the REST API without opening a browser or native window. Suitable for
remote servers, Docker, and automation.
CLI Indexing
neuroncite index \
--directory /path/to/pdfs \
--model "BAAI/bge-small-en-v1.5" \
--strategy word \
--chunk-size 300 \
--overlap 50
Indexes a directory of PDFs without running a persistent server. Embedding
models are downloaded from HuggingFace on first use.
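To show what the word strategy with --chunk-size 300 and --overlap 50 plausibly means, here is a minimal sketch of word-window chunking. It assumes that consecutive windows share overlap words, so each window starts chunk_size - overlap words after the previous one. NeuronCite's exact boundary handling may differ, and word_window_chunks is a hypothetical name.

```python
def word_window_chunks(text, chunk_size=300, overlap=50):
    """Split text into windows of chunk_size words; consecutive windows share overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(word_window_chunks(sample, chunk_size=4, overlap=1))
```

Under these assumptions, a 1,000-word document with the settings above yields four chunks starting at words 0, 250, 500, and 750.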
CLI Search
neuroncite search \
--directory /path \
--session-id 1 \
--query "heteroskedasticity-consistent covariance matrix" \
--hybrid \
--rerank
Executes a single search query against an existing session. Results are printed
as JSON (default) or plain text (--format text).
MCP Server (Claude Code & Claude Desktop App)
neuroncite mcp install # registers in Claude Code settings (default)
neuroncite mcp install --target claude-desktop # registers in Claude Desktop App settings
neuroncite mcp uninstall # removes registration from Claude Code settings
neuroncite mcp status # shows current registration status
neuroncite mcp serve # starts stdio JSON-RPC server
All Commands
| Command | Description |
|---|---|
| neuroncite / neuroncite web | Launch web UI in a native window (browser fallback) |
| neuroncite serve | Headless API server |
| neuroncite index | Index a directory of PDFs |
| neuroncite search | Execute search queries |
| neuroncite annotate | Annotate PDFs from CSV/JSON |
| neuroncite doctor | Check runtime dependencies (Tesseract, pdfium, GPU) |
| neuroncite sessions | List index sessions in a database |
| neuroncite export | Export results as Markdown, BibTeX, CSL-JSON, RIS, or plain text |
| neuroncite models list\|info\|download\|verify\|system | Manage embedding models and check system capabilities |
| neuroncite mcp install\|uninstall\|serve\|status | Register, remove, run, and check MCP server |
| neuroncite version | Print version, build features, and Git commit hash |
Architecture
PDFs / HTML pages
|
v
Extract text
(pdf-extract / pdfium / Tesseract OCR / readability)
|
v
Chunk text
(page / word-window / token-window / sentence)
|
v
Embed chunks
(ONNX Runtime with CUDA / DirectML / CoreML / CPU)
|
v
Store vectors + metadata
(HNSW index + SQLite FTS5)
|
v
Hybrid search
(vector kNN + BM25 keyword + RRF fusion + optional reranking)
|
v
Ranked results with citations, page numbers, and scores
Cargo Workspace (16 Crates)
| Layer | Crate | Responsibility |
|---|---|---|
| Binary | neuroncite | Entry point, CLI argument parsing (clap), execution mode dispatch |
| Presentation | neuroncite-web | SolidJS frontend (rust-embed), native GUI (tao/wry), SSE broadcast |
| Presentation | neuroncite-api | REST API server (Axum), 34 endpoints, OpenAPI, bearer auth, rate limiting, SSE |
| Presentation | neuroncite-mcp | MCP server (43 tools, JSON-RPC 2.0 over stdio) |
| Domain | neuroncite-pipeline | Background job executor, GPU worker with priority channels, two-phase indexing |
| Domain | neuroncite-search | Hybrid search, BM25, Reciprocal Rank Fusion, deduplication, reranking |
| Domain | neuroncite-citation | LaTeX/BibTeX parsing, batch claim extraction, LLM-driven verification |
| Domain | neuroncite-annotate | PDF annotation with 5-stage text matching pipeline |
| Core | neuroncite-store | SQLite storage (r2d2 pool), HNSW index, FTS5 full-text search, workflow tracking |
| Core | neuroncite-embed | Dense embeddings via ONNX Runtime, model download and management, cross-encoder reranking |
| Core | neuroncite-pdf | PDF discovery and text extraction (pdf-extract, pdfium, Tesseract OCR) |
| Core | neuroncite-html | HTML fetching, readability extraction, caching, BFS crawling with SSRF protection |
| Core | neuroncite-chunk | Text chunking strategies (page, word, token, sentence) |
| Core | neuroncite-llm | LLM abstraction layer, Ollama HTTP client |
| Foundation | neuroncite-core | Shared types, trait definitions, configuration, error types (zero internal dependencies) |
| Dev | neuroncite-testgen | Test data generation and property-based testing utilities |
Feature Flags
| Flag | Purpose | Default |
|---|---|---|
| backend-ort | ONNX Runtime embedding backend with CUDA support | Enabled |
| web | SolidJS frontend embedded via rust-embed | Enabled |
| gui | Native window via tao/wry (requires web) | Enabled |
| mcp | Model Context Protocol server for AI agent integration | Enabled |
| pdfium | Multi-column PDF extraction backend | Enabled |
| ocr | Tesseract OCR fallback for scanned pages (requires pdfium) | Enabled |
For the full architecture document, see docs/architecture.pdf.
Hardware Requirements
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 4 GB | 8 GB+ |
| CPU | 2 cores | 4+ cores |
| GPU | Not required | NVIDIA (CUDA 12.4+) or AMD (ROCm 6.4+) |
| Disk | 200 MB (binary) + model size | 2 GB+ for large collections |
Embedding model sizes range from 50 MB (bge-small, 33M parameters) to 1 GB
(large models, 335M parameters). GPU acceleration is optional -- the CPU
execution provider works on all platforms. The application runs entirely offline
after the initial model download.
Documentation
- neuroncite.com -- Project website with feature overview, FAQ, and roadmap
- REST API and CLI Reference -- Full endpoint documentation, Python client guide, MCP setup
- Pricing and Licensing -- AGPL-3.0 (free) and Enterprise license comparison
- Architecture Document -- Full system design (16 crates, data flow, design decisions)
- Docker Deployment -- Image variants, compose profiles, environment variables, local builds
- Python Client -- Full API reference with 30+ typed methods
- Tools and Scripts -- CI validators, code generators, developer utilities
- Commercial License -- Dual licensing details and use-case matrix
Contributing
Contributions are welcome. See CONTRIBUTING.md for
development setup, code style guidelines, testing instructions, and the pull
request process.
Please read the Code of Conduct before participating.
Security
To report a vulnerability, see SECURITY.md for the disclosure
process and response timeline.
License
NeuronCite is dual-licensed:
- AGPL-3.0-only -- free copyleft license for any use, including commercial.
Requires source disclosure when distributing or providing network access.
See LICENSE.
- Commercial license for proprietary products, SaaS deployments, and
redistribution without AGPL source-disclosure obligations.
See COMMERCIAL_LICENSE.md.
For commercial licensing inquiries: [email protected]
Copyright (C) 2026 Felix Fritz. All rights reserved.