RepoBrain

mcp
Security Audit
Warn
Health Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 7 GitHub stars
Code Pass
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is a local-first codebase memory engine for AI coding assistants. It indexes repositories, retrieves relevant code context, and identifies the safest files to edit before an AI generates new code.

Security Assessment
Overall Risk: Low. The tool operates with a local-first design and does not request any inherently dangerous system permissions. A static code scan of its 12 files found no dangerous patterns, no hardcoded secrets, and no suspicious command execution routines. Because it is designed to run locally and integrate with AI tools via a standard stdio adapter, it does not appear to make unauthorized external network requests. Your codebase data remains on your machine.

Quality Assessment
Quality is solid from a maintenance perspective, but community trust is currently very low. The project was updated very recently (last push was today) and includes a clear MIT license, which is excellent for open-source freedom. However, it has extremely low community visibility with only 7 GitHub stars. While the documentation is highly detailed, the lack of widespread adoption means the code has not been thoroughly vetted by a large user base yet.

Verdict
Safe to use locally, though you should proceed with the typical caution applied to any new project with minimal community vetting.
SUMMARY

RepoBrain is a local-first codebase memory engine for AI coding assistants. It indexes repositories, retrieves grounded evidence, traces logic flows, and ranks the safest files to inspect or edit before code generation.

README.md

RepoBrain

alt text
RepoBrain is a local-first codebase memory engine for AI coding assistants. It indexes a repository, extracts symbols and lightweight dependency edges, combines lexical and semantic retrieval, and returns grounded evidence about where logic lives, how flows connect, and which files are safest to inspect before code is changed.

Overview

Read in English | Đọc bằng Tiếng Việt

English

RepoBrain exists to make coding agents less reckless.

Most AI coding failures do not start at code generation. They start earlier, when the agent reads the wrong files, misses a route-to-service or job-to-handler flow, or sounds confident without enough evidence.

RepoBrain focuses on that pre-generation step:

  • index the repository into local metadata, chunks, symbols, and edges
  • retrieve grounded evidence with BM25, embeddings, and reranking
  • trace likely flow across route, service, job, and config surfaces
  • rank edit targets with explicit rationale instead of hidden intuition
  • scan a repo and produce a concise project review with the most important risks first
  • lower confidence and emit warnings when evidence is weak or contradictory

The product is intentionally local-first and conservative. It ships as a CLI, a browser-based local UI, a local report/dashboard, and a stdio MCP-style adapter for tools such as Cursor, Codex, and Claude Code.

Tiếng Việt

RepoBrain được tạo ra để giúp AI coding assistant bớt "đoán mò" hơn.

Phần lớn lỗi của AI không bắt đầu ở bước sinh code, mà bắt đầu sớm hơn: đọc sai file, bỏ sót luồng route -> service -> job, hoặc trả lời rất tự tin dù bằng chứng còn mỏng.

RepoBrain tập trung vào đúng bước trước khi generate:

  • index repo thành metadata cục bộ, chunks, symbols và dependency edges
  • truy xuất bằng chứng có grounding bằng BM25, embedding và reranking
  • trace luồng khả dĩ giữa route, service, background job và config
  • xếp hạng file nên inspect hoặc edit với lý do rõ ràng
  • tự hạ confidence và cảnh báo khi bằng chứng yếu hoặc mâu thuẫn

Sản phẩm được thiết kế theo hướng local-first và thận trọng. RepoBrain hiện có CLI, giao diện web local, report/dashboard local, và adapter stdio MCP để gắn vào Cursor, Codex, Claude Code hoặc workflow agent tương tự.

Why RepoBrain

When coding agents fail, the root cause is usually not "bad code generation". It is bad context.

RepoBrain focuses on the step before generation:

  • Find the right files.
  • Surface the most relevant symbols and snippets.
  • Trace likely route -> service -> job flows.
  • Rank edit targets with evidence instead of intuition.
  • Downgrade confidence when the evidence is weak or contradictory.

This makes RepoBrain useful both as:

  • a CLI you can run locally against any repo
  • a stdio MCP-style tool adapter for Cursor, Codex, and Claude Code

What Ships In 0.1.x

  • Python package under src/repobrain
  • Local SQLite metadata store in .repobrain/metadata.db
  • Persisted vector index in .repobrain/vectors/
  • Python + TypeScript/JavaScript support
  • Hybrid retrieval with BM25 + local hash embeddings + reranking
  • Grounding harness with planner, retriever, file selector, evidence collector, edit-target scorer, and self-check
  • Markdown docs for architecture, CLI, MCP, config, evaluation, demos, releases, and run guides

Current Unreleased Track

This unreleased track now maps most closely to the 0.5.x integration line: it bundles provider adapters, provider smoke checks, a React browser UI, and release-facing diagnostics into one local workflow.

  • Interactive local chat loop through repobrain chat
  • Saved review baselines through repobrain baseline
  • Windows one-click launcher through chat.cmd
  • Windows one-click dashboard launcher through report.cmd
  • Human-readable terminal output through --format text
  • Local HTML status dashboard through repobrain report or repobrain report --open
  • Production ship gate through repobrain ship
  • Live provider access checks through repobrain provider-smoke
  • Active repo memory: run repobrain init --repo <path> once, then omit --repo
  • Local browser UI through repobrain serve-web --open
  • React TSX local browser UI with English/Vietnamese interface toggle, light/dark theme, and structured diagnostics cards
  • Concise repo scan through repobrain review --format text
  • Optional SDK-backed Gemini/OpenAI/Voyage/Cohere provider adapters
  • Gemini rerank model pools through GEMINI_MODELS with automatic failover on quota/rate-limit exhaustion
  • Optional tree-sitter parser adapter layer with heuristic fallback
  • Parser usage stats in repobrain index
  • Richer parser capability reporting in repobrain doctor

How To Run

Full bilingual installation instructions are available in docs/install.md.

Fast path for end users:

python -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[dev,tree-sitter,mcp]"
repobrain init
repobrain review --format text
repobrain baseline --format text
repobrain index
repobrain query "Where is payment retry logic implemented?"
repobrain trace "Trace login with Google from route to service"
repobrain targets "Which files should I edit to add GitHub login?"
repobrain ship --format text
repobrain chat
repobrain report --format text
repobrain serve-web --open

From outside the target repo, initialize it once and then keep commands short:

repobrain init --repo "C:\path\to\your-project" --format text
repobrain review --format text
repobrain baseline --format text
repobrain index --format text
repobrain query "Where is payment retry logic implemented?" --format text
repobrain ship --format text
repobrain report --open

Or open the browser UI and import there:

repobrain serve-web --open

Then paste the project path and click Import + Index.
For the one-page audit flow, click Scan Project Review.
The browser UI now ships as a React TSX frontend with English/Vietnamese interface labels, a light/dark theme toggle, and structured doctor / provider-smoke diagnostics cards.

Windows PowerShell:

python -m venv venv
.\venv\Scripts\Activate.ps1
python -m pip install -e ".[dev,tree-sitter,mcp]"
repobrain init
repobrain review --format text
repobrain baseline --format text
repobrain index
repobrain doctor
repobrain query "Where is payment retry logic implemented?" --format text
repobrain ship --format text
repobrain report --format text

Run the MCP-style transport:

repobrain serve-mcp

On Windows, double-click chat.cmd for local chat or report.cmd for the visual dashboard. Both launchers prefer the project virtualenv and set PYTHONPATH=src.

See the full run guide in docs/run.md.

Frontend source for the browser UI lives in webapp/. The built local assets are generated into webapp/dist/, and repobrain serve-web serves that React build directly. If webapp/dist/ is missing, run npm run build inside webapp/ once before starting the Python web server.

There is also a separate human-friendly documentation frontend in docs-for-repobrain/ for onboarding, repo reading, and demo prep:

cd docs-for-repobrain
npm install
npm run dev

That app renders a curated command guide, release-state summary, and selected repo markdown files inside a modern light/dark docs UI.

CLI Surface

repobrain init
repobrain index
repobrain review
repobrain baseline
repobrain query "<question>"
repobrain trace "<question>"
repobrain impact "<question>"
repobrain targets "<question>"
repobrain benchmark
repobrain ship
repobrain doctor
repobrain provider-smoke
repobrain chat
repobrain report
repobrain report --open
repobrain demo-clean --format text
repobrain serve-web
repobrain quickstart
repobrain release-check --format text
repobrain serve-mcp

For human-friendly terminal output, add --format text to review, index, query, trace, impact, targets, benchmark, doctor, provider-smoke, or report. JSON remains the default for agents and automation.

For release validation, run repobrain release-check --format text before packaging, then repobrain release-check --require-dist --format text after python -m build to confirm wheel/sdist artifacts include the React frontend assets.

Before a live demo, run repobrain demo-clean --format text to remove local test/build clutter such as pytest_work_*, root dist/, and cache directories while preserving webapp/dist for repobrain serve-web.

Example Query Output

{
  "query": "Where is payment retry logic implemented?",
  "intent": "locate",
  "top_files": [
    {
      "file_path": "app/services/retry_handler.py",
      "language": "python",
      "role": "service",
      "score": 3.18,
      "reasons": ["bm25", "reranked", "symbol:enqueue_payment_retry"]
    },
    {
      "file_path": "app/jobs/payment_retry_job.py",
      "language": "python",
      "role": "job",
      "score": 2.74,
      "reasons": ["bm25", "reranked"]
    }
  ],
  "confidence": 0.77
}

Configuration

RepoBrain reads repobrain.toml from the repository root.

[project]
name = "RepoBrain"
repo_roots = ["."]
state_dir = ".repobrain"
context_budget = 12000

[indexing]
exclude = [
  ".git",
  ".venv",
  "venv",
  "__pycache__",
  ".pytest_cache",
  ".pytest_tmp",
  "pytest_tmp",
  "pytest_tmp_run",
  "pytest-cache-files-*",
  "node_modules",
  "dist",
  "build",
  ".repobrain",
]
chunk_max_lines = 80
chunk_overlap_lines = 12

[parsing]
prefer_tree_sitter = true
tree_sitter_languages = ["python", "typescript", "javascript"]

[providers]
embedding = "local"
reranker = "local"

Remote providers are opt-in. Install .[providers], set the relevant API key in .env, and explicitly change repobrain.toml before RepoBrain sends code to Gemini, OpenAI, Voyage, or Cohere.

Cheap Gemini setup:

[providers]
embedding = "gemini"
reranker = "gemini"
gemini_embedding_model = "gemini-embedding-001"
gemini_output_dimensionality = 768
gemini_task_type = "SEMANTIC_SIMILARITY"
gemini_rerank_model = "gemini-2.5-flash"
gemini_models = ["gemini-2.5-flash", "gemini-2.5-flash-lite", "gemini-3-flash-preview"]

gemini_rerank_model is not locked to a single Gemini release. RepoBrain passes the configured string straight to the Gemini SDK, so you can switch between current supported models such as gemini-2.5-flash, gemini-3-flash-preview, or gemini-2.5-flash-preview-09-2025.

If you want automatic fallback when one Gemini rerank model hits quota or rate limits, set GEMINI_MODELS in .env as a comma-separated ordered pool. RepoBrain will keep the first healthy model active and move to the next one only for quota/rate-limit exhaustion errors.

Start from .env.example and fill GEMINI_API_KEY.

Design Principles

  • Local-first by default
  • Pluggable providers for local or cloud inference
  • Evidence before edit suggestion
  • Degrade gracefully when tree-sitter or remote SDKs are unavailable
  • Keep the repo runnable without heavyweight infrastructure

Comparison To Naive AI Code Search

Capability Naive agent scan RepoBrain
File discovery heuristic guessing indexed hybrid retrieval
Flow tracing shallow grep symbol + import + call-edge hints
Edit target ranking implicit intuition explicit scored suggestions
Confidence rarely stated explicit score + warnings
Transport chat-only CLI + stdio MCP-style adapter

Docs

Release Track

  • 0.1.x: runnable MVP, local indexing, hybrid retrieval, edit targets
  • 0.2.x: parser quality upgrade, better graph extraction, stronger retrieval fusion
  • 0.3.x: confidence calibration, stronger impact analysis, safer change context
  • 0.5.x: provider adapters and richer MCP ergonomics
  • 1.0.0: trusted local codebase memory product with stable contracts

See the detailed breakdown in ROADMAP.md and docs/releases.md.

Status

This repository is a runnable MVP focused on clean architecture, testability, and a strong OSS launch narrative.

Reviews (0)

No results found