rag-forge
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Pass
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
This is an MCP server and CLI toolkit designed to scaffold production-grade Retrieval-Augmented Generation (RAG) pipelines. It provides a framework to build, evaluate, and score the maturity of RAG systems using continuous integration gates and a built-in maturity model.
Security Assessment
Based on a light code scan of 12 files, no dangerous execution patterns, hardcoded secrets, or risky permission requests were found. However, given the nature of the tool, it inherently interacts with documents and language models. Developers should anticipate that it makes external network requests to fetch dependencies and communicate with LLM APIs (such as OpenAI or Anthropic), which means your documents and queries will leave your local environment. Overall risk is rated as Low for the code itself, provided you manage your external API keys securely.
Quality Assessment
The project is actively maintained, with its most recent push on the day of this audit. It is released under the permissive MIT license and has a comprehensive README with clear documentation, linked external websites, and CI/CD workflows. The main drawback is very low community visibility: with only 5 GitHub stars, it has not yet been broadly tested or vetted by the open-source community. Expect to rely primarily on the original author for support and updates rather than on community help.
Verdict
Safe to use, but exercise standard caution regarding external API data transfers and expect limited community support.
RAG-Forge
Production-grade RAG pipelines with evaluation baked in — not bolted on after deployment.
Why RAG-Forge?
Most RAG projects ship without evaluation, and most evaluation libraries don't help you build the pipeline. Few tools score maturity end-to-end — so teams often don't know if they're at "a demo that sometimes works" or "a system you can put in front of customers."
- Building a RAG pipeline is easy. Knowing whether it works is hard. RAG-Forge closes that loop.
- Eval is a first-class citizen, not an afterthought. Every template ships with a golden set and an audit gate.
- The RAG Maturity Model (RMM-0 → RMM-5) gives you a concrete scorecard for any RAG system — yours or someone else's.
RAG-Forge is one of the few toolkits that scaffolds production-ready RAG pipelines, runs continuous evaluation as a CI/CD gate, and scores any existing system against a published maturity model — all in one CLI.
RAG Maturity Model
The RMM is the scoring framework at the heart of RAG-Forge. Run rag-forge assess on any audit report to see where your system sits.
| Level | Name | Exit Criteria |
|---|---|---|
| RMM-0 | Naive | Basic vector search works |
| RMM-1 | Better Recall | Hybrid search, Recall@5 > 70% |
| RMM-2 | Better Precision | Reranker active, nDCG@10 +10% |
| RMM-3 | Better Trust | Guardrails, faithfulness > 85% |
| RMM-4 | Better Workflow | Caching, P95 < 4s, cost tracking |
| RMM-5 | Enterprise | Drift detection, CI/CD gates, adversarial tests |
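To make the retrieval criteria in the table concrete, here is a minimal sketch of how Recall@k and nDCG@k are commonly computed with binary relevance. It is not taken from RAG-Forge's evaluator; the function names and the assumption that retrieved document IDs are unique are illustrative only.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant)
    # Each relevant hit at 1-based position `rank` contributes 1 / log2(rank + 1).
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    # The ideal ranking places every relevant document first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical query: two relevant documents, five retrieved.
retrieved_ids = ["d1", "d2", "d3", "d4", "d5"]
relevant_ids = {"d1", "d4"}

recall = recall_at_k(retrieved_ids, relevant_ids, k=5)
meets_rmm1_recall = recall > 0.70  # RMM-1 lists Recall@5 > 70%
```

In practice these scores would be averaged over every question in a golden set before being compared with a threshold.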
Quick Start
npm install -g @rag-forge/cli
# Scaffold a project (use --directory to name the folder)
rag-forge init basic --directory my-rag-project
cd my-rag-project
# Drop your documents into a folder of your choice (or use the example below)
mkdir docs
echo "RAG-Forge is a CLI for building and evaluating RAG pipelines." > docs/example.md
rag-forge index --source ./docs
rag-forge audit --golden-set eval/golden_set.json
rag-forge assess --audit-report reports/audit-report.json
From empty directory to a scored RAG system with a golden set and an audit report — in under a minute.
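In CI, the audit step is meant to act as a gate: the build fails when a metric misses its threshold. The sketch below shows that idea in plain Python. The report layout (a top-level "metrics" object) and the metric key names are assumptions made for illustration, not the documented format of audit-report.json; only the threshold values come from the RMM table.

```python
import json
from pathlib import Path

# Threshold values from the RMM table; the key names are hypothetical.
THRESHOLDS = {
    "recall_at_5": (">", 0.70),
    "faithfulness": (">", 0.85),
    "p95_latency_s": ("<", 4.0),
}


def gate_failures(metrics, thresholds=THRESHOLDS):
    """Return one human-readable message per metric that misses its threshold."""
    failures = []
    for name, (op, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing from report")
        elif op == ">" and not value > limit:
            failures.append(f"{name}: {value} is not > {limit}")
        elif op == "<" and not value < limit:
            failures.append(f"{name}: {value} is not < {limit}")
    return failures


def run_gate(report_path):
    """Read a report file (assumed layout) and return a process exit code."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    failures = gate_failures(report.get("metrics", {}))
    for message in failures:
        print(f"GATE FAIL {message}")
    return 1 if failures else 0
```

A CI job could call `sys.exit(run_gate("reports/audit-report.json"))` so that any failure blocks the merge.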
Installation
CLI (Node.js 20+):
npm install -g @rag-forge/cli
Python packages (Python 3.11+):
pip install rag-forge-core rag-forge-evaluator rag-forge-observability
Templates
| Template | Use Case |
|---|---|
| basic | First RAG project, simple Q&A |
| hybrid | Production-ready document Q&A with reranking |
| agentic | Multi-hop reasoning with query decomposition |
| enterprise | Regulated industries with full security suite |
| n8n | AI automation agency deployments |
Templates generate editable source code in your project — not framework dependencies. Fork the code, not the abstraction.
Commands
| Category | Commands |
|---|---|
| Scaffolding | init, add |
| Ingestion | parse, chunk, index |
| Query | query, inspect |
| Evaluation | audit, assess, golden add, golden validate |
| Operations | report, cache stats, drift report, cost |
| Security | guardrails test, guardrails scan-pii |
| Integration | serve --mcp, n8n export |
Run rag-forge --help for the full command reference.
How RAG-Forge compares
There are great tools in this space. Here's an honest look at where each fits.
| Capability | RAG-Forge | RAGAS | LangChain Eval | Giskard |
|---|---|---|---|---|
| Scaffolds a RAG pipeline | ✓ | — | — | — |
| Evaluation metrics | ✓ | ✓ | ✓ | ✓ |
| Maturity scoring (RMM-0 → 5) | ✓ | — | — | — |
| CI gate workflow (audit action) | ✓ | — | partial | partial |
| MCP server | ✓ | — | — | — |
| Guardrails / PII scanning | ✓ | — | partial | ✓ |
| Drift detection | ✓ | — | — | partial |
| Multi-language (TS + Python) | ✓ | — | ✓ | — |
| Framework-agnostic | ✓ | ✓ | — | ✓ |
Peer strengths worth knowing:
- RAGAS has deeper metric research and a large community. RAG-Forge's evaluator supports RAGAS as a backend: run rag-forge audit --evaluator ragas to use it directly.
- LangChain Eval has the broadest ecosystem of integrations if you're already invested in LangChain.
- Giskard has a strong general-purpose ML testing story beyond RAG.
Pick the tool that matches your stage. RAG-Forge's wedge is the full lifecycle — scaffold → evaluate → score → ship — in one CLI, with the RMM as the objective function.
Architecture
RAG-Forge is a polyglot monorepo. The CLI and MCP server are TypeScript; all RAG logic is Python. The CLI delegates to Python via a subprocess bridge so the two halves can be developed and versioned independently.
rag-forge/
├── packages/
│ ├── cli/ TypeScript — Commander.js CLI (rag-forge command)
│ ├── mcp/ TypeScript — MCP server (@modelcontextprotocol/sdk)
│ ├── core/ Python — RAG pipeline primitives
│ ├── evaluator/ Python — RAGAS + DeepEval + LLM-as-Judge
│ └── observability/ Python — OpenTelemetry + Langfuse
├── templates/ Project templates (basic, hybrid, agentic, enterprise, n8n)
└── apps/site/ Docs and marketing site (Next.js, deployed to Vercel)
See docs/architecture.md for a deeper dive.
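The subprocess bridge described above can be pictured as one process sending a JSON request to a Python worker over stdin and reading a JSON response from stdout. The sketch below illustrates that pattern only. In RAG-Forge the calling side is TypeScript; here both sides are written in Python for brevity, and the message shape, the "chunk" command, and the worker code are all hypothetical.

```python
import json
import subprocess
import sys

# Hypothetical worker: stands in for a Python package entry point.
WORKER_SOURCE = r'''
import json, sys
request = json.load(sys.stdin)
if request["command"] == "chunk":
    text = request["payload"]["text"]
    size = request["payload"]["size"]
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    json.dump({"ok": True, "result": {"chunks": chunks}}, sys.stdout)
else:
    json.dump({"ok": False, "error": "unknown command: " + request["command"]}, sys.stdout)
'''


def call_worker(command, payload, timeout=30.0):
    """Send one JSON request to a fresh worker process and return its result."""
    request = json.dumps({"command": command, "payload": payload})
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=request,        # request goes in on stdin
        capture_output=True,  # response comes back on stdout
        text=True,
        timeout=timeout,
        check=True,           # a non-zero exit raises CalledProcessError
    )
    response = json.loads(completed.stdout)
    if not response["ok"]:
        raise RuntimeError(response["error"])
    return response["result"]


result = call_worker("chunk", {"text": "abcdefg", "size": 3})
```

Because the only contract between the two halves is the JSON message, either side can change internally without the other being rebuilt, which is the independence the architecture section points to.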
Docs & Community
- 📚 Docs: https://rag-forge-docs.vercel.app/
- 🌐 Website: https://rag-forge-site.vercel.app/
- 💬 Discussions: https://github.com/hallengray/rag-forge/discussions
- 🔒 Security: see SECURITY.md
- 📝 Changelog: docs/release-notes
Contributing
See CONTRIBUTING.md for development setup and contribution guidelines. All contributors are expected to follow our Code of Conduct.
License
MIT — see LICENSE