ocr-mcp
Health: Passed
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 14 GitHub stars
Code: Failed
- rm -rf — Recursive force deletion command in frontend/package.json
- network request — Outbound network request in frontend/package.json
Permissions: Passed
- Permissions — No dangerous permissions requested
FastMCP server providing advanced OCR capabilities with current state-of-the-art models (DeepSeek-OCR, Florence-2, DOTS.OCR, PP-OCRv5, Qwen-Image-Layered decomposition), WIA scanner control, and multi-format document processing for PDFs, CBZ comics, and images.
OCR-MCP
Complete AI OCR webapp and MCP server. A web app for people (drag-and-drop OCR, scanner, batch) and a FastMCP 3.1 MCP server for agentic IDEs (Claude, Cursor, Windsurf), so agents can run OCR, preprocessing, and workflows as tools. Same 10+ engines, WIA scanner (Windows), and pipelines; one repo.
Topics: ocr, mcp, fastmcp, document-processing, scanner, wia, pdf, computer-vision, model-context-protocol, llm
What it does
- Web app: React (`web_sota/`) + FastAPI (`backend/app.py`). Upload or scan, pick an engine, get text/PDF/JSON. Ports 10858 (Vite) and 10859 (API). In-app Help (`/help`) documents the web UI, the MCP server, and OCR backends.
- MCP server: FastMCP 3.1 over stdio, with tools for OCR, preprocessing, scanner, and workflows. Sampling defaults to local Ollama (`http://127.0.0.1:11434/v1`, model `llama3.2`), so no cloud API key is needed. Set `OCR_SAMPLING_USE_CLIENT_LLM=1` to use the host IDE's LLM instead. Mistral OCR uses `MISTRAL_API_KEY` when you call that backend. See AI_FEATURES.md.
Features: 10+ backends (PaddleOCR-VL-1.5, DeepSeek-OCR-2, Mistral OCR, …); auto backend selection; preprocessing (deskew, enhance, crop); layout and table extraction; quality assessment; WIA scanner; batch and pipelines; multi-format export.
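The sampling defaults described above can be pictured as a small piece of environment-driven configuration. The following is only a sketch of that decision, not the project's code: the environment variable names, URL, and model name come from the text, while `SamplingTarget`, `resolve_sampling_target`, and `mistral_key_available` are hypothetical helpers invented for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults quoted from the README text; everything else is illustrative.
DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434/v1"
DEFAULT_OLLAMA_MODEL = "llama3.2"


@dataclass(frozen=True)
class SamplingTarget:
    """Where sampling requests would go (hypothetical structure)."""
    use_client_llm: bool
    base_url: Optional[str]
    model: Optional[str]


def resolve_sampling_target(env: Mapping[str, str] = os.environ) -> SamplingTarget:
    """Choose between the host IDE's LLM and local Ollama (assumed logic)."""
    if env.get("OCR_SAMPLING_USE_CLIENT_LLM", "").strip() == "1":
        # The client (IDE) supplies the model, so no URL or model is set here.
        return SamplingTarget(True, None, None)
    return SamplingTarget(False, DEFAULT_OLLAMA_URL, DEFAULT_OLLAMA_MODEL)


def mistral_key_available(env: Mapping[str, str] = os.environ) -> bool:
    """Report whether MISTRAL_API_KEY is set; only the Mistral backend needs it."""
    return bool(env.get("MISTRAL_API_KEY"))


if __name__ == "__main__":
    print(resolve_sampling_target())
    print("Mistral key present:", mistral_key_available())
```

Passing the environment in as a mapping keeps the helper easy to exercise without touching real environment variables.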
Docs
| Doc | Description |
|---|---|
| Install | Install, run MCP, Web UI (start.ps1, ports 10858/10859), PyYAML notes, client config |
| Backend deps | Web FastAPI backend: same venv as ocr-mcp, pyproject.toml, PyTorch, OCR_AUTO_INSTALL_DEPS |
| Technical | Architecture, tools, config, development, packaging |
| OCR models | Engines, capabilities, hardware (see also AI_MODELS.md) |
| Backend requirements | Per-model pip packages, system deps, env/config |
| MCP toolset matrix | Portmanteau tools, operation status, corpus v0 |
| AI features | Sampling, SEP-1577, agentic workflows, prompts |
| In-app Help | Source for /help: webapp vs MCP vs backends (mirrors INSTALL / TECHNICAL) |
| SOTA Compliance | Verified SOTA v12.0 Architecture |
Also: JUSTFILE.md (just recipes), OCR-MCP_MASTER_PLAN.md (roadmap), tests/README.md (testing).
Install
Repository: github.com/sandraschi/ocr-mcp. Clone first; `uv sync` needs a project on disk:
git clone https://github.com/sandraschi/ocr-mcp.git
Set-Location ocr-mcp
uv sync
Quick start
uv sync
just run
Web UI (recommended): from repo root run web_sota\start.ps1 (PowerShell). It clears ports 10858/10859, runs uv sync, restores PyYAML if needed (see docs/INSTALL.md), starts the FastAPI backend in a new window, starts Vite in another window, then opens http://localhost:10858 in your browser.
Or: just webapp if your justfile wraps the same flow.
If the start script fails, use two terminals from the ocr-mcp repo root:
- Terminal 1 (backend): `$env:PYTHONPATH = (Get-Location).Path; uv run uvicorn backend.app:app --host 127.0.0.1 --port 10859`
- Terminal 2 (frontend): `cd web_sota; npm run dev -- --port 10858 --host`
Then open http://localhost:10858
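If the page does not load after the manual start, it can help to check whether both processes are actually listening. The sketch below is not part of the repository. It only assumes the two ports named above (10858 for Vite, 10859 for the API), and `port_is_open` and `check_services` are hypothetical helper names.

```python
import socket

WEB_PORT = 10858  # Vite dev server, per the text above
API_PORT = 10859  # FastAPI backend, per the text above


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1") -> dict:
    """Probe the frontend and backend ports and report which ones answer."""
    return {
        "frontend": port_is_open(host, WEB_PORT),
        "backend": port_is_open(host, API_PORT),
    }


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'listening' if is_up else 'not reachable'}")
```

A successful TCP connection only shows that something is bound to the port; it does not confirm that the process is the OCR-MCP backend or frontend.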
Tests: run `uv sync --extra dev`, then `uv run python -m pytest` or `python scripts/run_tests.py --suite quick`. See tests/README.md.
🛡️ Industrial Quality Stack
This project adheres to SOTA 14.1 industrial standards for high-fidelity agentic orchestration:
- Python (Core): Ruff for linting and formatting. Zero tolerance for `print` statements in core handlers (T201).
- Webapp (UI): Biome for sub-millisecond linting. Strict `noConsoleLog` enforcement.
- Protocol Compliance: Hardened `stdout`/`stderr` isolation to ensure crash-resistant JSON-RPC communication.
- Automation: Justfile recipes for all fleet operations (`just lint`, `just fix`, `just dev`).
- Security: Automated audits via `bandit` and `safety`.
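The `stdout`/`stderr` isolation point matters because a stdio MCP server exchanges JSON-RPC messages on `stdout`, so any stray print there can corrupt the stream. The following is a generic sketch of one common way to achieve that in Python using only the standard library. It is not taken from the repository, and the logger name and helper functions are hypothetical.

```python
import contextlib
import logging
import sys


def configure_stderr_logging(level: int = logging.INFO) -> logging.Logger:
    """Send log records to stderr so stdout carries only protocol messages."""
    logger = logging.getLogger("ocr_mcp_sketch")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


def run_quietly(func, *args, **kwargs):
    """Call func while diverting anything it prints to stdout over to stderr."""
    with contextlib.redirect_stdout(sys.stderr):
        return func(*args, **kwargs)


if __name__ == "__main__":
    log = configure_stderr_logging()

    def noisy_step(value: int) -> int:
        print("progress output from a third-party library")  # would pollute stdout
        return value * 2

    result = run_quietly(noisy_step, 21)
    log.info("noisy_step returned %s", result)
```

Note that `contextlib.redirect_stdout` only affects writes made through Python's `sys.stdout`. Output written directly to the file descriptor by native code would need a different approach.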
License
MIT; see LICENSE.