MetaSearchMCP
Health: Warn
- License — MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 8 GitHub stars
Code: Pass
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Pass
- Permissions — No dangerous permissions requested
This tool is an open-source metasearch backend and MCP server that aggregates search results from multiple engines (like DuckDuckGo, Brave, and hosted Google APIs). It normalizes the results into a predictable JSON format designed specifically for AI agents and LLM workflows.
Security Assessment
Overall Risk: Low. The automated code scan of 12 files found no dangerous patterns, hardcoded secrets, or requests for dangerous system permissions. The tool's primary function requires making external network requests to various search engine APIs and performing HTML scraping. Because it acts as a gateway, it requires you to provide your own API keys (via standard environment variables like `SERPBASE_API_KEY`) for certain providers. It does not appear to access unrelated sensitive local data or execute arbitrary shell commands.
Quality Assessment
The project has a solid foundation. It is actively maintained, with its most recent push occurring today. It clearly documents its purpose and architecture, and uses the permissive MIT license. However, community visibility and trust are currently very low. With only 8 GitHub stars, it remains a very early-stage or niche project, meaning it has not been broadly tested or vetted by the open-source community.
Verdict
Safe to use, though administrators should follow standard practice and keep their third-party API keys secure.
MetaSearchMCP
Open-source metasearch backend for MCP, AI agents, and LLM workflows.
MetaSearchMCP aggregates results from multiple search providers, normalizes them into a stable JSON schema, and exposes both an HTTP API and an MCP server for agent tooling.
Positioning
- MCP-first metasearch backend
- Structured search API for AI pipelines
- Multi-provider search orchestration with deduplication and fallback
- Python FastAPI alternative to browser-first metasearch projects
Why It Exists
Most search aggregators are designed around browser UX: HTML pages, pagination, and interactive result cards. Agents and LLM workflows need a different contract: predictable JSON, stable field names, partial-failure tolerance, and provider-level execution metadata.
MetaSearchMCP is built for that machine-consumable workflow. It is not a SearXNG clone. The design is centered on search orchestration, normalized contracts, and MCP integration.
Core Features
- Concurrent multi-provider aggregation
- Unified result schema for web, academic, developer, and knowledge sources
- Provider-level timeout isolation and partial-failure handling
- Result deduplication across engines
- Provider selection by explicit names or semantic tags such as `web`, `academic`, `code`, and `google`
- Final result caps for agent-friendly payload sizing
- HTTP API with OpenAPI docs
- MCP server over stdio for Claude Desktop, Cline, Continue, and similar clients
- Configurable provider allowlist via environment variables
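To make the deduplication feature above concrete, here is a minimal sketch of URL-keyed dedup. The function names and the exact normalization rules are illustrative, not the project's actual internals:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    """Canonicalize a URL so near-identical links compare equal."""
    parts = urlsplit(url.strip())
    # Lowercase scheme and host, drop fragments, trim trailing slashes.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

def dedupe(results: list[dict]) -> list[dict]:
    """Keep the first (highest-ranked) result per normalized URL."""
    seen, unique = set(), []
    for item in results:
        key = normalize_url(item["url"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Because providers return in rank order, keeping the first occurrence preserves the best-ranked copy of each duplicate.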
Google Support
Google is intentionally not scraped directly in this project.
In practice, Google's anti-bot and risk-control systems make self-hosted scraping brittle and expensive to maintain. For a backend intended for reliable MCP and AI workloads, hosted Google providers are the more practical option.
Currently supported Google providers:
| Provider | Env var | Notes |
|---|---|---|
| serpbase.dev | `SERPBASE_API_KEY` | Pay-per-use; typically cheaper for low-volume usage |
| serper.dev | `SERPER_API_KEY` | Includes a free tier, then pay-per-use |
Both are low-cost options. For smaller or occasional workloads, serpbase.dev is usually the lower-cost choice.
Supported Providers
Google

| Provider | Name | Method |
|---|---|---|
| SerpBase | `google_serpbase` | Hosted Google SERP API |
| Serper | `google_serper` | Hosted Google SERP API |
Web Search
| Provider | Name | Method |
|---|---|---|
| DuckDuckGo | `duckduckgo` | HTML scraping |
| Bing | `bing` | RSS feed |
| Yahoo | `yahoo` | HTML scraping, best effort |
| Brave | `brave` | Official Search API |
| Mwmbl | `mwmbl` | Public JSON API |
| Ecosia | `ecosia` | HTML scraping |
| Mojeek | `mojeek` | HTML scraping |
| Startpage | `startpage` | HTML scraping, best effort |
| Qwant | `qwant` | Internal JSON API, best effort |
| Yandex | `yandex` | HTML scraping, best effort |
| Baidu | `baidu` | JSON endpoint, best effort |
Knowledge And Reference
| Provider | Name | Method |
|---|---|---|
| Wikipedia | `wikipedia` | MediaWiki API |
| Wikidata | `wikidata` | Wikidata API |
| Internet Archive | `internet_archive` | Advanced Search API |
| Open Library | `openlibrary` | Open Library search API |
Developer Sources
| Provider | Name | Method |
|---|---|---|
| GitHub | `github` | GitHub REST API |
| GitLab | `gitlab` | GitLab REST API |
| Stack Overflow | `stackoverflow` | Stack Exchange API |
| Hacker News | `hackernews` | Algolia HN API |
| Reddit | `reddit` | Reddit API |
| npm | `npm` | npm registry API |
| PyPI | `pypi` | HTML scraping |
| RubyGems | `rubygems` | RubyGems search API |
| crates.io | `crates` | crates.io API |
| lib.rs | `lib_rs` | HTML scraping |
| Docker Hub | `dockerhub` | Docker Hub search API |
| pkg.go.dev | `pkg_go_dev` | HTML scraping |
| MetaCPAN | `metacpan` | MetaCPAN REST API |
Academic Sources
| Provider | Name | Method |
|---|---|---|
| arXiv | `arxiv` | Atom API |
| PubMed | `pubmed` | NCBI E-utilities |
| Semantic Scholar | `semanticscholar` | Graph API |
| CrossRef | `crossref` | REST API |
Finance Sources
| Provider | Name | Key Required | Free Tier |
|---|---|---|---|
| Yahoo Finance | `yahoo_finance` | No | Unofficial endpoint, no key needed |
| Alpha Vantage | `alpha_vantage` | `ALPHA_VANTAGE_API_KEY` | 25 req/day (get key) |
| Finnhub | `finnhub` | `FINNHUB_API_KEY` | 60 req/min (get key) |
Installation
```bash
git clone https://github.com/gefsikatsinelou/MetaSearchMCP
cd MetaSearchMCP
pip install -e ".[dev]"
```
Or with uv:
```bash
uv pip install -e ".[dev]"
```
Configuration
Copy `.env.example` to `.env` and configure any providers you want to enable.

```bash
cp .env.example .env
```
Key settings:
```env
HOST=0.0.0.0
PORT=8000
DEFAULT_TIMEOUT=10
AGGREGATOR_TIMEOUT=15
SERPBASE_API_KEY=
SERPER_API_KEY=
BRAVE_API_KEY=
GITHUB_TOKEN=
STACKEXCHANGE_API_KEY=
REDDIT_CLIENT_ID=
REDDIT_CLIENT_SECRET=
NCBI_API_KEY=
SEMANTIC_SCHOLAR_API_KEY=
ALPHA_VANTAGE_API_KEY=
FINNHUB_API_KEY=
ENABLED_PROVIDERS=
ALLOW_UNSTABLE_PROVIDERS=false
MAX_RESULTS_PER_PROVIDER=10
```
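As an illustration of how a comma-separated `ENABLED_PROVIDERS` allowlist can be applied, here is a minimal sketch. The function and the empty-means-all behavior are assumptions for illustration; check the project source for the authoritative semantics:

```python
import os

def provider_allowlist(available: list[str]) -> list[str]:
    """Apply ENABLED_PROVIDERS as a comma-separated allowlist.

    An empty or unset value means all providers are enabled
    (illustrative behavior, not confirmed project semantics).
    """
    raw = os.environ.get("ENABLED_PROVIDERS", "").strip()
    if not raw:
        return available
    enabled = {name.strip() for name in raw.split(",") if name.strip()}
    # Preserve catalog order while filtering to the allowlist.
    return [p for p in available if p in enabled]
```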
Running
HTTP API
```bash
python -m metasearchmcp.server
# or
metasearchmcp
```
The API starts on http://localhost:8000.
MCP Server
```bash
python -m metasearchmcp.broker
# or
metasearchmcp-mcp
```
The MCP server communicates over stdio.
Docker
```bash
docker build -t metasearchmcp .
docker run --rm -p 8000:8000 --env-file .env metasearchmcp
```
Or with Compose:
```bash
docker compose up --build
```
HTTP API
POST /search
Aggregate across all enabled providers or a selected provider subset.
```bash
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "rust async runtime",
    "providers": ["duckduckgo", "wikipedia"],
    "params": {"num_results": 5, "max_total_results": 8, "language": "en"}
  }'
```
You can also narrow providers by tags:
```bash
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "transformer attention",
    "tags": ["academic", "knowledge"],
    "params": {"num_results": 5, "max_total_results": 6}
  }'
```
`num_results` controls how many results each provider can contribute. `max_total_results` caps the final merged response after deduplication.
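The same requests can be issued from Python using only the standard library. The helper names below (`build_search_payload`, `search`) are illustrative, and the URL assumes a locally running instance from the Running section:

```python
import json
from urllib import request

API_URL = "http://localhost:8000/search"  # assumes a local server

def build_search_payload(query, providers=None, tags=None,
                         num_results=5, max_total_results=None):
    """Assemble a /search body: num_results is per provider,
    max_total_results caps the merged, deduplicated response."""
    payload = {"query": query, "params": {"num_results": num_results}}
    if providers:
        payload["providers"] = providers
    if tags:
        payload["tags"] = tags
    if max_total_results is not None:
        payload["params"]["max_total_results"] = max_total_results
    return payload

def search(query, **kwargs):
    """POST the payload and decode the aggregated JSON response."""
    body = json.dumps(build_search_payload(query, **kwargs)).encode()
    req = request.Request(API_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=20) as resp:
        return json.load(resp)
```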
POST /search/google
Search Google through a configured hosted provider.
```bash
curl -X POST http://localhost:8000/search/google \
  -H "Content-Type: application/json" \
  -d '{"query": "site:github.com rust tokio"}'
```
GET /providers
Return the currently available provider catalog.
The response includes provider descriptions and a tag-to-provider index for quick discovery.
You can filter the catalog by tag:
```bash
curl "http://localhost:8000/providers?tag=academic&tag=web"
```
GET /health
Simple health check endpoint. Returns service status, version, provider count, and the current provider name list.
Response Schema
Every aggregated response includes:
`engine`, `query`, `results`, `related_searches`, `suggestions`, `answer_box`, `timing_ms`, `providers`, `errors`
Every result item includes:
`title`, `url`, `snippet`, `source`, `rank`, `provider`, `published_date`, `extra`
Example response:
```json
{
  "engine": "metasearchmcp",
  "query": "rust async runtime",
  "results": [
    {
      "title": "Tokio - An asynchronous Rust runtime",
      "url": "https://tokio.rs",
      "snippet": "Tokio is an event-driven, non-blocking I/O platform...",
      "source": "tokio.rs",
      "rank": 1,
      "provider": "duckduckgo",
      "published_date": null,
      "extra": {}
    }
  ],
  "related_searches": [],
  "suggestions": [],
  "answer_box": null,
  "timing_ms": 843.2,
  "providers": [
    {
      "name": "duckduckgo",
      "success": true,
      "result_count": 10,
      "latency_ms": 840.1,
      "error": null
    }
  ],
  "errors": []
}
```
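A minimal consumer of this schema might separate successful provider runs from failures, taking advantage of the partial-failure metadata. `summarize` is an illustrative helper, not part of the project:

```python
def summarize(response: dict) -> dict:
    """Reduce an aggregated response to what an agent usually needs:
    top result URLs plus which providers failed."""
    failed = [p["name"] for p in response.get("providers", [])
              if not p.get("success")]
    urls = [r["url"] for r in response.get("results", [])]
    return {"urls": urls,
            "failed_providers": failed,
            "errors": response.get("errors", [])}
```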
MCP Tools
MetaSearchMCP exposes these MCP tools:
`search_web`, `search_google`, `search_academic`, `search_github`, `compare_engines`
`search_web` also accepts optional `tags` so agents can limit search to categories such as `web`, `academic`, `code`, or `google`.
All search tools accept `max_total_results` to keep the final payload compact.
Example Claude Desktop config:
```json
{
  "mcpServers": {
    "MetaSearchMCP": {
      "command": "metasearchmcp-mcp",
      "env": {
        "SERPBASE_API_KEY": "your_key",
        "SERPER_API_KEY": "your_key"
      }
    }
  }
}
```
Development
```bash
pip install -e ".[dev]"
pytest
uvicorn metasearchmcp.server:app --reload
```
Architecture
The public package is organized around these modules:
- `contracts.py`: request and response models
- `catalog.py`: provider discovery and selection
- `orchestrator.py`: concurrent search execution and response assembly
- `merge.py`: URL normalization and deduplication
- `server.py`: FastAPI entrypoint
- `broker.py`: MCP entrypoint
Legacy module names are kept as compatibility shims for earlier imports.
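As a rough sketch (not the actual `orchestrator.py`), concurrent execution with provider-level timeout isolation can be expressed with `asyncio.wait_for` and `asyncio.gather`:

```python
import asyncio

async def run_provider(name, coro_fn, query, timeout):
    """Run one provider under its own timeout so a slow engine
    cannot stall the whole aggregation."""
    try:
        results = await asyncio.wait_for(coro_fn(query), timeout)
        return {"name": name, "success": True, "results": results, "error": None}
    except Exception as exc:  # timeouts and provider errors stay isolated
        return {"name": name, "success": False, "results": [], "error": str(exc)}

async def aggregate(providers, query, timeout=10):
    """Fan out to all providers concurrently; failures become
    per-provider metadata instead of aborting the response."""
    tasks = [run_provider(name, fn, query, timeout)
             for name, fn in providers.items()]
    return await asyncio.gather(*tasks)
```

Because each provider call is wrapped individually, one timed-out engine simply reports `success: false` while the rest of the response is assembled normally.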
Roadmap
- Caching and provider-aware query reuse
- Better scoring and ranking signals across providers
- Streaming aggregation responses
- Provider health telemetry
- More first-party API integrations where they improve reliability
License
MIT