Mobus

mcp
Security Audit
Fail
Health Pass
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 13 GitHub stars
Code Fail
  • exec() — Shell command execution in src/adapters/arxiv.ts
  • network request — Outbound network request in src/adapters/arxiv.ts
  • network request — Outbound network request in src/adapters/aws.ts
  • network request — Outbound network request in src/adapters/eurostat.ts
  • process.env — Environment variable access in src/adapters/google.ts
  • process.env — Environment variable access in src/adapters/huggingface.ts
  • process.env — Environment variable access in src/adapters/kaggle.ts
Permissions Pass
  • Permissions — No dangerous permissions requested
Purpose
This MCP server is a unified search connector that lets AI assistants find, preview, and analyze datasets across 20 external platforms. It exposes those searches as tools inside MCP clients such as Claude.

Security Assessment
Overall Risk: Medium. The tool makes outbound network requests to external data sources (such as arXiv, AWS, and Eurostat), which is expected for its purpose. It also reads environment variables to load API keys for platforms like Google, Hugging Face, and Kaggle; this is standard, but you need to store those keys carefully. The one failed check is shell command execution in the arXiv adapter (src/adapters/arxiv.ts). Shell execution can lead to command injection, depending on how the command string is built, so review that file manually before deploying. No hardcoded secrets or dangerous permissions were detected.
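
When reviewing the flagged file, the main question is whether any user-controlled text can reach a shell. The sketch below is not taken from src/adapters/arxiv.ts, whose contents are not shown in this report. It only illustrates a common Node.js pattern: passing arguments as an array to execFile, which does not spawn a shell by default, instead of interpolating them into an exec() command string. The helper name runTool and the 10-second timeout are assumptions made for illustration.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper, not part of Mobus.
// Arguments are passed as an array and execFile does not spawn a shell by
// default, so characters such as `;`, `|`, or `$(...)` inside an argument
// are delivered to the program as literal text, not parsed as shell syntax.
export async function runTool(command: string, args: string[]): Promise<string> {
  const { stdout } = await execFileAsync(command, args, {
    encoding: "utf8",
    timeout: 10_000, // assumed limit so a hung child cannot block the server
  });
  return stdout.trim();
}
```

If the adapter builds a single command string for exec() from a search query or dataset ID, that is the part to inspect most closely.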

Quality Assessment
The project has a solid foundation: it is written in TypeScript, uses the permissive MIT license, and is actively maintained (with pushes as recent as today). Community trust is currently minimal given its low star count (13), but the provided documentation is highly detailed, professional, and clear.

Verdict
Use with caution — it is an actively maintained and permissively licensed tool, but you should inspect the ArXiv adapter's shell execution implementation to ensure it meets your safety standards before integrating.
SUMMARY

Search, preview, and analyze datasets from 20+ platforms via a single MCP connector. Works with Claude instantly.

README.md

Mobus

Dataset search for AI assistants
Discover, preview, and analyze datasets across 20 platforms from a single conversation.

License: MIT


Connect to Claude

Add Mobus to Claude in under a minute. No install, no API keys, nothing to run.

  1. Open claude.ai (or Claude Desktop / Mobile)
  2. Go to Settings (bottom left) → Connectors
  3. Click Add custom connector
  4. Name it Mobus and paste this URL:
https://mobus-production.up.railway.app/mcp
  5. Start a new chat and try:

"Search for air quality datasets with a commercial license"

That's it. All 15 tools are available immediately.



Mobus workflow: 5 stages, 15 tools

What it does

Just ask your AI assistant.

"Search for air-quality datasets with a commercial license"
"Preview the first 20 rows of that Zenodo dataset"
"Find SEC filings mentioning climate risk"
"Generate an APA citation for that Hugging Face dataset"
"Check if this dataset can be used commercially"
"Visualize that dataset"

Mobus fans requests out to every configured platform in parallel, checks licenses, previews data, generates citations, and traces academic lineage — failing gracefully whenever an API key is missing.
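
The parallel fan-out and key-skipping behavior described above can be sketched as follows. This is a minimal illustration, assuming each platform sits behind an adapter object; the Adapter and FanOutResult shapes and the fanOut function are hypothetical names, not the actual Mobus interfaces.

```typescript
// Hypothetical adapter shape; the real Mobus interfaces may differ.
interface Adapter {
  name: string;
  requiredEnv: string[]; // env vars the adapter cannot run without
  search(query: string): Promise<string[]>;
}

interface FanOutResult {
  results: Record<string, string[]>;
  skipped: string[]; // adapters whose required keys are missing
  failed: string[]; // adapters whose request rejected
}

export async function fanOut(
  adapters: Adapter[],
  query: string,
  env: Record<string, string | undefined>,
): Promise<FanOutResult> {
  const skipped: string[] = [];
  const active = adapters.filter((adapter) => {
    const hasKeys = adapter.requiredEnv.every((key) => Boolean(env[key]));
    if (!hasKeys) skipped.push(adapter.name);
    return hasKeys;
  });

  // allSettled waits for every adapter and never rejects as a whole,
  // so one failing platform cannot take down the others.
  const settled = await Promise.allSettled(
    active.map(async (adapter) => adapter.search(query)),
  );

  const results: Record<string, string[]> = {};
  const failed: string[] = [];
  active.forEach((adapter, i) => {
    const outcome = settled[i];
    if (outcome && outcome.status === "fulfilled") {
      results[adapter.name] = outcome.value;
    } else {
      failed.push(adapter.name);
    }
  });
  return { results, skipped, failed };
}
```

Passing the environment in as a parameter, for example fanOut(adapters, "air quality", process.env), keeps the function easy to test without touching real keys.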


Tools

Discovery

  • search_datasets — search all 20 platforms at once
  • find_research_datasets — datasets used in papers
  • find_similar — datasets similar to one you have

Evaluation

  • get_dataset_details — full metadata
  • preview_dataset — first N rows
  • compare_datasets — 2-5 side by side

Quality & Compliance

  • assess_quality — missing values, duplicates, stats
  • check_license — commercial / academic / internal
  • check_compatibility — schema match against yours

Citation & Output

  • generate_citation — APA, BibTeX, Chicago
  • visualize_dataset — interactive ECharts dashboard (only works locally; otherwise ask Claude to generate an artifact)
  • watch_query — monitor for new datasets

Advanced Research

  • get_dataset_provenance — introducing paper & history
  • get_dataset_lineage — variants & derivatives
  • trace_citation_graph — citation chain analysis

Supported platforms

No auth needed

  • data.gov
  • Zenodo
  • OpenML
  • UCI ML Repository
  • AWS Open Data
  • World Bank
  • WHO GHO
  • NASA Earthdata
  • Eurostat
  • arXiv
  • Census.gov
  • SEC EDGAR
  • Crossref

Optional auth

  • Hugging Face (faster w/ token)
  • Socrata (faster w/ token)
  • Semantic Scholar

Requires key

  • Kaggle
  • Google Dataset Search

Degraded

  • Papers with Code (API shut down)
  • Econdb (now requires key)

A platform whose key is missing is skipped automatically. The server never crashes.


Run locally (optional)

If you prefer to self-host instead of using the hosted version above:

git clone https://github.com/hrantvirabyan/Mobus.git
cd Mobus
npm install
cp .env.example .env   # fill in any keys you have (all optional)
npm run build

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "mobus": {
      "command": "node",
      "args": ["/absolute/path/to/Mobus/dist/main.js"]
    }
  }
}

Restart Cursor. All 15 tools appear in the chat.

Claude Desktop

Same config format, in ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows).

API keys (all optional)

Variable | For | Where
KAGGLE_USERNAME / KAGGLE_KEY | Kaggle | kaggle.com/account → API
HF_TOKEN | Hugging Face | huggingface.co/settings/tokens
GOOGLE_API_KEY / GOOGLE_CSE_ID | Google | console.cloud.google.com
SOCRATA_APP_TOKEN | Socrata | dev.socrata.com/register

Tool reference


search_datasets

Parameter | Type | Default | Description
query | string | required | Search query
sources | string[] | all | Platforms to include
limit | number | 5 | Results per source (max 20)
license | string | | e.g. cc-by-4.0
format | string | | e.g. csv, parquet
updated_after | string | | ISO date
modality | string | | e.g. tabular, image
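
As a rough illustration of how a client might apply the defaults and limits in the table above before calling search_datasets, consider the sketch below. The SearchDatasetsParams interface and normalizeSearchParams function are hypothetical names. Clamping out-of-range values is also an assumption: the server may reject them instead.

```typescript
// Field names follow the parameter table; the interface itself is hypothetical.
interface SearchDatasetsParams {
  query: string;
  sources?: string[];
  limit?: number;
  license?: string;
  format?: string;
  updated_after?: string;
  modality?: string;
}

const DEFAULT_LIMIT = 5; // documented default
const MAX_LIMIT = 20; // documented maximum per source

export function normalizeSearchParams(
  input: SearchDatasetsParams,
): SearchDatasetsParams & { limit: number } {
  const query = input.query.trim();
  if (query === "") throw new Error("query is required");

  // Fall back to the default when limit is absent or not a finite number.
  const requested =
    typeof input.limit === "number" && Number.isFinite(input.limit)
      ? Math.floor(input.limit)
      : DEFAULT_LIMIT;

  // Assumption: clamp into [1, MAX_LIMIT] on the client side.
  const limit = Math.min(MAX_LIMIT, Math.max(1, requested));
  return { ...input, query, limit };
}
```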

get_dataset_details

Parameter | Type | Description
source | string | Platform
dataset_id | string | Dataset ID

preview_dataset

Parameter | Type | Default | Description
source / dataset_id | string | required |
rows | number | 10 | Max 100

visualize_dataset (Only works locally - ask Claude to generate an artifact)

Generates an interactive ECharts dashboard with column picker, filter builder, row range selector, 9 chart types, sortable table, and PNG/SVG/CSV/JSON export.

Parameter | Type | Default | Description
source / dataset_id | string | required |
rows | number | 200 | Max 500
open | boolean | true | Auto-open browser

compare_datasets

Parameter | Type | Description
datasets | array | 2-5 {source, dataset_id} objects

check_compatibility

Parameter | Type | Description
source / dataset_id | string |
schema | array | [{name, type?}]
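
To make the schema parameter concrete, here is a minimal sketch of the kind of comparison a schema match could involve. It is not the Mobus implementation. The compareSchemas function, the CompatibilityReport shape, and the case-insensitive matching of names and types are all assumptions made for illustration.

```typescript
// Mirrors the documented [{name, type?}] shape.
interface SchemaField {
  name: string;
  type?: string;
}

// Hypothetical report shape.
interface CompatibilityReport {
  missing: string[]; // expected columns absent from the dataset
  typeMismatches: string[]; // columns present but with a different type
  compatible: boolean;
}

export function compareSchemas(
  expected: SchemaField[],
  actual: SchemaField[],
): CompatibilityReport {
  const actualByName = new Map<string, SchemaField>();
  actual.forEach((field) => actualByName.set(field.name.toLowerCase(), field));

  const missing: string[] = [];
  const typeMismatches: string[] = [];
  expected.forEach((want) => {
    const found = actualByName.get(want.name.toLowerCase());
    if (!found) {
      missing.push(want.name);
    } else if (
      // A type is only compared when both sides declare one.
      want.type !== undefined &&
      found.type !== undefined &&
      want.type.toLowerCase() !== found.type.toLowerCase()
    ) {
      typeMismatches.push(want.name);
    }
  });

  return {
    missing,
    typeMismatches,
    compatible: missing.length === 0 && typeMismatches.length === 0,
  };
}
```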

find_similar

Parameter | Type | Default | Description
source / dataset_id | string | required |
limit | number | 5 | Max 20

generate_citation

Parameter | Type | Default | Description
source / dataset_id | string | required |
format | string | apa | bibtex, apa, chicago

assess_quality

Parameter | Type | Default | Description
source / dataset_id | string | required |
sample_rows | number | 100 | Max 500

check_license

Parameter | Type | Description
source / dataset_id | string |
use_case | string | commercial / academic / internal / redistribution

watch_query

Parameter | Type | Description
action | string | add / remove / list / check
query / sources / watch_id | | See action

find_research_datasets

Parameter | Type | Default | Description
query | string | required | Research topic
limit | number | 10 | Max 20
semantic | boolean | false | SPECTER v2 embeddings

get_dataset_provenance / trace_citation_graph / get_dataset_lineage

Currently degraded — depend on the Papers with Code API which has shut down.


Known issues

  • Papers with Code API shut down post-HuggingFace acquisition — lineage/provenance/citation-graph tools return errors
  • Econdb now requires a key — returns empty until support is added
  • arXiv rate-limits under heavy parallel load (adapter uses 3s throttle)
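
For readers unfamiliar with the throttling mentioned in the last item, the sketch below shows one way to space requests a fixed interval apart, even when they are issued in parallel. It is not the adapter's actual code. The Throttle class and throttled helper are hypothetical, and the 3000 ms value simply mirrors the 3s figure above.

```typescript
// Hypothetical helper: hands out time slots spaced intervalMs apart.
export class Throttle {
  private nextSlot = 0;
  private readonly intervalMs: number;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Resolves when the caller may proceed. The slot is reserved synchronously,
  // so concurrent callers queue up behind each other instead of firing at once.
  async wait(): Promise<void> {
    const now = Date.now();
    const slot = Math.max(now, this.nextSlot);
    this.nextSlot = slot + this.intervalMs;
    const delay = slot - now;
    if (delay > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Assumed usage: one shared throttle for every arXiv request.
const arxivThrottle = new Throttle(3000);

export async function throttled<T>(
  request: () => Promise<T>,
  throttle: Throttle = arxivThrottle,
): Promise<T> {
  await throttle.wait();
  return request();
}
```

A throttle like this trades latency for reliability: with several parallel searches, each additional arXiv call waits one more interval, which may explain slower responses under heavy load.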

Contributing

See CONTRIBUTING.md. If Mobus saves you time, a GitHub star helps others find it.

License

MIT — see LICENSE.
