Semantic Scholar MCP Server

A 14-tool MCP server for Semantic Scholar academic research workflows. It gives any Model Context Protocol client (e.g., Claude Desktop, Claude Code, Cursor, Cline, Continue, and others) direct access to 200M+ papers from Semantic Scholar: paper search, citation graph traversal, author profiles, and recommendations.

Installation

Option 1: One-Line Install (Recommended)

# No cloning needed — runs directly from PyPI
uvx s2-mcp-server

Option 2: Claude Code

claude mcp add semantic-scholar -- uvx s2-mcp-server

Option 3: Claude Desktop (Windows)

Add to %APPDATA%\Claude\claude_desktop_config.json:

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "uvx",
      "args": ["s2-mcp-server"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
      }
    }
  }
}

Option 4: Claude Desktop (macOS)

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "uvx",
      "args": ["s2-mcp-server"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
      }
    }
  }
}

Option 5: pip / From Source

pip install s2-mcp-server
# or
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp && pip install -e .

Option 6: Docker

docker pull ghcr.io/smaniches/semantic-scholar-mcp:latest
# -i keeps stdin open, which the stdio transport needs
docker run -i --rm -e SEMANTIC_SCHOLAR_API_KEY=your-key ghcr.io/smaniches/semantic-scholar-mcp:latest

Note: Get a free API key at semanticscholar.org/product/api. Without a key, you get rate-limited public access (1 req/sec).

Architecture

flowchart LR
  Client["MCP client<br/>(Claude Desktop, Claude Code,<br/>Cursor, Cline, Continue, …)"]
  subgraph Server ["s2-mcp-server (this package)"]
    direction TB
    FastMCP["FastMCP runtime<br/>(stdio transport, lifespan)"]
    Tools["14 @mcp.tool functions<br/>(server.py)"]
    Models["Pydantic input models<br/>+ field sets (models.py)"]
    Validators["Paper-ID validator<br/>(validators.py)"]
    Cache["TTL cache<br/>(cache.py)"]
    Fmt["Markdown formatters<br/>(formatters.py)"]
    HTTP["httpx client<br/>+ rate limit + retry/backoff<br/>(client.py)"]
    Errors["Typed exceptions<br/>(errors.py)"]
    Log["Structured JSON logger<br/>(logging_config.py)"]
  end
  S2Graph["Semantic Scholar<br/>Graph API"]
  S2Recs["Semantic Scholar<br/>Recommendations API"]

  Client <-- "stdio (JSON-RPC)" --> FastMCP
  FastMCP --> Tools
  Tools --> Models
  Tools --> Validators
  Tools --> Cache
  Tools --> HTTP
  Tools --> Fmt
  HTTP --> Errors
  HTTP --> Log
  HTTP -- "GET / POST<br/>x-api-key" --> S2Graph
  HTTP -- "GET / POST<br/>x-api-key" --> S2Recs

Module responsibilities (src/semantic_scholar_mcp/):

Module	Responsibility
`server.py`	FastMCP instance, 14 `@mcp.tool` registrations, lifespan, `main()` entry. Re-exports the helper surface for back-compat.
`client.py`	Shared `httpx.AsyncClient` singleton, per-tier rate limiter (1 req/s public, 10 req/s keyed), retry loop with exponential backoff + jitter on 429/503/timeout, HTTP→typed-exception mapping.
`models.py`	Pydantic input models per tool, `ResponseFormat` enum, the tiered field-set constants (`PAPER_SEARCH_FIELDS`, `…_LITE`, `PAPER_BULK_SEARCH_FIELDS`, `PAPER_DETAIL_FIELDS`, `AUTHOR_FIELDS`).
`validators.py`	Pre-flight paper-ID validation. Rejects NUL bytes, `?`, `#`, path traversal; accepts the seven canonical ID formats.
`cache.py`	In-memory TTL cache (5 min, 200 entries, oldest-first eviction) for paper/author lookups within a session.
`formatters.py`	Markdown renderers for paper and author dicts, tuned for chat-surface readability.
`errors.py`	`SemanticScholarError` hierarchy: `AuthenticationError`, `RateLimitError`, `NotFoundError`, `ValidationError`, `ServerError`.
`logging_config.py`	One-JSON-per-line `StructuredFormatter` on stderr; safe to ship through any log aggregator.
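
To make the `cache.py` row more concrete, here is a minimal sketch of an in-memory TTL cache with oldest-first eviction. Only the 5-minute TTL and the 200-entry cap come from the table above; the class name `TTLCache`, its methods, and the injectable clock are hypothetical and may not match the package's actual code.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical sketch: TTL cache with oldest-first eviction."""

    def __init__(self, ttl_seconds: float = 300.0, max_entries: int = 200,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        # Insertion order doubles as age order, so the first key is the oldest.
        self._store: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Expired: drop the entry and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        # Re-inserting moves the key to the "newest" end.
        self._store.pop(key, None)
        self._store[key] = (self._clock(), value)
        while len(self._store) > self._max:
            self._store.popitem(last=False)  # evict the oldest entry
```

Passing the clock in as a parameter is a choice made for this sketch so that expiry can be exercised in tests without sleeping.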

Design choices worth knowing

Single httpx.AsyncClient per process. Created lazily, closed in the FastMCP lifespan teardown. Amortizes connection setup; respects keep-alive limits.
Rate limit is enforced at the client, not the API. A semaphore + last-request timestamp ensures we never exceed the per-tier interval even when the MCP host issues tool calls in parallel.
Retry is bounded and jittered. Up to MAX_RETRIES = 3, base 1 s, capped at 30 s. Honors Retry-After when present.
Errors are typed. Status codes map onto a small exception hierarchy so callers can branch on AuthenticationError vs RateLimitError vs NotFoundError instead of parsing strings.
Input validation is pre-flight. Paper IDs are checked before any outbound request; bad IDs never hit the wire.
Version is single-source. __version__ is derived from importlib.metadata.version("s2-mcp-server"), so bumping pyproject.toml is sufficient; release-please bumps the manifest, server.json (×2 paths), CITATION.cff, and .zenodo.json in lockstep on every release.
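
The typed-error design choice above could look roughly like the following sketch. The exception names mirror the hierarchy listed for `errors.py`; the status-code table and the helper `error_for_status` are assumptions made for illustration, and the package's real mapping may differ.

```python
class SemanticScholarError(Exception):
    """Base class; names mirror the hierarchy listed for errors.py."""


class AuthenticationError(SemanticScholarError): ...
class RateLimitError(SemanticScholarError): ...
class NotFoundError(SemanticScholarError): ...
class ValidationError(SemanticScholarError): ...
class ServerError(SemanticScholarError): ...


# Assumed mapping; the package's real table may differ.
_STATUS_TO_ERROR = {
    400: ValidationError,
    401: AuthenticationError,
    403: AuthenticationError,
    404: NotFoundError,
    429: RateLimitError,
}


def error_for_status(status_code: int, message: str = "") -> SemanticScholarError:
    """Return a typed exception for an HTTP error status (hypothetical helper)."""
    if status_code >= 500:
        return ServerError(message or f"upstream error {status_code}")
    cls = _STATUS_TO_ERROR.get(status_code, SemanticScholarError)
    return cls(message or f"HTTP {status_code}")
```

With a mapping like this, a caller can write `except RateLimitError:` to back off and `except NotFoundError:` to report a missing paper, without inspecting message strings.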

Configuration

API Key Options

You can provide your API key in two ways:

Environment Variable (recommended for persistent use):

export SEMANTIC_SCHOLAR_API_KEY="your-api-key-here"

Per-Request Parameter (overrides env var):
```
{
  "api_key": "your-api-key-here"
}
```
Caution: per-request api_key values are part of the tool-call
arguments and may be visible in MCP transcripts, client logs, and the
LLM's tool-call history depending on the client. For production use,
prefer the SEMANTIC_SCHOLAR_API_KEY environment variable. Removal of
the per-request parameter is planned for a follow-up release; see
.github/SECURITY.md for the tracked list.

Get a free API key at: https://www.semanticscholar.org/product/api
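
The precedence described above (a per-request `api_key` overrides the environment variable) can be sketched as follows. The helper names `resolve_api_key` and `auth_headers` are hypothetical; only the `SEMANTIC_SCHOLAR_API_KEY` variable and the `x-api-key` header come from this README.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(per_request: Optional[str] = None,
                    environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: per-request key wins, then the env var, else None."""
    if per_request and per_request.strip():
        return per_request.strip()
    env_key = environ.get("SEMANTIC_SCHOLAR_API_KEY", "").strip()
    return env_key or None


def auth_headers(api_key: Optional[str]) -> dict:
    """Only send x-api-key when a key is available (no key = public tier)."""
    return {"x-api-key": api_key} if api_key else {}
```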

Claude Desktop Setup

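Add to your Claude Desktop config file. This variant launches the server with python -m, so it assumes the package is already installed (for example via pip) in the Python environment that Claude Desktop will use: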

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "python",
      "args": ["-m", "semantic_scholar_mcp"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-api-key-here"
      }
    }
  }
}

Then restart Claude Desktop.
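
If you would rather script the edit than change the JSON by hand, the following is a minimal sketch that merges the entry shown above into an existing config without touching other servers. The function names `default_config_path` and `add_server` are hypothetical; the paths and the entry itself are the ones listed in this section. Note that, like the manual edit, this writes the key into the config file.

```python
import json
import os
import sys
from pathlib import Path


def default_config_path() -> Path:
    """Config locations as listed above for each platform."""
    if sys.platform.startswith("win"):
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    if sys.platform == "darwin":
        return Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
    return Path.home() / ".config/Claude/claude_desktop_config.json"


def add_server(config_path: Path, api_key: str) -> dict:
    """Merge the semantic-scholar entry into the config, keeping other servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["semantic-scholar"] = {
        "command": "python",
        "args": ["-m", "semantic_scholar_mcp"],
        "env": {"SEMANTIC_SCHOLAR_API_KEY": api_key},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

A typical call would be `add_server(default_config_path(), "your-api-key-here")`, followed by restarting Claude Desktop.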

Supported ID Formats

The server accepts the following paper identifier formats:

Format	Pattern	Example
Semantic Scholar ID	40-character hex	`649def34f8be52c8b66281af98ae884c09aef38b`
DOI	`DOI:xxx`	`DOI:10.1038/s41586-021-03819-2`
ArXiv	`ARXIV:xxx`	`ARXIV:2106.15928` or `ARXIV:2106.15928v2`
PubMed	`PMID:xxx`	`PMID:32908142`
Corpus ID	`CorpusId:xxx`	`CorpusId:215416146`
ACL	`ACL:xxx`	`ACL:P19-1285`
URL	`URL:xxx`	`URL:https://arxiv.org/abs/2106.15928`
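
A pre-flight check over these formats might look like the sketch below. The regular expressions are assumptions inferred from the examples in the table, and `validate_paper_id` is a hypothetical name; the package's own validator may be stricter or looser.

```python
import re

# Assumed patterns, inferred from the table's examples.
_ID_PATTERNS = [
    re.compile(r"[0-9a-f]{40}"),                   # Semantic Scholar ID
    re.compile(r"DOI:10\.\d{4,9}/\S+"),            # DOI
    re.compile(r"ARXIV:\d{4}\.\d{4,5}(v\d+)?"),    # arXiv (new-style IDs only)
    re.compile(r"PMID:\d+"),                       # PubMed
    re.compile(r"CorpusId:\d+"),                   # Corpus ID
    re.compile(r"ACL:[A-Za-z0-9.\-]+"),            # ACL Anthology
    re.compile(r"URL:https?://\S+"),               # URL
]


def validate_paper_id(paper_id: str) -> str:
    """Return the trimmed ID or raise ValueError before any request is made."""
    candidate = paper_id.strip()
    if not candidate:
        raise ValueError("paper ID is empty")
    # NUL bytes, query/fragment markers, and path traversal are rejected outright.
    if any(ch in candidate for ch in ("\x00", "?", "#")) or ".." in candidate:
        raise ValueError("paper ID contains a forbidden character sequence")
    if not any(p.fullmatch(candidate) for p in _ID_PATTERNS):
        raise ValueError(f"unrecognized paper ID format: {candidate!r}")
    return candidate
```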

Tools Reference

1. `semantic_scholar_search_papers`

Search for academic papers with advanced filters.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Search query (supports AND, OR, NOT operators and "phrase search")
`year`	string	No	Year filter: `"2024"`, `"2020-2024"`, or `"2020-"`
`fields_of_study`	string[]	No	Filter by fields: `["Computer Science", "Biology"]`
`publication_types`	string[]	No	Filter by type: `["Review", "JournalArticle"]`
`open_access_only`	boolean	No	Only return open access papers (default: false)
`min_citation_count`	integer	No	Minimum citation count
`limit`	integer	No	Max results 1-100 (default: 10)
`offset`	integer	No	Pagination offset (default: 0)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

Example:

Search for "transformer attention mechanism" papers from 2023 with at least 100 citations

JSON Example:

{
  "query": "transformer attention mechanism",
  "year": "2023",
  "min_citation_count": 100,
  "fields_of_study": ["Computer Science"],
  "limit": 20
}
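
For readers curious how arguments like these could translate into an upstream request, here is a hedged sketch. The camelCase names are the query parameters of the Semantic Scholar Graph API paper search endpoint as far as I know them; the function `build_search_params` is hypothetical, and the server's actual translation may differ.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_params(query: str, year: Optional[str] = None,
                        fields_of_study: Optional[List[str]] = None,
                        publication_types: Optional[List[str]] = None,
                        open_access_only: bool = False,
                        min_citation_count: Optional[int] = None,
                        limit: int = 10, offset: int = 0) -> Dict[str, Any]:
    """Hypothetical translation of tool arguments to upstream query params."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params: Dict[str, Any] = {"query": query, "limit": limit, "offset": offset}
    if year:
        params["year"] = year
    if fields_of_study:
        params["fieldsOfStudy"] = ",".join(fields_of_study)
    if publication_types:
        params["publicationTypes"] = ",".join(publication_types)
    if min_citation_count is not None:
        params["minCitationCount"] = min_citation_count
    if open_access_only:
        params["openAccessPdf"] = ""  # flag-style parameter, sent without a value
    return params


params = build_search_params(
    "transformer attention mechanism", year="2023",
    min_citation_count=100, fields_of_study=["Computer Science"], limit=20,
)
print(f"{SEARCH_URL}?{urlencode(params)}")
```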

2. `semantic_scholar_get_paper`

Get detailed information about a specific paper.

Parameters:

Parameter	Type	Required	Description
`paper_id`	string	Yes	Paper ID in any supported format
`include_citations`	boolean	No	Include citing papers (default: false)
`include_references`	boolean	No	Include referenced papers (default: false)
`citations_limit`	integer	No	Max citations to return 1-100 (default: 10)
`references_limit`	integer	No	Max references to return 1-100 (default: 10)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

Example:

Get details for DOI:10.1038/s41586-021-03819-2 including its top 20 citations

JSON Example:

{
  "paper_id": "DOI:10.1038/s41586-021-03819-2",
  "include_citations": true,
  "citations_limit": 20
}

3. `semantic_scholar_search_authors`

Search for academic authors by name.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Author name to search
`limit`	integer	No	Max results 1-100 (default: 10)
`offset`	integer	No	Pagination offset (default: 0)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

Example:

Find author "Yoshua Bengio"

JSON Example:

{
  "query": "Yoshua Bengio",
  "limit": 5
}

4. `semantic_scholar_get_author`

Get author profile with publications.

Parameters:

Parameter	Type	Required	Description
`author_id`	string	Yes	Semantic Scholar author ID
`include_papers`	boolean	No	Include publications (default: true)
`papers_limit`	integer	No	Max papers to return 1-100 (default: 20)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

Example:

Get author profile for author ID 1741101 with their top 50 publications

JSON Example:

{
  "author_id": "1741101",
  "include_papers": true,
  "papers_limit": 50
}

5. `semantic_scholar_recommendations`

Get AI-powered paper recommendations based on a seed paper.

Parameters:

Parameter	Type	Required	Description
`paper_id`	string	Yes	Seed paper ID in any supported format
`limit`	integer	No	Max recommendations 1-100 (default: 10)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

Example:

Get 15 recommendations based on paper ARXIV:1706.03762

JSON Example:

{
  "paper_id": "ARXIV:1706.03762",
  "limit": 15
}

6. `semantic_scholar_bulk_papers`

Retrieve multiple papers in a single request (max 500).

Parameters:

Parameter	Type	Required	Description
`paper_ids`	string[]	Yes	List of paper IDs (max 500)
`response_format`	string	No	`"markdown"` or `"json"` (default: json)
`api_key`	string	No	Override environment API key

Example:

Retrieve these papers: DOI:10.1038/nature12373, ARXIV:2106.15928, PMID:32908142

JSON Example:

{
  "paper_ids": [
    "DOI:10.1038/nature12373",
    "ARXIV:2106.15928",
    "PMID:32908142"
  ]
}

7. `semantic_scholar_bulk_search`

Search papers with sorting and cursor-based pagination for large result sets.
Unlike search_papers, supports a sort order and returns a token for
paging through all results.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Search query
`sort`	string	No	Sort order, e.g. `"citationCount:desc"`, `"publicationDate:asc"`
`token`	string	No	Continuation token from a previous bulk_search response
`year`	string	No	Year filter: `"2024"`, `"2020-2024"`, `"2020-"`
`fields_of_study`	string[]	No	Filter by fields: `["Computer Science"]`
`publication_types`	string[]	No	Filter by type: `["Review", "JournalArticle"]`
`min_citation_count`	integer	No	Minimum citation count
`limit`	integer	No	Max results per page 1-1000 (default: 100)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "query": "graph neural networks",
  "sort": "citationCount:desc",
  "year": "2020-2024",
  "limit": 100
}

Returns: total result count, the page of papers, and a token for the
next page (when more results exist).
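
Draining a cursor-paginated result set usually comes down to a loop like the one sketched here. `fetch_page` is a stand-in for however your client invokes the tool, and the response keys `data` and `token` are assumptions about the JSON shape; adjust them to what the tool actually returns.

```python
from typing import Callable, Dict, Iterator, Optional

Page = Dict[str, object]


def iter_bulk_results(fetch_page: Callable[[Optional[str]], Page],
                      max_pages: int = 50) -> Iterator[dict]:
    """Yield papers page by page until no continuation token is returned."""
    token: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a bad token cannot loop forever
        page = fetch_page(token)
        yield from page.get("data") or []
        token = page.get("token")
        if not token:
            break


# Fake two-page backend standing in for real bulk_search calls.
_PAGES = {
    None: {"total": 3, "data": [{"paperId": "a"}, {"paperId": "b"}], "token": "t1"},
    "t1": {"total": 3, "data": [{"paperId": "c"}], "token": None},
}
papers = list(iter_bulk_results(lambda tok: _PAGES[tok]))
```

The `max_pages` cap is a safety choice for this sketch; with the rate limits described in this README, an uncapped loop over a very broad query could run for a long time.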

8. `semantic_scholar_export_citation`

Export a citation for a paper in BibTeX format.

Parameters:

Parameter	Type	Required	Description
`paper_id`	string	Yes	Paper ID in any supported format
`format`	string	No	Citation format (currently only `"bibtex"`)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "paper_id": "DOI:10.1038/s41586-021-03819-2",
  "format": "bibtex"
}

Returns: the BibTeX string for the requested paper.

9. `semantic_scholar_match_paper`

Find the single best paper matching a title string. Returns a numeric
matchScore alongside the matched paper.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Paper title to match (1-500 chars)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "query": "Attention Is All You Need"
}

Returns: the best-matching paper plus its matchScore, or "No matching
paper found." if no match.

10. `semantic_scholar_paper_authors`

Get full author profiles for a paper's authors (richer than the abbreviated
author list returned by get_paper).

Parameters:

Parameter	Type	Required	Description
`paper_id`	string	Yes	Paper ID in any supported format
`limit`	integer	No	Max authors to return 1-1000 (default: 100)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "paper_id": "ARXIV:1706.03762",
  "limit": 25
}

Returns: the list of full author records for the paper.

11. `semantic_scholar_author_batch`

Retrieve multiple authors in a single request (max 1000).

Parameters:

Parameter	Type	Required	Description
`author_ids`	string[]	Yes	List of author IDs (1-1000)
`response_format`	string	No	`"markdown"` or `"json"` (default: json)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "author_ids": ["1741101", "40348417", "144749327"]
}

Returns: counts of requested / retrieved, the retrieved author
records, and a not_found list of IDs the API did not return.
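
One way such a summary could be assembled is sketched below. It assumes the upstream batch endpoint returns a list aligned with the request, with a null entry for each unknown ID. Apart from `not_found`, the key names and the function `summarize_batch` are hypothetical.

```python
from typing import Dict, List, Optional


def summarize_batch(author_ids: List[str],
                    records: List[Optional[dict]]) -> Dict[str, object]:
    """Pair requested IDs with returned records; None marks a miss (assumed)."""
    found = [r for r in records if r is not None]
    not_found = [aid for aid, r in zip(author_ids, records) if r is None]
    return {
        "requested": len(author_ids),
        "retrieved": len(found),
        "authors": found,
        "not_found": not_found,
    }


summary = summarize_batch(
    ["1741101", "40348417", "0"],
    [{"authorId": "1741101"}, {"authorId": "40348417"}, None],
)
```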

12. `semantic_scholar_multi_recommend`

Get recommendations using multiple positive (and optional negative) example
papers.

Parameters:

Parameter	Type	Required	Description
`positive_paper_ids`	string[]	Yes	Papers to find similar results for (1-100)
`negative_paper_ids`	string[]	No	Papers the recommendations should differ from (0-100)
`limit`	integer	No	Max recommendations 1-500 (default: 10)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "positive_paper_ids": ["ARXIV:1706.03762", "ARXIV:1810.04805"],
  "negative_paper_ids": ["DOI:10.1038/nature14539"],
  "limit": 20
}

Returns: the recommended papers plus an echo of the positive/negative
seeds used.

13. `semantic_scholar_snippet_search`

Search within paper full text and return text snippets with surrounding
context. Heavily rate-limited without an API key.

Parameters:

Parameter	Type	Required	Description
`query`	string	Yes	Search query for paper text (1-500 chars)
`paper_ids`	string[]	No	Limit search to specific papers (max 100)
`year`	string	No	Year filter: `"2024"`, `"2020-2024"`, `"2020-"`
`fields_of_study`	string[]	No	Filter by fields: `["Computer Science"]`
`min_citation_count`	integer	No	Minimum citation count
`limit`	integer	No	Max results 1-100 (default: 10)
`response_format`	string	No	`"markdown"` or `"json"` (default: markdown)
`api_key`	string	No	Override environment API key

JSON Example:

{
  "query": "scaling laws for language models",
  "year": "2022-2024",
  "limit": 20
}

Returns: matching snippets, each with the source paper title, section,
and a short text excerpt.

14. `semantic_scholar_status`

Check server health and API connectivity status.

Parameters: None

Example:

Check Semantic Scholar API status

Response:

{
  "server": "semantic-scholar-mcp",
  "version": "1.3.1",
  "api_key_configured": true,
  "timestamp": "2026-04-06T12:00:00.000000+00:00",
  "api_reachable": true
}

Rate Limits

Tier	Requests/Second	How to Get
No API Key	1 req/sec	Default
API Key	10 req/sec	Sign up (free)
Academic Partner	10-100 req/sec	Apply via S2

Note: The client-side rate limiter enforces the intervals above. The upstream Semantic Scholar API may impose stricter limits during high-traffic periods.
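
A client-side limiter of this kind can be sketched with a lock and a last-request timestamp. This is an illustration only: the class name `MinIntervalLimiter` is hypothetical, and it uses an `asyncio.Lock` where the package is described as using a semaphore.

```python
import asyncio
import time


class MinIntervalLimiter:
    """Hypothetical sketch: serialize callers and space them by a fixed interval."""

    def __init__(self, requests_per_second: float) -> None:
        self._interval = 1.0 / requests_per_second
        self._lock = asyncio.Lock()
        self._last = float("-inf")  # monotonic timestamp of the previous request

    async def wait(self) -> None:
        async with self._lock:  # one caller at a time, even under parallel tool calls
            delay = self._last + self._interval - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)
            self._last = time.monotonic()


async def demo() -> list:
    limiter = MinIntervalLimiter(requests_per_second=50)  # fast rate, demo only
    stamps = []

    async def call() -> None:
        await limiter.wait()
        stamps.append(time.monotonic())

    # Three "parallel" calls still leave the limiter one interval apart.
    await asyncio.gather(*(call() for _ in range(3)))
    return stamps


stamps = asyncio.run(demo())
```

For the tiers in the table, the limiter would be built with `requests_per_second=1` without a key and `10` with one.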

The server automatically handles rate limiting with:

Request serialization to enforce minimum intervals
Exponential backoff retry for 429 (rate limit) and 503 (service unavailable) errors
Maximum 3 retries with jitter
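
The retry behaviour listed above can be sketched as follows. The constants (3 retries, 1 s base, 30 s cap) come from the design notes earlier in this README; the function names, the "full jitter" strategy, and the `send` stand-in are assumptions, and timeout handling is left out for brevity.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional, Tuple

MAX_RETRIES = 3
BASE_DELAY = 1.0   # seconds
MAX_DELAY = 30.0   # seconds
RETRYABLE = {429, 503}


def backoff_delay(attempt: int, retry_after: Optional[float] = None) -> float:
    """Delay before retry number `attempt` (0-based): Retry-After wins, else
    exponential growth with full jitter, both capped at MAX_DELAY."""
    if retry_after is not None:
        return min(max(retry_after, 0.0), MAX_DELAY)
    ceiling = min(BASE_DELAY * (2 ** attempt), MAX_DELAY)
    return random.uniform(0.0, ceiling)


async def with_retries(send: Callable[[], Awaitable[Tuple[int, Optional[float]]]],
                       sleep: Callable[[float], Awaitable[None]] = asyncio.sleep) -> int:
    """Call `send` until it returns a non-retryable status or retries run out.

    `send` is a stand-in returning (status_code, retry_after_seconds_or_None).
    """
    for attempt in range(MAX_RETRIES + 1):
        status, retry_after = await send()
        if status not in RETRYABLE or attempt == MAX_RETRIES:
            return status
        await sleep(backoff_delay(attempt, retry_after))
    return status  # not reached; keeps type checkers satisfied
```

Jitter spreads retries from parallel tool calls over time so they do not all hit the API again at the same instant.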

Architecture

+-----------------+     +-----------------------+     +-----------------+
|  Claude Desktop |---->|  semantic-scholar-mcp |---->| Semantic Scholar|
|   (MCP Client)  |<----|     (This Server)     |<----|      API        |
+-----------------+     +-----------------------+     +-----------------+
        |                         |                          |
        | stdio (JSON-RPC)        | Your API Key             | HTTPS
        | Local process           | Local machine            | 200M+ papers

Where your API key goes. The MCP server runs locally on your machine and
does not store your API key on disk. When the server makes authenticated
requests, the key is sent only to api.semanticscholar.org over HTTPS as
the x-api-key header that the Semantic Scholar API requires. No telemetry
is sent to any third party. See the per-request api_key caution above for
how the key can surface in transcripts when it is passed as a tool argument
instead of via the environment variable.

Development

# Clone
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/semantic_scholar_mcp --cov-report=term-missing

# Type checking
mypy src/

Security

API keys are never persisted to disk by the server. Prefer the
SEMANTIC_SCHOLAR_API_KEY environment variable over the per-request api_key
tool parameter (see SECURITY.md for details on the
transcript-exposure risk). All API communication uses HTTPS to
api.semanticscholar.org. See SECURITY.md for
vulnerability reporting and the v1.2.x known-limitations list.

Related MCP servers by the same author

alphafold-sovereign-mcp — Model Context Protocol server for AlphaFold DB and 13 other biomedical data sources, with a local SQLite knowledge graph (pip install --pre alphafold-sovereign-mcp).
uniprot-mcp — Model Context Protocol server for UniProt Swiss-Prot and TrEMBL (pip install uniprot-mcp-server).

License

MIT License - see LICENSE file.

Author

Santiago Maniches

Founder & CEO, TOPOLOGICA LLC
ORCID: 0009-0005-6480-1987
LinkedIn: santiago-maniches
Website: topologica.ai

Contributing

Contributions welcome! Please read our Contributing Guidelines.

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Built by TOPOLOGICA LLC
Advancing computational research through topological intelligence
