paper-fetch — Legal Open-Access PDF Downloader

Name: paper-fetch
Author: Agents365-ai

What it does

Downloads paper PDFs from a DOI (or batch file of DOIs) via legal open-access sources
5-source fallback chain: Unpaywall → Semantic Scholar openAccessPdf → arXiv → PubMed Central OA → bioRxiv/medRxiv
Zero dependencies — pure Python standard library, no pip install needed
Auto-named output — {first_author}_{year}_{short_title}.pdf
Batch mode — pass a file of DOIs with --batch
Never touches Sci-Hub or any paywall-bypass service. If no OA copy exists, it reports failure with metadata so you can request the paper through interlibrary loan (ILL)
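
The auto-naming scheme can be illustrated with a minimal sketch. The helper names (`slugify`, `build_filename`), the five-word cap, and the hyphen separator inside the title are assumptions made for illustration; they are not taken from scripts/fetch.py.

```python
import re
import unicodedata


def slugify(text: str, max_words: int = 5) -> str:
    """Reduce text to ASCII words joined by hyphens (hypothetical helper)."""
    # Decompose accented characters and drop the non-ASCII marks,
    # so that "Müller" becomes "Muller".
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[A-Za-z0-9]+", ascii_text)
    return "-".join(words[:max_words])


def build_filename(first_author: str, year: int, title: str) -> str:
    """Assemble a name of the form {first_author}_{year}_{short_title}.pdf."""
    # Assumption: the family name is the last whitespace-separated token.
    tokens = first_author.split()
    family = slugify(tokens[-1]) if tokens else ""
    short_title = slugify(title)
    return f"{family or 'unknown'}_{year}_{short_title or 'untitled'}.pdf"
```

Falling back to `unknown` and `untitled` keeps the filename valid when metadata is missing or contains no ASCII characters.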

Discipline Coverage

The skill is discipline-agnostic — it works for any field, not just life sciences or computer science. Coverage depends on whether the paper has a legal OA version, not on its subject area.

Source	Discipline scope
Unpaywall	✅ All disciplines (covers every Crossref DOI — humanities, social sciences, physics, chemistry, economics, etc.)
Semantic Scholar	✅ All disciplines (cross-domain academic graph)
arXiv	Physics, math, CS, statistics, quantitative finance, economics, EE
PubMed Central	Biomedical only
bioRxiv / medRxiv	Biology / medicine preprints only

In practice, Unpaywall + Semantic Scholar alone cover OA papers in chemistry, materials, economics, psychology, humanities, and every other field via institutional repositories, SSRN, RePEc, and publisher-hosted OA copies. arXiv/PMC/bioRxiv are additional fallbacks for their specific domains. If no legal OA copy exists anywhere, the skill reports failure honestly — it will never bypass paywalls regardless of discipline.
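
As a rough illustration of the Semantic Scholar step, the sketch below builds a Graph API lookup URL for a DOI and pulls the relevant fields out of a response that has already been decoded from JSON. The function names and the shape of the returned dictionary are hypothetical, and the sketch does not claim to mirror how scripts/fetch.py performs the lookup.

```python
from typing import Dict, Optional
from urllib.parse import quote

S2_PAPER_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/"


def s2_lookup_url(doi: str) -> str:
    """Build a Semantic Scholar Graph API URL that looks a paper up by DOI."""
    # Keep "/" unescaped so the DOI stays readable in the path.
    return f"{S2_PAPER_ENDPOINT}DOI:{quote(doi, safe='/')}?fields=openAccessPdf,externalIds"


def parse_s2(payload: dict) -> Dict[str, Optional[str]]:
    """Extract the OA PDF link and the IDs later sources could use.

    Both fields may be missing or null, so every access is defensive.
    """
    oa = payload.get("openAccessPdf") or {}
    ids = payload.get("externalIds") or {}
    return {
        "pdf_url": oa.get("url") or None,
        "arxiv_id": ids.get("ArXiv"),
        "pmc_id": ids.get("PubMedCentral"),
    }
```

Returning the arXiv and PubMed Central identifiers alongside the PDF link lets one lookup feed the later fallback sources.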

Multi-Platform Support

Works with all major AI coding agents that support the Agent Skills format:

Platform	Status	Details
Claude Code	✅ Full support	Native SKILL.md format
OpenClaw / ClawHub	✅ Full support	`metadata.openclaw` namespace
Hermes Agent	✅ Full support	Installable under research category
pi-mono	✅ Full support	`metadata.pimo` namespace
OpenAI Codex	✅ Full support	`agents/openai.yaml` sidecar
SkillsMP	✅ Indexed	GitHub topics configured

Comparison

vs No Skill (native agent)

Feature	Native agent	This skill
Resolve DOI to PDF	Ad-hoc web search	Deterministic 5-source chain
Unpaywall integration	No	Yes — highest OA coverage
arXiv / PMC / bioRxiv fallback	Manual	Automatic
Batch download	No	Yes — `--batch dois.txt`
Consistent filenames	No	Yes — `author_year_title.pdf`
Legal-only guarantee	None	Refuses any paywall bypass
Dependencies	Varies	Python stdlib only

Prerequisites

Python 3.8+ (standard library only, no extra packages)
Unpaywall contact email (optional but recommended) — set once:

export [email protected]

Add the line to ~/.zshrc or ~/.bashrc to make it persistent. Without it, Unpaywall is skipped and the remaining four sources (Semantic Scholar, arXiv, PMC, bioRxiv/medRxiv) are still tried.
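
The contact email matters because the Unpaywall REST API expects an `email` query parameter on every request. The sketch below shows one way the step could work: build the request URL, skip the source when no email is configured, and pick a PDF link from a decoded response. The function names are hypothetical and the email is passed in as an argument rather than read from the environment.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def unpaywall_url(doi: str, email: Optional[str]) -> Optional[str]:
    """Return the Unpaywall v2 lookup URL, or None to signal 'skip this source'."""
    if not email:
        return None  # no contact email configured: fall through to the next source
    query = urlencode({"email": email})
    return f"https://api.unpaywall.org/v2/{quote(doi, safe='/')}?{query}"


def best_pdf_url(record: dict) -> Optional[str]:
    """Pick a direct PDF link from a decoded Unpaywall record."""
    # Prefer the location Unpaywall ranks best, then scan the others.
    best = record.get("best_oa_location") or {}
    if best.get("url_for_pdf"):
        return best["url_for_pdf"]
    for location in record.get("oa_locations") or []:
        if location.get("url_for_pdf"):
            return location["url_for_pdf"]
    return None
```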

Skill Installation

Claude Code

# Global install
git clone https://github.com/Agents365-ai/paper-fetch.git ~/.claude/skills/paper-fetch

# Project-level install
git clone https://github.com/Agents365-ai/paper-fetch.git .claude/skills/paper-fetch

OpenClaw / ClawHub

clawhub install paper-fetch

# Or manual
git clone https://github.com/Agents365-ai/paper-fetch.git ~/.openclaw/skills/paper-fetch

Hermes Agent

git clone https://github.com/Agents365-ai/paper-fetch.git ~/.hermes/skills/research/paper-fetch

Or add to ~/.hermes/config.yaml:

skills:
  external_dirs:
    - ~/myskills/paper-fetch

pi-mono

git clone https://github.com/Agents365-ai/paper-fetch.git ~/.pimo/skills/paper-fetch

OpenAI Codex

# User-level
git clone https://github.com/Agents365-ai/paper-fetch.git ~/.agents/skills/paper-fetch

# Project-level
git clone https://github.com/Agents365-ai/paper-fetch.git .agents/skills/paper-fetch

SkillsMP

skills install paper-fetch

Installation paths summary

Platform	Global path	Project path
Claude Code	`~/.claude/skills/paper-fetch/`	`.claude/skills/paper-fetch/`
OpenClaw	`~/.openclaw/skills/paper-fetch/`	`skills/paper-fetch/`
Hermes Agent	`~/.hermes/skills/research/paper-fetch/`	Via `external_dirs`
pi-mono	`~/.pimo/skills/paper-fetch/`	—
OpenAI Codex	`~/.agents/skills/paper-fetch/`	`.agents/skills/paper-fetch/`
SkillsMP	N/A (installed via CLI)	N/A

Usage

Single DOI:

python scripts/fetch.py 10.1038/s41586-021-03819-2

Custom output directory:

python scripts/fetch.py 10.1038/s41586-021-03819-2 --out ~/papers

Batch mode:

cat > dois.txt <<EOF
10.1038/s41586-021-03819-2
10.1126/science.abj8754
10.1101/2023.01.01.522400
EOF

python scripts/fetch.py --batch dois.txt --out ~/papers
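
If the DOI list comes from a messy source, it can help to clean it before handing it to --batch. The following is a minimal pre-processing sketch, not part of the skill: `normalize_doi` and `read_dois` are hypothetical helpers, and treating lines that start with `#` as comments is an assumption of this example rather than documented behavior of scripts/fetch.py.

```python
import re
from typing import Iterable, List

# Loose DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
# Common decorations people paste in front of a DOI.
PREFIX_RE = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or 'doi:' label."""
    return PREFIX_RE.sub("", raw.strip())


def read_dois(lines: Iterable[str]) -> List[str]:
    """Return unique, well-formed DOIs in their original order."""
    seen = set()
    dois: List[str] = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # skip blanks and (assumed) comment lines
        doi = normalize_doi(text)
        key = doi.lower()  # DOIs are case-insensitive
        if DOI_RE.match(doi) and key not in seen:
            seen.add(key)
            dois.append(doi)
    return dois
```

Because `read_dois` accepts any iterable of lines, an open file handle can be passed directly, for example `read_dois(open("raw_dois.txt"))`.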

Dry-run (preview without downloading):

python scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run

Human-readable text output:

python scripts/fetch.py 10.1038/s41586-020-2649-2 --format text
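
When driving the script from another Python program, the flags shown above can be assembled programmatically. This is a hedged sketch: `build_command` is a hypothetical wrapper that only uses the flags documented in this README, and whether every combination of them (for example --batch together with --dry-run) is accepted by scripts/fetch.py is not confirmed here.

```python
import os
import sys
from typing import List, Optional


def build_command(
    doi: Optional[str] = None,
    batch_file: Optional[str] = None,
    out_dir: Optional[str] = None,
    dry_run: bool = False,
    text_output: bool = False,
) -> List[str]:
    """Build an argument list for scripts/fetch.py from the documented flags."""
    if (doi is None) == (batch_file is None):
        raise ValueError("pass exactly one of doi or batch_file")
    cmd = [sys.executable, "scripts/fetch.py"]
    if doi is not None:
        cmd.append(doi)
    else:
        cmd += ["--batch", batch_file]
    if out_dir:
        # No shell is involved, so "~" has to be expanded here.
        cmd += ["--out", os.path.expanduser(out_dir)]
    if dry_run:
        cmd.append("--dry-run")
    if text_output:
        cmd += ["--format", "text"]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=False)` from the skill's root directory.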

Or just ask your agent naturally:

Download the AlphaFold2 paper PDF to my ~/papers folder

Fetch the PDF for DOI 10.1038/s41586-020-2649-2

Download these three papers: 10.1038/s41586-021-03819-2, 10.1126/science.abj8754, 10.1101/2023.01.01.522400

Check if this paper has an open-access PDF available: 10.1038/s41586-020-2649-2

Batch download all DOIs from my dois.txt file into ~/papers

Resolution Order

1. Unpaywall: best OA location across all publishers (highest hit rate)
2. Semantic Scholar: openAccessPdf field plus externalIds lookup
3. arXiv: used if the paper has an arXiv ID
4. PubMed Central OA subset: used if the paper has a PMCID
5. bioRxiv / medRxiv: used for DOIs with the prefix 10.1101/
6. Otherwise: report failure with metadata (title/authors) for ILL
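
The ordering above amounts to a first-hit-wins loop over resolvers. The sketch below shows that pattern under stated assumptions: the `Resolver` signature, the `ids` dictionary keys, and the result layout are all hypothetical, and only the arXiv resolver and the 10.1101/ prefix check are filled in. The other sources would plug in as further `(name, resolver)` pairs.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A resolver takes the DOI plus any known external IDs and returns a PDF URL or None.
Resolver = Callable[[str, Dict[str, str]], Optional[str]]


def is_biorxiv_doi(doi: str) -> bool:
    """bioRxiv and medRxiv DOIs share the 10.1101/ prefix."""
    return doi.strip().lower().startswith("10.1101/")


def resolve_arxiv(doi: str, ids: Dict[str, str]) -> Optional[str]:
    """Only applicable when an arXiv ID is known for the paper."""
    arxiv_id = ids.get("arxiv")
    return f"https://arxiv.org/pdf/{arxiv_id}" if arxiv_id else None


def first_hit(doi: str, ids: Dict[str, str], chain: List[Tuple[str, Resolver]]) -> dict:
    """Try each source in order and stop at the first one that yields a URL."""
    tried: List[str] = []
    for name, resolver in chain:
        tried.append(name)
        url = resolver(doi, ids)
        if url:
            return {"status": "found", "source": name, "pdf_url": url, "tried": tried}
    # No source produced a legal OA copy: report failure, never bypass.
    return {"status": "not_found", "source": None, "pdf_url": None, "tried": tried}
```

Recording the `tried` list makes a failure report more useful, since it shows which sources were consulted before giving up.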

Files

SKILL.md — the only required file. Loaded by all platforms.
scripts/fetch.py — the downloader (pure stdlib Python)
agents/openai.yaml — OpenAI Codex sidecar configuration
README.md — this file
README_CN.md — Chinese documentation

Known Limitations

Coverage depends on OA availability — if a paper has no legal OA copy, this skill cannot get it. That is a feature, not a bug.
Some publisher redirects return an HTML landing page instead of a PDF; the script validates the %PDF header and fails cleanly in that case
No authentication — institutional proxies (EZproxy / OpenAthens) are not supported in this version
Host allowlist — downloads are restricted to known OA provider domains; PDFs from unlisted hosts are blocked
50 MB size limit — per-PDF download cap to prevent runaway downloads
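
Three of the safeguards above (host allowlist, size cap, %PDF header check) can be sketched in a few lines. This is an illustration only: the allowlist entries are example domains rather than the skill's actual list, treating subdomains as allowed is an assumption, and 50 MB is taken here to mean 50 * 1024 * 1024 bytes.

```python
from typing import BinaryIO, Iterable
from urllib.parse import urlparse

MAX_BYTES = 50 * 1024 * 1024  # assumed interpretation of the 50 MB cap
EXAMPLE_ALLOWED_HOSTS = ("arxiv.org", "biorxiv.org", "medrxiv.org")  # illustrative only


def host_allowed(url: str, allowed: Iterable[str] = EXAMPLE_ALLOWED_HOSTS) -> bool:
    """Accept a URL only if its host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


def looks_like_pdf(head: bytes) -> bool:
    """A real PDF starts with the magic bytes %PDF; an HTML landing page does not."""
    return head.startswith(b"%PDF")


def read_capped(stream: BinaryIO, limit: int = MAX_BYTES, chunk: int = 64 * 1024) -> bytes:
    """Read a stream in chunks and abort once the size limit is exceeded."""
    buf = bytearray()
    while True:
        block = stream.read(chunk)
        if not block:
            break
        buf += block
        if len(buf) > limit:
            raise ValueError("download exceeds size limit")
    return bytes(buf)
```

Matching on the parsed hostname rather than on the raw URL string avoids being fooled by an allowed domain that appears only in the path.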

License

MIT

Author

Agents365-ai

Bilibili: https://space.bilibili.com/441831884
GitHub: https://github.com/Agents365-ai