mineru-skill

skill
Security Audit
Failed
Health Warning
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code Failed
  • rm -rf — Recursive force deletion command in scripts/book-skill-pack.sh
  • rm -rf — Recursive force deletion command in scripts/mineru-book-to-skill.sh
Permissions Passed
  • Permissions — No dangerous permissions requested

SUMMARY

Claude Code skill for MinerU document parsing API - convert PDF/DOC/PPT/images to Markdown/JSON

README.md

MinerU Skill

A Claude Code skill for parsing documents with the MinerU API

MinerU v2.7.6

Convert PDF, DOC, DOCX, PPT, PPTX, and images into clean Markdown/JSON
with OCR, formula recognition, table extraction, and batch processing.


Features

Feature            Description
Cloud API          No GPU needed; uses the mineru.net hosted service
Local API          Self-hosted with mineru-api for full control
Smart Models       hybrid (default), pipeline, vlm, MinerU-HTML
Rich Extraction    OCR (109 languages), LaTeX formulas, cross-page tables
Batch Processing   Parse up to 200 files per request
Extra Formats      Export to DOCX, HTML, or LaTeX alongside Markdown
CLI Script         mineru-parse.sh for quick command-line usage
Auto-Extract       Download + unzip + display markdown in one step
Book Skill Packs   Stage parsed books for LLM extraction into candidate agent skills

Quick Start

1. Install the Skill

cd ~/.claude/skills
git clone https://github.com/LeoLin990405/mineru-skill.git mineru

2. Set Up API Token

Get a free token at mineru.net/apiManage/token, then:

mkdir -p ~/.config/mineru
echo "YOUR_TOKEN" > ~/.config/mineru/token
chmod 600 ~/.config/mineru/token
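
Before the first run it can help to confirm that the token file exists, is non-empty, and has the mode set above. The following is a minimal sketch, not part of the repository; `token_file_ok` is a hypothetical helper name, and the fallback path mirrors the default token location used in this README.

```shell
# Hypothetical helper: check that the token file is present and private.
# Runs in a subshell (parentheses) so its variables do not leak.
token_file_ok() (
  file=${1:-${MINERU_TOKEN_FILE:-$HOME/.config/mineru/token}}

  # -s is true only for an existing, non-empty file.
  [ -s "$file" ] || { echo "token file missing or empty: $file" >&2; exit 1; }

  # find prints the path only when the mode is exactly 600 (owner read/write).
  [ -n "$(find "$file" -prune -perm 600)" ] || {
    echo "expected mode 600 on: $file" >&2
    exit 1
  }
)
```

Calling `token_file_ok` with no argument checks the default location; pass a path to check a different file.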

3. Use It

In Claude Code — just ask naturally:

Parse this PDF to markdown: https://arxiv.org/pdf/2301.00001.pdf
Extract tables from report.pdf using the vlm model with OCR

Via CLI script:

# Parse from URL
./scripts/mineru-parse.sh https://example.com/paper.pdf --output ./parsed --extract

# Parse local file with VLM model
./scripts/mineru-parse.sh report.pdf --model vlm --ocr --output ./out

# Extra output formats
./scripts/mineru-parse.sh slides.pptx --format docx --format html

# Turn a book into an LLM-ready skill extraction workspace
./scripts/mineru-book-to-skill.sh book.pdf --title "My Book" --output ./workspaces --cloud-ok

CLI Reference

mineru-parse.sh <url_or_file> [options]

Options

Option               Description                                      Default
--model <m>          Model version                                    hybrid
--ocr                Enable OCR                                       off
--no-formula         Disable formula recognition                      on
--no-table           Disable table recognition                        on
--output <dir>       Download results to directory                    -
--extract            Auto-extract zip, show markdown                  off
--pages <range>      Page ranges, e.g. "1-5,8"                        all
--format <fmt>       Extra format: docx/html/latex                    -
--callback <url>     Webhook for async notification                   -
--data-id <id>       Custom tracking identifier                       -
--no-print-md        Do not print extracted markdown to stdout        off
--manifest <file>    Write local parse manifest; requires --output    -
--quiet              Suppress progress output                         off
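
To make the --pages syntax concrete, the sketch below expands a range string such as "1-5,8" into individual page numbers. It assumes ranges are inclusive and that the input is well-formed; how the MinerU API itself interprets ranges is not described in this README, and `expand_pages` is a hypothetical helper, not a function of mineru-parse.sh.

```shell
# Hypothetical helper: expand "1-5,8" into one page number per line.
# Assumes inclusive ranges and well-formed numeric input.
expand_pages() (
  IFS=,                       # split the argument on commas (subshell-local)
  for part in $1; do
    case $part in
      *-*)
        start=${part%-*}      # text before the hyphen
        end=${part#*-}        # text after the hyphen
        i=$start
        while [ "$i" -le "$end" ]; do
          echo "$i"
          i=$((i + 1))
        done
        ;;
      *)
        echo "$part"          # a single page
        ;;
    esac
  done
)
```

For example, `expand_pages "1-5,8"` prints the pages 1 through 5 followed by 8, one per line.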

Environment Variables

Variable               Default                      Description
MINERU_TOKEN_FILE      ~/.config/mineru/token       Token file path
MINERU_API_BASE        https://mineru.net/api/v4    API base URL
MINERU_POLL_INTERVAL   5                            Poll interval (seconds)
MINERU_MAX_POLL        360                          Max poll attempts
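
The defaults in the table can be expressed with standard shell parameter expansion, which also makes the polling budget easy to compute. This is a sketch under the assumption that a script resolves the variables this way; the actual mineru-parse.sh may do it differently, and both function names are hypothetical.

```shell
# Hypothetical helper: apply the documented defaults to any unset variable.
mineru_load_defaults() {
  : "${MINERU_TOKEN_FILE:=$HOME/.config/mineru/token}"
  : "${MINERU_API_BASE:=https://mineru.net/api/v4}"
  : "${MINERU_POLL_INTERVAL:=5}"
  : "${MINERU_MAX_POLL:=360}"
}

# Hypothetical helper: upper bound on time spent sleeping between polls,
# in seconds (interval multiplied by the maximum number of attempts).
mineru_max_wait_seconds() {
  echo $((MINERU_POLL_INTERVAL * MINERU_MAX_POLL))
}
```

With the defaults, the bound is 5 × 360 = 1800 seconds, roughly 30 minutes before polling gives up. Raising MINERU_MAX_POLL or MINERU_POLL_INTERVAL extends it for very large documents.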

Models

Model         Best For                        Speed    Notes
hybrid        General use                     Medium   Default since v2.7.0, recommended
pipeline      CPU-only environments           Fast     No GPU required
vlm           Complex layouts, scanned docs   Slower   Needs 10GB+ VRAM
MinerU-HTML   Preserving HTML structure       Medium   Web content

API Limits (Cloud)

Item                   Limit
File size              200 MB
Pages per file         600
Daily priority pages   2,000 / account
Batch upload           200 files / request
Token validity         90 days
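
Two of these limits can be checked locally before uploading: the file size and the number of files in a batch. The sketch below is illustrative only; `check_batch` is a hypothetical helper, and it assumes 1 MB means 1024 × 1024 bytes, which this README does not specify. The page limit is not checked because that would require a PDF-aware tool.

```shell
# Limits taken from the table above.
MAX_FILE_BYTES=$((200 * 1024 * 1024))   # 200 MB, assuming 1 MB = 1024 * 1024 bytes
MAX_BATCH_FILES=200

# Hypothetical helper: succeed only if the batch fits both limits.
check_batch() (
  if [ "$#" -gt "$MAX_BATCH_FILES" ]; then
    echo "too many files: $# > $MAX_BATCH_FILES" >&2
    exit 1
  fi
  for f in "$@"; do
    [ -f "$f" ] || { echo "not a file: $f" >&2; exit 1; }
    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(($(wc -c < "$f")))
    if [ "$size" -gt "$MAX_FILE_BYTES" ]; then
      echo "too large: $f ($size bytes)" >&2
      exit 1
    fi
  done
)
```

A typical use would be `check_batch *.pdf && echo "batch looks uploadable"` before calling the parse script.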

Book-to-Skill Workflow

The first iteration implements this as a staging workflow. It prepares a
parsed book for a large language model, but it does not call an LLM and does
not install or enable generated skills automatically.

# Local files are uploaded to the MinerU cloud API; --cloud-ok is required.
./scripts/mineru-book-to-skill.sh ~/Books/example.pdf \
  --title "Example Book" \
  --output ./book-workspaces \
  --cloud-ok

The wrapper creates:

book-workspaces/
└── books/
    └── example-book/
        ├── README.md
        ├── source/
        ├── mineru/
        │   ├── parse_manifest.json
        │   ├── *_result.zip
        │   └── <extracted markdown files>
        └── analysis/
            └── book-skill-pack/
                ├── README.md
                ├── manifest.json
                ├── LLM_EXTRACTION_PROMPT.md
                ├── BOOK_SKILL_INDEX.md
                ├── MANAGE_SKILLS.md
                ├── source-markdown/
                └── skills/
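
After a run, the pack directory can be compared against the layout above. The following sketch reports any expected entry that is missing; `check_pack` is a hypothetical helper, and the entry list is copied from the tree rather than from the script itself.

```shell
# Hypothetical helper: verify that a book-skill-pack directory contains
# the entries shown in the layout above.
check_pack() (
  pack=$1
  missing=0
  for entry in README.md manifest.json LLM_EXTRACTION_PROMPT.md \
               BOOK_SKILL_INDEX.md MANAGE_SKILLS.md source-markdown skills; do
    if [ ! -e "$pack/$entry" ]; then
      echo "missing: $entry" >&2
      missing=1
    fi
  done
  exit "$missing"             # 0 when complete, 1 when anything is missing
)
```

For the example above, the call would be `check_pack book-workspaces/books/example-book/analysis/book-skill-pack`.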

Give LLM_EXTRACTION_PROMPT.md and source-markdown/ to a model. The model
should fill BOOK_SKILL_INDEX.md and create one candidate file per extracted
skill under skills/.

Expected model output is book-scoped:

  • what the book can help the agent do
  • when the agent should reference the book
  • candidate workflows, checklists, diagnostics, decision rules, and prompt patterns
  • source anchors and confidence
  • which candidates should stay book-scoped vs be promoted to managed skills

Privacy Boundary

This repository uses the MinerU cloud API by default. Local book files are
uploaded to MinerU during parsing. Use a local MinerU workflow for private or
sensitive books. Generated packs do not store API tokens, authorization headers,
or remote result URLs.
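
A quick local scan can double-check this boundary before a pack is shared. The sketch below searches a pack for strings that look like authorization headers or bearer tokens. The patterns are illustrative assumptions, not checks taken from the repository, and `scan_pack` is a hypothetical helper; any match is a candidate for manual review, since a book's own text may legitimately contain such words.

```shell
# Hypothetical helper: list files in a pack that contain credential-like text.
# Succeeds (exit 0) when nothing suspicious is found.
scan_pack() {
  # -r recurse, -I skip binary files, -l print matching file names, -E extended regex.
  # grep exits 0 on a match, so a match is turned into a failure here.
  if grep -rIlE 'Authorization:|Bearer [A-Za-z0-9._-]+' "$1"; then
    return 1
  fi
  return 0
}
```

Remote result URLs are not matched because this README does not say which host serves them.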

Manage Skills Boundary

Book packs are candidate workspaces. Review the generated skills/*.md files
before promoting anything into a managed skill repository. Use your local skill
manager only after review, for example:

skills enable <promoted-skill-name>

Roadmap Notes

Next iterations should make extraction more structural:

  • split the parsed book by chapters or section headings
  • extract skill candidates from each chapter first
  • synthesize the whole-book summary after chapter-level extraction
  • keep generated skills packaged by the book's structure, so an agent can refer
    to the book as a coherent source package
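
The first roadmap item, splitting by chapter, could start from something as simple as the sketch below. It is not implemented in the repository; `split_chapters` is a hypothetical helper that assumes chapters begin at top-level "# " headings. Text before the first heading is dropped, and a "# " line inside a code listing would be mistaken for a heading, so real input would need more care.

```shell
# Hypothetical helper: split a Markdown file into chapter-NN.md files,
# starting a new file at every line that begins with "# ".
split_chapters() (
  src=$1
  out=$2
  mkdir -p "$out"
  awk -v out="$out" '
    /^# / {
      n++
      if (file) close(file)   # finish the previous chapter file
      file = sprintf("%s/chapter-%02d.md", out, n)
    }
    file { print > file }     # skip lines before the first heading
  ' "$src"
)
```

Each resulting file could then be given to the model separately, with the whole-book summary synthesized afterwards as the roadmap describes.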

The same pipeline should later generalize beyond books. Future inputs can
include video courses, papers, and other long-form text. Before extraction, the
system should classify the text type, such as book, course transcript, paper,
manual, or article collection, then choose the right segmentation and skill
extraction strategy.

Examples

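See the examples/ directory for:

  • parse_single.sh: single URL parsing
  • parse_local.sh: local file parsing
  • parse_batch.py: batch processing (Python)
  • book_to_skill.sh: book-to-skill workspace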

Project Structure

mineru-skill/
├── SKILL.md                 # Claude Code skill definition (full API reference)
├── scripts/
│   ├── mineru-parse.sh      # CLI helper script
│   ├── mineru-book-to-skill.sh # Book parsing + skill-pack staging wrapper
│   └── book-skill-pack.sh   # Build a skill extraction pack from Markdown
├── examples/
│   ├── parse_single.sh      # Single URL parsing example
│   ├── parse_local.sh       # Local file parsing example
│   ├── parse_batch.py       # Batch processing example (Python)
│   └── book_to_skill.sh     # Book-to-skill workspace example
├── .github/
│   ├── ISSUE_TEMPLATE/      # Bug report & feature request templates
│   └── PULL_REQUEST_TEMPLATE.md
├── CONTRIBUTING.md
├── CHANGELOG.md
├── LICENSE                  # MIT
└── README.md

Documentation

The full API reference, including all endpoints, request/response formats, error codes, and Python/curl examples, is in SKILL.md.

Contributing

Contributions are welcome! Please read the Contributing Guide (CONTRIBUTING.md) before submitting a PR.

Related Projects

  • MinerU — The document parsing engine by OpenDataLab
  • Claude Code — Anthropic's CLI for Claude

License

MIT © 2026 LeoLin990405
