Book Capture

Capture books from Kindle, Apple Books, or PDF — then OCR and generate structured Markdown.

A Claude Code plugin that captures book pages as screenshots, extracts text via OCR, and generates thematically organized Markdown documents. Built for Obsidian knowledge vaults but works with any Markdown-based system.

What It Does

Captures every page of a book as screenshots (Mac Kindle, Apple Books, Kindle Cloud Reader, or PDF)
Extracts text with macOS Vision OCR, then re-reads low-confidence pages with Claude Code agents
Generates 8-14 thematically organized Markdown files with rich formatting (tables, blockquotes, cross-references)
Creates a hub file with frontmatter and wikilinks to all topic files

The pipeline needs no external API keys: screenshot capture and Vision OCR run locally on your Mac, while agent re-reading and content generation use Claude Code's built-in capabilities.

Supported Platforms

Platform	How It Works	Best For
Mac Kindle	CGWindowList + screencapture + Page Down	Kindle purchases
Apple Books	CGWindowList + screencapture + arrow keys	Apple Books purchases
Kindle Cloud Reader	Playwright browser automation	When desktop app unavailable
PDF	Poppler pdftoppm conversion	Scanned/image-based PDFs

Installation

Install from GitHub (recommended)

In Claude Code, run:

/plugins

Navigate to the Marketplaces tab
Select Add marketplace and enter: masterleopold/book-capture
Navigate to the Discover tab
Find book-capture and install it

After installation, the commands are available in every Claude Code session.

Scope	Effect
user (default)	Available in all your projects
project	Shared with your team via `.claude/settings.json`

Try It (one-time)

git clone https://github.com/masterleopold/book-capture.git
claude --plugin-dir ./book-capture

First-Time Setup

After installing, run the setup script to install Node.js dependencies and compile the Vision OCR binary:

bash ~/.claude/plugins/cache/book-capture-marketplace/book-capture/*/scripts/setup.sh

If you skip this step, the plugin detects the missing setup on first use and prompts you to run it.

Requirements

macOS (required for screencapture and Vision OCR)
Claude Code CLI installed and authenticated
Node.js 20+
Xcode Command Line Tools (xcode-select --install)
Accessibility permission for Terminal/Claude Code (System Settings > Privacy & Security > Accessibility)
Poppler (PDF only): brew install poppler

Commands

Source-Specific (recommended)

Command	Platform	Description
`/book-capture:kindle`	Mac Kindle	Capture from Amazon Kindle desktop app
`/book-capture:books`	Apple Books	Capture from Apple Books app
`/book-capture:cloud`	Kindle Cloud Reader	Capture via browser (Playwright)
`/book-capture:pdf`	PDF file	Capture from scanned/image-based PDF

Pipeline Steps

Command	Description
`/book-capture:capture`	Full pipeline with platform selection prompt
`/book-capture:ocr`	OCR only on existing page captures
`/book-capture:generate`	Markdown generation from existing OCR text

Quick Start

/book-capture:kindle B0883TQ3ZN

Claude Code will:

Ask for book title, author, category, and location
Remind you to open the book in Kindle to the first page
Capture all pages (auto-stops at end of book)
Run Vision OCR + agent re-reading for low-confidence pages
Analyze content and create 8-14 thematic topic files
Generate a hub file with wikilinks

How It Works

Screenshot Capture

Each platform uses macOS-native tools:

CGWindowList (via inline Swift) to find the app window ID
screencapture to capture individual window frames
AppleScript to control page navigation (Page Down for Kindle, arrow keys for Books)
Duplicate detection to auto-stop at end of book (3 consecutive identical pages)
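
The auto-stop rule can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the helper names are hypothetical, each page is assumed to be represented by a digest string (for example a hash of the screenshot bytes), and "3 consecutive identical pages" is read as "the last three captures are identical".

```javascript
// Hypothetical helper: true when the last `limit` captures are identical.
// `digests` holds one digest string per captured page, in capture order.
function reachedEndOfBook(digests, limit = 3) {
  if (digests.length < limit) return false;
  const tail = digests.slice(-limit);
  return tail.every((d) => d === tail[0]);
}

// Simulated capture loop over a precomputed list of page digests.
// A real loop would take a screenshot and send a page-turn key each iteration.
function captureUntilEnd(pageDigests, limit = 3) {
  const seen = [];
  for (const digest of pageDigests) {
    seen.push(digest);
    if (reachedEndOfBook(seen, limit)) {
      // Drop the trailing repeats, keeping one copy of the final page.
      return seen.slice(0, seen.length - (limit - 1));
    }
  }
  return seen;
}

console.log(captureUntilEnd(["p1", "p2", "p3", "p3", "p3", "never-reached"]));
```

Requiring several identical captures in a row, rather than one, makes the loop tolerant of a single page turn that fails to register.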

OCR Pipeline

macOS Vision OCR (fast, local) processes all pages with confidence scoring
Pages below the confidence threshold are re-read by Claude Code agents using multimodal image reading
Results are merged into raw_text.json
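
The two-pass flow above might look roughly like this. Everything here is an assumption for illustration: the threshold value, the function names, and the record shape are hypothetical, and the real layout of raw_text.json is not documented in this README.

```javascript
// Hypothetical threshold; the plugin's actual value is not stated here.
const CONFIDENCE_THRESHOLD = 0.8;

// Pass 1 result: [{ page, text, confidence }] from Vision OCR.
// Returns the page numbers that should be re-read by an agent.
function pagesNeedingReread(visionPages, threshold = CONFIDENCE_THRESHOLD) {
  return visionPages.filter((p) => p.confidence < threshold).map((p) => p.page);
}

// Pass 2 result: { [pageNumber]: text } from agent re-reading.
// Agent text replaces Vision text wherever it exists.
function mergeOcr(visionPages, agentTextByPage) {
  return visionPages
    .map((p) => {
      const reread = agentTextByPage[p.page];
      return reread !== undefined
        ? { page: p.page, text: reread, source: "agent" }
        : { page: p.page, text: p.text, source: "vision" };
    })
    .sort((a, b) => a.page - b.page);
}

const vision = [
  { page: 1, text: "Chapter One", confidence: 0.97 },
  { page: 2, text: "Ch4pt3r Tw0", confidence: 0.41 },
];
console.log(pagesNeedingReread(vision));
console.log(JSON.stringify(mergeOcr(vision, { 2: "Chapter Two" }), null, 2));
```

For vertical-text books, the same merge applies with every page present in the agent results.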

Note: macOS Vision cannot read vertical Japanese text (tategaki). For vertical text books, all pages are re-read by agents.

Markdown Generation

Claude Code analyzes the full text and identifies 8-14 thematic categories (organized by information type, not original chapter order)
Parallel agents generate detailed topic files (300-600 lines each) with:
- Genre-specific structure (business, technical, humanities, science, narrative)
- Tables, blockquotes, bold key terms, cross-references
- [[wikilinks]] between sibling topics
A hub file is created with frontmatter and links to all topics

Output Structure

Books/entries/BookTitle.md              # Hub file with frontmatter
Knowledge/Category/BookTitle/
  01_Theme_Name.md                      # Topic file (300-600 lines)
  02_Theme_Name.md
  ...
  10_Theme_Name.md
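
As a rough sketch of how the paths above could be derived, assuming a zero-padded two-digit index and underscores in place of spaces (the function names are hypothetical and not taken from the plugin's scripts):

```javascript
// Hypothetical helper: "Theme Name" at position 1 -> "01_Theme_Name.md".
function topicFileName(index, theme) {
  const number = String(index).padStart(2, "0");
  const slug = theme.trim().split(/\s+/).join("_");
  return `${number}_${slug}.md`;
}

// Builds the hub path and the topic paths relative to the vault root.
function outputPaths({ title, category, themes }) {
  return {
    hub: `Books/entries/${title}.md`,
    topics: themes.map(
      (theme, i) => `Knowledge/${category}/${title}/${topicFileName(i + 1, theme)}`
    ),
  };
}

console.log(
  outputPaths({ title: "BookTitle", category: "Category", themes: ["Theme Name", "Other Theme"] })
);
```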

Hub File Example

---
tags:
  - source/book
  - type/framework
  - theme/fundraising
Category: "Startup"
Rating: ""
author: "Author Name"
Location: "Knowledge/Startup/BookTitle"
Chapters: 10
Language: "EN"
URL: "https://www.amazon.co.jp/dp/B0883TQ3ZN"
---

# Book Title

Book summary (2-3 sentences).

## Topics

- [[01_Theme_Name]] - Description
- [[02_Theme_Name]] - Description
...
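
A hub file of this shape could be assembled along these lines. This is only a sketch under assumptions: `buildHub` and its input shape are hypothetical, string values are double-quoted to match the example, and each wikilink target is assumed to be the topic file name without its `.md` extension.

```javascript
// Hypothetical builder for the hub file text.
// frontmatter: plain object; arrays become YAML lists, strings are quoted.
// topics: [{ file, description }] where `file` has no ".md" extension.
function buildHub({ title, summary, frontmatter, topics }) {
  const yaml = Object.entries(frontmatter).flatMap(([key, value]) =>
    Array.isArray(value)
      ? [`${key}:`, ...value.map((item) => `  - ${item}`)]
      : [`${key}: ${typeof value === "string" ? JSON.stringify(value) : value}`]
  );
  const links = topics.map((t) => `- [[${t.file}]] - ${t.description}`);
  return [
    "---", ...yaml, "---", "",
    `# ${title}`, "",
    summary, "",
    "## Topics", "",
    ...links, "",
  ].join("\n");
}

console.log(
  buildHub({
    title: "Book Title",
    summary: "Book summary (2-3 sentences).",
    frontmatter: { tags: ["source/book"], Category: "Startup", Chapters: 2 },
    topics: [
      { file: "01_Theme_Name", description: "Description" },
      { file: "02_Theme_Name", description: "Description" },
    ],
  })
);
```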

Configuration

Per-project settings via .claude/book-capture.local.md:

---
vault_root: /path/to/obsidian/vault
captures_dir: Books/files/book-captures
entries_dir: Books/entries
default_source: kindle
default_language: JP
default_dpi: 200
---
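
For illustration, flat frontmatter like the block above can be read with a few lines of JavaScript. This is a hedged sketch rather than the plugin's parser: `parseSettings` is a hypothetical name, and it handles only single-line `key: value` pairs with Unix line endings, not full YAML.

```javascript
// Hypothetical parser for flat "key: value" frontmatter.
// Returns {} when the text does not start with a frontmatter block.
function parseSettings(markdown) {
  const match = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return {};
  const settings = {};
  for (const line of match[1].split("\n")) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip lines without a key
    const key = line.slice(0, sep).trim();
    const raw = line.slice(sep + 1).trim();
    // Convert plain integers (such as default_dpi) to numbers.
    settings[key] = /^\d+$/.test(raw) ? Number(raw) : raw;
  }
  return settings;
}

const example = [
  "---",
  "vault_root: /path/to/obsidian/vault",
  "default_source: kindle",
  "default_dpi: 200",
  "---",
].join("\n");
console.log(parseSettings(example));
```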

If no settings file exists, the plugin auto-detects Obsidian vaults by looking for .obsidian/ in the current directory or parents.
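
The upward search can be sketched like this. The function name is hypothetical, the start directory is assumed to be an absolute POSIX path (the plugin is macOS-only), and the existence check is passed in as a callback so the logic can be tried without touching the file system.

```javascript
// Hypothetical helper: walk from startDir toward "/" and return the first
// directory that contains ".obsidian", or null if none does.
// dirExists: (path) => boolean, e.g. a wrapper around fs.existsSync.
function findVaultRoot(startDir, dirExists) {
  let dir = startDir.replace(/\/+$/, "") || "/";
  while (true) {
    const candidate = dir === "/" ? "/.obsidian" : `${dir}/.obsidian`;
    if (dirExists(candidate)) return dir;
    if (dir === "/") return null; // reached the root without a match
    const parent = dir.slice(0, dir.lastIndexOf("/"));
    dir = parent === "" ? "/" : parent;
  }
}

// Simulated file system: only this one ".obsidian" directory exists.
const existing = new Set(["/Users/me/vault/.obsidian"]);
console.log(findVaultRoot("/Users/me/vault/Books/entries", (p) => existing.has(p)));
```

In a real script the callback could be `(p) => fs.existsSync(p)` from `node:fs`, with `process.cwd()` as the start directory.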

Troubleshooting

Issue	Solution
Kindle window not found	Ensure book is open and visible in Amazon Kindle app
Accessibility denied	System Settings > Privacy & Security > Accessibility > add Terminal
Vision OCR compilation fails	`xcode-select --install`
Vision OCR misses vertical text	Expected for tategaki — agents handle these pages
Pages don't advance	Kindle uses Page Down key; restart the app if stuck
pdftoppm not found	`brew install poppler`
PDF is encrypted	`qpdf --decrypt input.pdf output.pdf`
npm packages not installed	Run `scripts/setup.sh`

Architecture

book-capture/
  .claude-plugin/
    plugin.json           # Plugin manifest
    marketplace.json      # Marketplace metadata
  commands/               # 7 slash commands
    kindle.md             # /book-capture:kindle
    books.md              # /book-capture:books
    cloud.md              # /book-capture:cloud
    pdf.md                # /book-capture:pdf
    capture.md            # /book-capture:capture (generic)
    ocr.md                # /book-capture:ocr
    generate.md           # /book-capture:generate
  agents/
    ocr-reader.md         # Multimodal OCR re-reader
    content-writer.md     # Thematic content generator
  skills/
    book-capture/
      SKILL.md            # Auto-activating skill
  scripts/
    capture-kindle-mac.mjs
    capture-books-app.mjs
    kindle-capture.mjs    # Playwright Cloud Reader
    capture-pdf.mjs
    extract-text.mjs      # Vision OCR
    generate-markdown.mjs # Direct API fallback
    book-capture-utils.mjs
    vision-ocr.swift      # macOS Vision CLI
    setup.sh              # Dependency installer
    package.json
  templates/
    settings-template.md

License

MIT