ppt-master


AI generates editable, beautifully designed PPTX from any document — no design skills needed | 15 examples, 229 pages


PPT Master — AI generates natively editable PPTX from any document

License: MIT


Drop in a PDF, DOCX, URL, or Markdown file — AI generates natively editable PowerPoint presentations with real shapes, not images. Every text box, chart, and graphic is a real PowerPoint object you can click and edit. Supports PPT 16:9, social media cards, marketing posters, and 10+ other formats.

🔥 NEW: Native Editable PPTX — Generated presentations now contain real PowerPoint shapes (DrawingML) by default — text, charts, and graphics are directly editable in PowerPoint without any extra steps. No more "Convert to Shape"!

💡 Architecture Update: The project uses a Skill-based architecture:

  1. Lower Token Consumption & Model Dependency: Token consumption is significantly lower, and even non-Opus models can now generate decent results.
  2. High Extensibility: The skills folder is organized according to the Agent Skills standard, with each subdirectory being a fully self-contained Skill. It can be natively invoked by dropping it into the skills directory of compatible AI clients (e.g., .claude/skills/ or ~/.claude/skills/ for Claude Code; global skills directory referenced via .agent/workflows/ for Antigravity; .github/skills/ or ~/.copilot/skills/ for GitHub Copilot).
  3. Stable Fallback: The previous multi-platform architecture consumes more tokens but has been tested more extensively. If the current version is unstable for you, you can always fall back to the last release of the old architecture: v1.3.0.
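
As a rough illustration of the "drop it into the skills directory" step from item 2, the sketch below copies one skill folder into a client's skills directory. The function name `install_skill` is hypothetical and not part of this repository; the only thing it checks is that the folder contains a SKILL.md file.

```python
import shutil
from pathlib import Path


def install_skill(skill_dir: Path, skills_root: Path) -> Path:
    """Copy one self-contained skill folder into a client's skills directory."""
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} does not contain SKILL.md")
    destination = skills_root / skill_dir.name
    # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing copy.
    shutil.copytree(skill_dir, destination, dirs_exist_ok=True)
    return destination
```

For Claude Code, for example, a call from the repository root might look like `install_skill(Path("skills/ppt-master"), Path.home() / ".claude" / "skills")`. Other clients use the directories listed above.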

Online Examples: GitHub Pages Preview — See actual generated results

🎨 Design Philosophy — AI as Your Designer, Not Your Finisher

The generated PPTX is a design draft, not a finished product. Think of it like an architect's rendering: the AI handles visual design, layout, and content structure — delivering a high-quality starting point. For truly polished results, expect to do your own finishing work in PowerPoint: swapping shapes, refining charts, adjusting colors, replacing placeholder graphics with native objects. The goal is to eliminate 90% of the blank-page work, not to replace human judgment in the final mile. Don't expect one AI pass to do everything — that's not how good presentations are made.

A tool's ceiling is your ceiling. PPT Master amplifies the skills you already have — if you have a strong sense of design and content, it helps you execute faster. If you don't know what a great presentation looks like, the tool won't know either. The output quality is ultimately a reflection of your own taste and judgment.


🎴 Featured Examples

Example Library: examples/ · 15 projects · 229 pages

| Category | Project | Pages | Features |
| --- | --- | --- | --- |
| 🏢 Consulting Style | Attachment in Psychotherapy | 32 | Top consulting style, largest scale example |
| | Building Effective AI Agents | 15 | Anthropic engineering blog, AI Agent architecture |
| | Chongqing Regional Report | 20 | Regional fiscal analysis |
| | Ganzi Prefecture Economic Analysis | 17 | Government fiscal analysis, Tibetan cultural elements |
| 🎨 General Flexible | Debug Six-Step Method | 10 | Dark tech style |
| | Chongqing University Thesis Format | 11 | Academic standards guide |
| Creative Style | I Ching Qian Hexagram Study | 20 | I Ching aesthetics, Yin-Yang design |
| | Diamond Sutra Chapter 1 Study | 15 | Zen academic, ink wash whitespace |
| | Git Introduction Guide | 10 | Pixel retro game style |

📖 View Complete Examples Documentation


🏗️ System Architecture

User Input (PDF/DOCX/URL/Markdown)
    ↓
[Source Content Conversion] → pdf_to_md.py / doc_to_md.py / web_to_md.py
    ↓
[Create Project] → project_manager.py init <project_name> --format <format>
    ↓
[Template Option] A) Use existing template B) No template
    ↓
[Need New Template?] → Use /create-template workflow separately
    ↓
[Strategist] - Eight Confirmations & Design Specifications
    ↓
[Image_Generator] (When AI generation is selected)
    ↓
[Executor] - Two-Phase Generation
    ├── Visual Construction Phase: Generate all SVG pages → svg_output/
    └── Logic Construction Phase: Generate complete speaker notes → notes/total.md
    ↓
[Post-processing] → total_md_split.py (split notes) → finalize_svg.py → svg_to_pptx.py
    ↓
Output: Two files are generated automatically:
    ├── presentation.pptx        ← Native shapes (DrawingML) — recommended for editing & delivery
    └── presentation_svg.pptx   ← SVG reference version — pixel-perfect visual reference; use
                                    "Convert to Shape" in PowerPoint to unlock individual elements

📚 Documentation Navigation

| Document | Description |
| --- | --- |
| 🧭 AGENTS.md | Repository-level entry overview for general AI agents |
| 📖 SKILL.md | Canonical ppt-master workflow and rules |
| 📐 Canvas Formats | PPT, Xiaohongshu (RED), WeChat Moments, and 10+ formats |
| 🖼️ Image Embedding Guide | SVG image embedding best practices |
| 📊 Chart Template Library | Standardized chart templates |
| 🔧 Role Definitions | Role definitions and technical references |
| 🛠️ Toolset | Script index and high-frequency commands |
| Conversion Docs | PDF / DOCX / web conversion tools |
| Project Docs | Project init, validation, and indexing |
| SVG Pipeline Docs | Finalize, validate, notes, and PPTX export |
| Image Docs | Image generation and image analysis tools |
| Troubleshooting Docs | Validation, preview, export, and dependency issues |
| 💼 Examples Index | 15 projects, 229 SVG pages of examples |

🚀 Quick Start

1. Configure Environment

Python Environment (Required)

This project requires Python 3.8+ for running PDF conversion, SVG post-processing, PPTX export, and other tools.

| Platform | Recommended Installation |
| --- | --- |
| macOS | Use Homebrew: `brew install python` |
| Windows | Download installer from Python Official Website |
| Linux | Use package manager: `sudo apt install python3 python3-pip` (Ubuntu/Debian) |

💡 Verify Installation: Run python3 --version to confirm version ≥ 3.8

Node.js Environment (Optional)

If you need to use the web_to_md.cjs tool (for converting web pages from WeChat and other high-security sites), install Node.js.

| Platform | Recommended Installation |
| --- | --- |
| macOS | Use Homebrew: `brew install node` |
| Windows | Download LTS version from Node.js Official Website |
| Linux | Use NodeSource: `curl -fsSL https://deb.nodesource.com/setup_lts.x \| sudo -E bash - && sudo apt-get install -y nodejs` |

💡 Verify Installation: Run node --version to confirm version ≥ 18

Pandoc (Optional)

If you need to use the doc_to_md.py tool (for converting DOCX, EPUB, LaTeX, and other document formats to Markdown), install Pandoc.

| Platform | Recommended Installation |
| --- | --- |
| macOS | Use Homebrew: `brew install pandoc` |
| Windows | Download installer from Pandoc Official Website |
| Linux | Use package manager: `sudo apt install pandoc` (Ubuntu/Debian) |

💡 Verify Installation: Run pandoc --version to confirm it is installed
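
The three verification steps above can also be run in one go. The following is a minimal sketch, not a script shipped with this project: `check_environment` is a hypothetical helper that checks the Python version and only looks for `node` and `pandoc` on the PATH. It does not verify that Node.js is version 18 or newer.

```python
import shutil
import sys


def check_environment() -> dict:
    """Report which prerequisites from the setup steps are available."""
    return {
        # Required: Python 3.8 or newer.
        "python>=3.8": sys.version_info >= (3, 8),
        # Optional: only needed for web_to_md.cjs.
        "node": shutil.which("node") is not None,
        # Optional: only needed for doc_to_md.py.
        "pandoc": shutil.which("pandoc") is not None,
    }


for name, available in check_environment().items():
    print(f"{name}: {'found' if available else 'missing'}")
```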

2. Clone Repository and Install Dependencies

git clone https://github.com/hugohe3/ppt-master.git
cd ppt-master
pip install -r requirements.txt

If you encounter permission issues, use pip install --user -r requirements.txt or install in a virtual environment.

3. Open AI Editor

Recommended AI editors:

| Tool | Rating | Description |
| --- | --- | --- |
| Claude Code | ⭐⭐⭐ | Highly Recommended! Anthropic official CLI, native Opus support, largest context window |
| Codebuddy IDE | ⭐⭐ | Great Chinese AI IDE, good support for local models like Kimi 2.5 and MiniMax 2.7 |
| Cursor | ⭐⭐ | Mainstream AI editor, great experience but relatively expensive |
| VS Code + Copilot | ⭐⭐ | Microsoft official solution, cost-effective, but limited context window (200k, 35% reserved for output) |
| Antigravity | | Free but very limited quota and unstable. Alternative only. |

4. Start Creating

Open the AI chat panel in your editor and describe what content you want to create:

User: I have a Q3 quarterly report that needs to be made into a PPT

AI: Sure. First we'll confirm whether to use a template; after that Strategist will
   continue with the eight confirmations and generate the design spec.
   [Template Option] [Recommended] B) No template
   [Strategist] 1. Canvas format: [Recommended] PPT 16:9
   [Strategist] 2. Page count: [Recommended] 8-10 pages
   ...

💡 Model Recommendation: Claude Opus works best, but most mainstream models today (like Kimi 2.5 and MiniMax 2.7, tested via Codebuddy IDE) can also generate decent results with only minor gaps in layout details. Because Opus is unstable in some IDEs (such as Antigravity), we recommend switching to a more stable AI client if you run into problems.

📝 Post-Export Editing: The default exported PPTX (.pptx) contains native PowerPoint shapes — text, graphics, and colors are directly editable, no extra steps needed. A second SVG reference file (_svg.pptx) is also generated; for that version, select the content in PowerPoint and use "Convert to Shape" to edit. Requires Office 2016 or later.

💡 AI Lost Context? Ask the AI to read skills/ppt-master/SKILL.md first; use AGENTS.md as the repository-level entry overview.

5. AI Image Generation (Optional)

The image_gen.py tool supports multiple providers, but the recommended set is intentionally kept small.

  • Core recommended: gemini, openai, qwen, zhipu, volcengine
  • Extended: stability, bfl (FLUX), ideogram
  • Experimental: siliconflow, fal, replicate

To inspect the current support tiers in the CLI:

python3 skills/ppt-master/scripts/image_gen.py --list-backends

Image generation accepts configuration from either of these sources:

  1. Current process environment variables
  2. Project-root .env file as a fallback

Precedence:

  • Current process environment wins
  • .env only fills values that are not already present

One rule is mandatory:

  • IMAGE_BACKEND must be set explicitly

The tool does not guess a provider from API keys, and it does not default to Gemini.
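
To make the precedence rule concrete, here is a simplified sketch of how two configuration sources could be merged. It is not the code used by image_gen.py: `parse_dotenv` and `resolve_config` are hypothetical names, and the parser only handles plain KEY=VALUE lines (no quoting or escaping).

```python
import os
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_config(dotenv_text: str,
                   environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Merge the two sources: the process environment wins, .env fills gaps."""
    environ = os.environ if environ is None else environ
    config = parse_dotenv(dotenv_text)
    config.update(environ)  # values already in the environment override .env
    if not config.get("IMAGE_BACKEND"):
        raise ValueError("IMAGE_BACKEND must be set explicitly")
    return config


dotenv = "IMAGE_BACKEND=gemini\nGEMINI_API_KEY=from-dotenv\n"
config = resolve_config(dotenv, {"IMAGE_BACKEND": "zhipu"})
print(config["IMAGE_BACKEND"])   # zhipu (environment wins)
print(config["GEMINI_API_KEY"])  # from-dotenv (.env fills the gap)
```

With neither source setting IMAGE_BACKEND, `resolve_config` raises an error instead of picking a provider, which mirrors the mandatory rule above.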

Option A: Configure .env

cp .env.example .env

Edit the .env file with your configuration:

IMAGE_BACKEND=gemini
GEMINI_API_KEY=your-api-key
GEMINI_MODEL=gemini-3.1-flash-image-preview

.env is already in .gitignore and will not be committed to the repository, so your keys stay safe.

Option B: Use current process environment variables

export IMAGE_BACKEND=gemini
export GEMINI_API_KEY=your-api-key
export GEMINI_MODEL=gemini-3.1-flash-image-preview
python3 skills/ppt-master/scripts/image_gen.py "abstract tech background"

This is useful for CI, containers, secret managers, and one-off local runs.

Important:

  • The script reads the environment of the process that launches it; it does not read your shell startup files itself
  • If a tool launches a non-interactive shell and your ~/.bashrc or ~/.zshrc is not sourced, those exports may not be visible
  • In that situation, .env is usually the more stable local choice
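
If you launch the tool from another Python program, one way to avoid depending on shell startup files is to pass the variables explicitly. The sketch below illustrates the idea with a stand-in child process that only prints the variable it received; a real call would pass the image_gen.py command line shown above instead.

```python
import os
import subprocess
import sys

# Start from the current environment and add the settings explicitly,
# so nothing depends on ~/.bashrc or ~/.zshrc being sourced.
child_env = {**os.environ, "IMAGE_BACKEND": "gemini"}

# Stand-in child process: it only prints the variable it received.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['IMAGE_BACKEND'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # gemini
```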

💡 Provider-specific config: Put credentials and overrides in provider namespaces such as GEMINI_API_KEY, GEMINI_MODEL, GEMINI_BASE_URL, OPENAI_API_KEY, QWEN_MODEL, ZHIPU_MODEL, VOLCENGINE_API_KEY, and REPLICATE_API_TOKEN.

💡 No global image credentials: IMAGE_API_KEY, IMAGE_MODEL, and IMAGE_BASE_URL are intentionally not supported. Use provider-specific variables only.

💡 Multi-provider setup: You can keep multiple providers in one .env or environment, but IMAGE_BACKEND must explicitly select the active one.

Example:

IMAGE_BACKEND=zhipu

GEMINI_API_KEY=your-gemini-key
GEMINI_MODEL=gemini-3.1-flash-image-preview

ZHIPU_API_KEY=your-zhipu-key
ZHIPU_MODEL=glm-image

Switch providers by changing IMAGE_BACKEND only.
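
The selection rule can be pictured as a filter over the configuration. This is only an illustrative sketch: `active_provider_settings` is a hypothetical function, and it assumes that each provider's variables start with the upper-cased backend name, which matches the examples in this README but may not hold for every backend.

```python
from typing import Dict, Mapping


def active_provider_settings(config: Mapping[str, str]) -> Dict[str, str]:
    """Return only the variables that belong to the backend named by IMAGE_BACKEND."""
    backend = config.get("IMAGE_BACKEND")
    if not backend:
        raise ValueError("IMAGE_BACKEND must be set explicitly")
    # Assumption: each provider's variables start with its upper-cased name.
    prefix = backend.upper() + "_"
    return {key: value for key, value in config.items() if key.startswith(prefix)}


config = {
    "IMAGE_BACKEND": "zhipu",
    "GEMINI_API_KEY": "your-gemini-key",
    "GEMINI_MODEL": "gemini-3.1-flash-image-preview",
    "ZHIPU_API_KEY": "your-zhipu-key",
    "ZHIPU_MODEL": "glm-image",
}
print(active_provider_settings(config))
# {'ZHIPU_API_KEY': 'your-zhipu-key', 'ZHIPU_MODEL': 'glm-image'}
```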

💡 Recommendation: Prefer the core backends unless you have a clear reason to use an extended or experimental provider.

💡 AI Image Generation Tip: For AI-generated images, we recommend generating them in Gemini and selecting Download full size for higher resolution. Gemini images have a star watermark in the bottom right corner, which can be removed using gemini-watermark-remover or this project's skills/ppt-master/scripts/gemini_watermark_remover.py.


📁 Project Structure

ppt-master/
├── skills/
│   └── ppt-master/                 # Main skill source
│       ├── SKILL.md                #   Main entry: workflow definition
│       ├── workflows/              #   Workflow entry files
│       ├── references/             #   Role definitions and specs
│       ├── scripts/                #   Tool scripts
│       └── templates/              #   Layouts, charts, icons
├── examples/                       # Example projects
├── projects/                       # User project workspace
├── AGENTS.md                       # General AI agent entry
└── CLAUDE.md                       # Dedicated Claude Code CLI entry

🛠️ Common Commands

# Initialize project
python3 skills/ppt-master/scripts/project_manager.py init <project_name> --format ppt169

# Archive source materials into the project folder
python3 skills/ppt-master/scripts/project_manager.py import-sources <project_path> <source_file_or_url...>

# Note: files outside the workspace are copied by default; files already in the workspace are moved into sources/

# PDF to Markdown
python3 skills/ppt-master/scripts/pdf_to_md.py <PDF_file>

# DOCX / Office documents to Markdown (requires pandoc)
python3 skills/ppt-master/scripts/doc_to_md.py <DOCX_file>

# Post-processing (run in order)
python3 skills/ppt-master/scripts/total_md_split.py <project_path>
python3 skills/ppt-master/scripts/finalize_svg.py <project_path>
python3 skills/ppt-master/scripts/svg_to_pptx.py <project_path> -s final
# Default: generates two files — native shapes (.pptx) + SVG reference (_svg.pptx)
# Use --only native  to skip SVG reference version
# Use --only legacy  to only generate SVG image version
# Default transition: fade (0.5s). Disable with: -t none
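
If you prefer to drive the three post-processing steps from Python, the sketch below builds the same commands in the required order. The helper names and the `projects/demo` path are hypothetical; the script paths and the `-s final` flag are taken from the commands above.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

SCRIPTS_DIR = Path("skills/ppt-master/scripts")


def postprocess_commands(project_path: str) -> List[List[str]]:
    """Build the three post-processing commands in the order they must run."""
    return [
        [sys.executable, str(SCRIPTS_DIR / "total_md_split.py"), project_path],
        [sys.executable, str(SCRIPTS_DIR / "finalize_svg.py"), project_path],
        [sys.executable, str(SCRIPTS_DIR / "svg_to_pptx.py"), project_path, "-s", "final"],
    ]


def run_postprocess(project_path: str) -> None:
    """Run each step in turn; check=True stops at the first failing script."""
    for command in postprocess_commands(project_path):
        subprocess.run(command, check=True)


# Preview the commands without running them (hypothetical project path).
for command in postprocess_commands("projects/demo"):
    print(" ".join(command[1:]))
```

Call `run_postprocess("<project_path>")` from the repository root to execute the steps. Stopping at the first failure keeps a later step from running on incomplete output.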

📖 For script docs, start with Tools Usage Guide, then jump to Conversion, Project, SVG Pipeline, Image, or Troubleshooting


🎨 Create Custom Template

Want to turn a PPT you love into a reusable template for PPT Master? Here's how:

Step 1 — Prepare Screenshots

Take screenshots of the key page types from your reference PPT — cover page, table of contents, chapter divider, content page, and closing page. Save them as images in a single folder with clear, descriptive filenames (e.g., cover.png, toc.png, chapter.png, content.png, closing.png).

Step 2 — Let AI Create the Template

Use an AI coding agent (Claude Code, Codex, etc.) and ask it to use the PPT Master /create-template workflow to convert your screenshots into a template. In your prompt, provide:

  • The template's English name and Chinese name
  • The intended use case (e.g., government reports, premium consulting, product launches)
  • The desired visual effects and color palette to apply when this template is used
  • Whether to enable AI image generation

Step 3 — Wait for the Result

The AI agent will handle the rest — analyzing your screenshots, building the layout definitions, and registering the template so it appears as a selectable option in the PPT Master workflow.

💡 Tip: The more specific you are about the style and use case, the better the generated template will match your expectations.


❓ FAQ

Q: Can I edit the generated presentations?

Yes! The default export (.pptx) produces native PowerPoint shapes — all text, graphics, and colors are directly editable in PowerPoint without any conversion. An SVG reference version (_svg.pptx) is also generated; for that file, select the content and use "Convert to Shape" to unlock editing. Requires Office 2016 or later.

Q: What's the difference between the three Executors?

  • Executor_General: General scenarios, flexible layout
  • Executor_Consultant: General consulting, data visualization
  • Executor_Consultant_Top: Top consulting (MBB level), 5 core techniques

Q: Are the charts in the generated PPTX editable?

Charts are rendered as custom-designed SVG graphics converted to native PowerPoint shapes — not Excel-driven chart objects. This gives them a polished, high-fidelity appearance that often looks better than default PowerPoint charts. However, the underlying data is not editable via PowerPoint's chart editor. If you need a live, data-driven chart (e.g., one you can update by editing a spreadsheet), you will need to manually replace it with a native PowerPoint chart after export.

📖 For more questions, see SKILL.md and AGENTS.md


🔧 Technical Design

The pipeline: AI generates SVG → post-processing converts to DrawingML (PPTX).

The full flow breaks into three stages:

Stage 1 — Content Understanding & Design Planning
Source documents (PDF/DOCX/URL/Markdown) are converted to structured text. The Strategist role analyzes the content, plans the slide structure, and confirms the visual style, producing a complete design specification.

Stage 2 — AI Visual Generation
The Executor role generates each slide as an SVG file. The output of this stage is a design draft, not a finished product.

Stage 3 — Engineering Conversion
Post-processing scripts convert SVG to DrawingML. Every shape becomes a real native PowerPoint object — clickable, editable, recolorable — not an embedded image.


Why SVG?

SVG sits at the center of this pipeline. The choice was made by elimination.

Direct DrawingML generation seems most direct — skip the intermediate format, have AI output PowerPoint's underlying XML. But DrawingML is extremely verbose; a simple rounded rectangle requires dozens of lines of nested XML. AI has far less training data for it than SVG, output is unreliable, and debugging is nearly impossible by eye.

HTML/CSS is one of the formats AI knows best. But HTML and PowerPoint have fundamentally different world views. HTML describes a document — headings, paragraphs, lists — where element positions are determined by content flow. PowerPoint describes a canvas — every element is an independent, absolutely positioned object with no flow and no context. This isn't just a layout calculation problem; it's a structural mismatch. Even if you solved the browser layout engine problem (what Chromium does in millions of lines of code), an HTML <table> still has no natural mapping to a set of independent shapes on a slide.

WMF/EMF (Windows Metafile) is Microsoft's own native vector graphics format and shares direct ancestry with DrawingML — the conversion loss would be minimal. But AI has essentially no training data for it, so this path is dead on arrival. Notably, even Microsoft's own format loses to SVG here.

SVG as embedded images is the simplest path — render each slide as an image and embed it. But this destroys editability entirely: shapes become pixels, text cannot be selected, colors cannot be changed. No different from a screenshot.

SVG wins because it shares the same world view as DrawingML: both are absolute-coordinate 2D vector graphics formats built around the same concepts:

| SVG | DrawingML |
| --- | --- |
| `<path d="...">` | `<a:custGeom>` |
| `<rect rx="...">` | `<a:prstGeom prst="roundRect">` |
| `<circle>` / `<ellipse>` | `<a:prstGeom prst="ellipse">` |
| `transform="translate/scale/rotate"` | `<a:xfrm>` |
| `linearGradient` / `radialGradient` | `<a:gradFill>` |
| `fill-opacity` / `stroke-opacity` | `<a:alpha>` |

The conversion is a translation between two dialects of the same idea — not a format mismatch.
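
To show how small that translation can be for a single shape, here is a minimal sketch that maps one SVG `<rect>` to the corresponding `<a:xfrm>` and `<a:prstGeom>` elements. It is not the converter used by svg_to_pptx.py. It assumes unitless SVG coordinates at 96 pixels per inch (so 1 px = 9525 EMU), ignores transforms and styling, and does not carry the corner radius over into the preset's adjustment values.

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
EMU_PER_PX = 9525  # 914400 EMU per inch / 96 px per inch
ET.register_namespace("a", A_NS)


def px_to_emu(value: str) -> str:
    """Convert a unitless SVG length to EMU, DrawingML's integer unit."""
    return str(round(float(value) * EMU_PER_PX))


def rect_to_drawingml(svg_rect: str) -> list:
    """Translate one SVG <rect> into <a:xfrm> and <a:prstGeom> elements."""
    rect = ET.fromstring(svg_rect)
    xfrm = ET.Element(f"{{{A_NS}}}xfrm")
    ET.SubElement(xfrm, f"{{{A_NS}}}off",
                  x=px_to_emu(rect.get("x", "0")), y=px_to_emu(rect.get("y", "0")))
    ET.SubElement(xfrm, f"{{{A_NS}}}ext",
                  cx=px_to_emu(rect.get("width", "0")), cy=px_to_emu(rect.get("height", "0")))
    # A non-zero rx maps to the rounded-rectangle preset, otherwise a plain rectangle.
    preset = "roundRect" if float(rect.get("rx", "0")) > 0 else "rect"
    geom = ET.Element(f"{{{A_NS}}}prstGeom", prst=preset)
    ET.SubElement(geom, f"{{{A_NS}}}avLst")
    return [xfrm, geom]


for element in rect_to_drawingml('<rect x="10" y="20" width="200" height="100" rx="8"/>'):
    print(ET.tostring(element, encoding="unicode"))
```

Position and size carry over as a change of units, and the shape type becomes a preset name. That is the sense in which the two formats share one model.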

SVG is also the only format that simultaneously satisfies every role in the pipeline: AI can reliably generate it, humans can preview and debug it in any browser, and scripts can precisely convert it — all before a single line of DrawingML is written.


🤝 Contributing

Contributions are welcome!

  1. Fork this repository
  2. Create your branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contribution Areas: 🎨 Design templates · 📊 Chart components · 📝 Documentation · 🐛 Bug reports · 💡 Feature suggestions


📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

  • SVG Repo - Open source icon library
  • Robin Williams - CRAP design principles
  • McKinsey, Boston Consulting, Bain - Design inspiration

📮 Contact


🌟 Star History

If this project helps you, please give it a ⭐ Star!


☕ Sponsor

If this project saves you time, consider buying me a coffee to keep the tokens burning!

GitHub Sponsors

Alipay / 支付宝


Made with ❤️ by Hugo He
