Mck-ppt-design-skill

Consulting-firm-style PowerPoint design system for AI agents. 70 layout patterns, flat design, python-pptx. A Mck-style PPT design system.

MCK PPT Design Skill

AI-native PowerPoint design system — 70 layouts · BLOCK_ARC chart engine · post-generation QA pipeline · icon library · Python runtime

License
Copyright

Copyright © 2024-2026 Kaku Li. Licensed under Apache 2.0. See NOTICE for details.

English · Chinese Documentation


Community

WeChat Group

Scan the QR code to join the community.

Discord Server

Join our global community on Discord.


⚡ v1.x → v2.0 — The Real Shift: GPU → CPU

v2.0 is not just about saving tokens on chart-heavy decks. The fundamental change is moving ~80% of the compute from GPU (LLM inference) to CPU (deterministic Python execution).

In v1.x, every pixel of every slide was generated by the AI model — coordinates, colors, spacing, all produced token-by-token through GPU inference. v2.0 turns this inside out: the AI makes high-level decisions ("use a donut chart, 3 segments"), then CPU-side Python functions handle the 80% of work that never needed a language model in the first place.

The Core Problem in v1.x

v1.x was a pure-GPU architecture. The entire SKILL.md (6,100 lines of design specs) lived in the AI's context window. For every slide, the model had to:

  1. Read the layout spec from context (~500 tokens)
  2. Compute exact pixel coordinates, font sizes, color codes (~200 tokens of "thinking")
  3. Output raw add_shape() / add_rect() calls with hardcoded numbers (~800 tokens per slide)

For charts, this was catastrophic: a complex donut chart could require up to 2,800 individual add_rect() calls, every one of them generated token by token on the GPU. A single complex chart could take ~2 minutes of inference time and produce a 5 MB file, and a 30-slide deck could burn 40,000+ output tokens.

The AI was doing work that a for loop could do. That's the real problem.

How v2.0 Fixes It: Functionalization

The key insight: most PPT generation is deterministic computation, not language understanding. v2.0 extracts that deterministic work into Python functions that run on CPU:

from math import cos, radians, sin

# v1.x: GPU computes every coordinate (200+ tokens per chart)
for angle in range(0, 360):
    theta = radians(angle)  # cos()/sin() expect radians, not degrees
    add_rect(slide, x + cos(theta)*r, y + sin(theta)*r, ...)  # Each line = GPU tokens

# v2.0: GPU decides WHAT to build, CPU handles HOW
eng.donut_chart(title='Revenue', segments=[('A', 45, NAVY), ('B', 35, BLUE)])
# ↑ AI outputs ~20 tokens → CPU executes 2,745 lines of deterministic Python

What moved from GPU to CPU:

  • Coordinate math: pixel positions, spacing calculations, dynamic sizing
  • Shape rendering: BLOCK_ARC arcs replace rect-block stacking (1 shape vs 2,800)
  • XML cleanup: _clean_shape(), full_cleanup(), CJK font injection
  • Layout logic: column widths, row heights, overflow prevention
  • Quality assurance: the entire QA + auto-fix pipeline [v2.3]

What stays on GPU:

  • Content decisions — which layout to use, what text to write, storyline flow
  • Design judgment — emphasis, hierarchy, visual storytelling
  • Context understanding — adapting to user intent, tone, audience
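
As a rough illustration of the coordinate math that moves to the CPU, the sketch below turns segment values into start and end angles for a donut chart. The name segment_angles and its signature are assumptions made for this example; they are not taken from mck_ppt, whose actual implementation may differ.

```python
from typing import List, Tuple


def segment_angles(values: List[float], start_deg: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start, end) angles in degrees for each donut segment.

    Hypothetical helper: each segment's sweep is proportional to its value.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("segment values must sum to a positive number")
    angles = []
    cursor = start_deg
    for value in values:
        sweep = 360.0 * value / total
        angles.append((cursor, cursor + sweep))
        cursor += sweep
    return angles


# Three segments; the model only has to supply the three numbers.
arcs = segment_angles([45, 35, 20])
for start, end in arcs:
    print(f"{start:6.1f} -> {end:6.1f}")
```

The model only supplies the values; the loop that would otherwise be spelled out token by token runs as ordinary Python.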

v1.x vs v2.0 — Technical Comparison

| Aspect | v1.x (Pure GPU) | v2.0 (GPU + CPU) |
|---|---|---|
| Compute split | ~100% GPU (all tokens) | ~20% GPU (decisions) + ~80% CPU (execution) |
| Chart rendering | add_rect() block stacking (100–2,800 shapes/chart) | BLOCK_ARC native arcs (3–4 shapes/chart) |
| Code generation | AI writes raw add_shape() / coordinate math per slide | AI calls eng.donut_chart(), eng.cover(), etc. (70 high-level methods) |
| Rounds per 30-slide deck | 10–15 (trial-and-error) | 3–4 (deterministic) |
| Output tokens per deck | 40,000–60,000 | 9,000–12,000 |
| Chart generation time | ~2 min (GPU inference) | <1 sec (CPU execution) |
| File size (chart-heavy) | 2–5 MB | 0.5–1 MB |
| File corruption defense | Basic XML cleanup | Three-layer defense (p:style, shadow, 3D sanitization) |
| CJK handling | Manual font setting (GPU) | Automatic set_ea_font() on all CJK text runs (CPU) |
| Architecture | Single-tier (SKILL.md only) | Three-tier (SKILL.md + Python engine + post-processing) |
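
To make the token row concrete, the arithmetic below computes the reduction implied by the two ranges in the table. It pairs the low ends and the high ends of the ranges, which is an assumption about how to compare them, and it covers output tokens only, not total compute.

```python
def reduction(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return 1.0 - after / before


# Output tokens per deck, taken from the comparison table.
low = reduction(40_000, 9_000)
high = reduction(60_000, 12_000)
print(f"{low:.1%} to {high:.1%}")  # prints: 77.5% to 80.0%
```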

v2.0 Three-Tier Architecture (with the v2.3 Review Tier Added)

┌─────────────────────────────────────────────────────────┐
│  Tier 1: SKILL.md (Design Specification)                │
│  ├── 70 layout patterns with exact coordinates          │
│  ├── Color system + typography hierarchy                │
│  ├── 9 production guard rails                           │
│  └── BLOCK_ARC chart rendering spec                     │
├─────────────────────────────────────────────────────────┤
│  Tier 2: mck_ppt/ (Python Runtime Engine)      [NEW]    │
│  ├── engine.py — 70 high-level layout methods           │
│  ├── core.py — Drawing primitives + XML cleanup         │
│  ├── constants.py — Colors, typography, grid constants  │
│  └── __init__.py — Clean public API                     │
├─────────────────────────────────────────────────────────┤
│  Tier 3: Review + Auto-fix Pipeline           [v2.3]    │
│  ├── review.py — Dual QA + AutoFixPipeline              │
│  │   ├── NarrativeReviewer (content density, lang mix)  │
│  │   ├── AutoFixPipeline (priority-chain overflow fix)  │
│  │   └── Peer font harmonization (same-level uniform)   │
│  ├── qa.py — Layout QA (overflow, overlap, collision)   │
│  │   ├── text_overflow + body_overflow                  │
│  │   ├── text_line_collision (text vs separator lines)  │
│  │   ├── chart_legend_overflow (legend right bound)     │
│  │   └── peer_font_inconsistency (same-Y alignment)     │
│  ├── deck_builder.py — Storyline orchestrator  [v2.3.2] │
│  │   ├── build(storyline) → auto QA → cleanup           │
│  │   └── storylines/ (reusable theme templates)         │
│  └── Gate: 0 ERROR = PASS, otherwise iterate & fix      │
├─────────────────────────────────────────────────────────┤
│  Tier 4: Post-Processing Pipeline                       │
│  ├── Three-layer file corruption defense                │
│  ├── Full XML sanitization (p:style, shadow, 3D)        │
│  └── CJK font injection (KaiTi for East Asian text)     │
└─────────────────────────────────────────────────────────┘

Tier 1 tells the AI what to build — every layout has pixel-perfect coordinates, every color has a hex code, every edge case has a documented solution.

Tier 2 [NEW] is a complete Python library. Instead of the AI writing add_shape() from scratch, it calls eng.cover(), eng.toc(), eng.donut_chart() — one method per layout pattern, 2,745 lines of production-tested code.

Tier 3 [v2.3] is the post-generation quality gate. Every generated .pptx is automatically reviewed for text overflow, shape collision, and peer font inconsistency. The AutoFixPipeline iterates through a priority chain (remove redundancy → compress sentences → restructure → micro-adjust font size) until 0 ERROR, then harmonizes peer font groups. No manual QA needed.

Tier 4 sanitizes the saved file (p:style, shadow, and 3D XML) and injects CJK fonts, which automatically prevents the #1 cause of "file needs repair" errors in AI-generated PPTs.
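
The kind of sanitization Tier 4 performs can be sketched with the standard library alone. The example below removes every p:style element from a shape's XML; strip_style is a hypothetical name, the sample XML is invented for illustration, and the project's own _clean_shape() and full_cleanup() are not shown in this README and may work differently (for example, they also handle shadow and 3D effects).

```python
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}


def strip_style(shape_xml: str) -> str:
    """Return the shape XML with all <p:style> elements removed (illustrative only)."""
    root = ET.fromstring(shape_xml)
    style_tag = f"{{{NS['p']}}}style"
    # Snapshot the tree first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == style_tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Invented sample: a shape carrying a theme style reference.
sample = (
    f'<p:sp xmlns:p="{NS["p"]}" xmlns:a="{NS["a"]}">'
    "<p:spPr/>"
    '<p:style><a:lnRef idx="1"><a:schemeClr val="accent1"/></a:lnRef></p:style>'
    "</p:sp>"
)
cleaned = strip_style(sample)
print(cleaned)
```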


🛡️ Production Guard Rails

13 rules hard-won from 50+ production generations:

| # | Rule | What It Prevents |
|---|---|---|
| 1 | Never use connectors | File corruption from connector p:style |
| 2 | Always call _clean_shape() | Shadow/3D artifacts leaking into slides |
| 3 | Run full_cleanup() after save | Residual theme effects |
| 4 | Set set_ea_font() on CJK text | Chinese characters rendering as boxes |
| 5 | Use add_hline() not add_line() | Connector-based lines causing repair prompts |
| 6 | Validate spacing before save | Overlapping text boxes |
| 7 | Check overflow on long content | Text truncation in fixed-height boxes |
| 8 | Dynamic sizing for variable-count | 3 items vs 7 items need different spacing |
| 9 | Mandatory BLOCK_ARC for circular charts | Rect-block bloat (v1.x legacy problem) |
| 10 | Peer font consistency check [v2.3] | Same-row shapes with different font sizes after autofix |
| 11 | Text-line collision check [v2.3] | Text overlapping separator lines in dense layouts |
| 12 | Post-generation QA gate [v2.3] | 0 ERROR mandatory before delivery; no silent defects |
| 13 | Chart legend overflow check [v2.3.2] | Legend/label elements extending beyond content area right boundary |
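
Rule 8 (dynamic sizing) is easiest to see with numbers. The sketch below spaces a variable number of rows evenly inside a fixed area; row_layout, its parameters, and the default gap are assumptions for illustration and are not the engine's actual layout code.

```python
from typing import List, Tuple


def row_layout(count: int, area_top: float, area_height: float,
               gap: float = 0.1) -> List[Tuple[float, float]]:
    """Return (top, height) for each row so that `count` rows fill the area evenly.

    Units are arbitrary (inches here); the gap value is a hypothetical default.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    row_height = (area_height - gap * (count - 1)) / count
    if row_height <= 0:
        raise ValueError("too many rows for the available height")
    return [(area_top + i * (row_height + gap), row_height) for i in range(count)]


rows3 = row_layout(3, area_top=1.5, area_height=4.0)
rows7 = row_layout(7, area_top=1.5, area_height=4.0)
print(round(rows3[0][1], 3), round(rows7[0][1], 3))
```

With the same area, 3 items get taller rows than 7 items, and in both cases the last row ends exactly at the bottom of the area.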

🔍 v2.3 — Post-Generation Review + Auto-fix Pipeline

v2.3 adds a mandatory quality gate that runs after every PPT generation. The core philosophy: generate first, converge later. If QA does not pass, the deck is not delivered.

The Problem v2.3 Solves

MckEngine's 70 layout methods each handle "put content into shapes", but nobody checked whether the content actually fits. A title 1 character too long → text overflow. An autofix that shrinks font per-shape → peer inconsistency (same row, 5 different font sizes). These defects are invisible in code but obvious when you open the PPT.

Four-Stage Pipeline

Generate → Dual QA → Auto-fix (iterate) → Peer Harmonize → Final Gate

Stage 1: Generate — MckEngine runs normally. No content trimming, no prediction.

Stage 2: Dual QA — Layout QA (overflow, overlap, collision, whitespace, fonts) + Narrative QA (text density, title length, language mix) run in parallel. Read-only — detect only, don't modify.

Stage 3: Auto-fix — For each text overflow ERROR, try fixes in strict priority order:

  1. Remove redundancy (weak hedging, filler phrases)
  2. Compress sentences ("because A, therefore B" → "A→B")
  3. Restructure (trim excess semicolon-separated clauses)
  4. Font micro-adjust (shrink 1pt/round, floor: title≥20pt, body≥11pt, footnote≥9pt)

Never changes layout — only text content and font sizes. Iterates until 0 ERROR or max rounds.
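
The priority chain above can be sketched as a loop that applies the first fixer able to change something and then re-checks. Everything in this example is hypothetical: the TextBox model, the character-capacity heuristic in fits(), the filler list, and the regex are stand-ins, and the real AutoFixPipeline in review.py may be structured differently.

```python
import re
from dataclasses import dataclass

FILLERS = ("it should be noted that ", "basically ")  # hypothetical filler list


@dataclass
class TextBox:
    text: str
    font_pt: float
    chars_at_12pt: int       # hypothetical capacity of the box at 12pt
    floor_pt: float = 11.0   # body-text floor from the list above


def fits(box: TextBox) -> bool:
    # Crude capacity model: smaller fonts fit proportionally more characters.
    return len(box.text) <= box.chars_at_12pt * 12.0 / box.font_pt


def _apply(box: TextBox, new_text: str) -> bool:
    changed = new_text != box.text
    box.text = new_text
    return changed


def remove_redundancy(box: TextBox) -> bool:
    text = box.text
    for filler in FILLERS:
        text = text.replace(filler, "")
    return _apply(box, text)


def compress(box: TextBox) -> bool:
    return _apply(box, re.sub(r"\s+(?:leads to|results in)\s+", " → ", box.text))


def restructure(box: TextBox, keep: int = 2) -> bool:
    clauses = [c.strip() for c in box.text.split(";") if c.strip()]
    return _apply(box, "; ".join(clauses[:keep]))


def shrink_font(box: TextBox) -> bool:
    if box.font_pt - 1 >= box.floor_pt:
        box.font_pt -= 1
        return True
    return False


FIXERS = (remove_redundancy, compress, restructure, shrink_font)  # strict priority order


def autofix_box(box: TextBox, max_rounds: int = 5) -> bool:
    for _ in range(max_rounds):
        if fits(box):
            return True
        # any() stops at the first fixer that reports a change.
        if not any(fixer(box) for fixer in FIXERS):
            return False  # nothing left to try
    return fits(box)


box = TextBox(
    "it should be noted that revenue growth results in higher margin; costs fall; risk drops",
    font_pt=12.0,
    chars_at_12pt=40,
)
ok = autofix_box(box)
print(ok, box.font_pt, box.text)
```

Font shrinking is reached only after the three text-level fixers stop making progress, which mirrors the ordering described above.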

Stage 4: Peer Harmonize + Gate. Every shape in a same-Y-position peer group is set to the smallest font size in that group, min(sizes). Then the final QA runs: 0 ERROR = ✅ PASS.
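
A minimal sketch of the harmonization step, assuming peers are simply shapes that share the same top coordinate. The Shape dataclass and harmonize_peers are illustrative names, not the project's API, and the real grouping logic may use tolerances or other criteria.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Shape:
    name: str
    top: int         # Y position (for example in EMU)
    font_pt: float


def harmonize_peers(shapes: List[Shape]) -> List[Shape]:
    """Set every shape in a same-Y group to the smallest font size in that group."""
    groups: Dict[int, List[Shape]] = defaultdict(list)
    for shape in shapes:
        groups[shape.top].append(shape)
    for peers in groups.values():
        if len(peers) > 1:
            smallest = min(p.font_pt for p in peers)
            for p in peers:
                p.font_pt = smallest
    return shapes


# One row of five cards with mismatched sizes, plus a title on its own row.
row = [Shape(f"card{i}", top=2_000_000, font_pt=pt)
       for i, pt in enumerate((18, 12, 17, 14, 13))]
title = Shape("title", top=500_000, font_pt=20)
harmonize_peers(row + [title])
print([s.font_pt for s in row], title.font_pt)
```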

Usage

from mck_ppt import MckEngine
from mck_ppt.review import autofix

eng = MckEngine(total_slides=3)
eng.cover(title='My Title', subtitle='Sub')
# ... add slides ...
eng.save('output/deck.pptx')

# One line — review + fix + gate
result = autofix('output/deck.pptx', max_rounds=5)
result.print_summary()  # ✅ PASS or ❌ FAIL

New QA Rules in v2.3

| Rule | Severity | What It Catches |
|---|---|---|
| peer_font_inconsistency | ERROR | Same-row shapes with different font sizes (e.g. 18/12/17/14/13pt after per-shape autofix) |
| text_line_collision | ERROR/WARN | Text content overlapping or nearly touching separator lines |
| density | WARNING | Text exceeding character-per-box-height limits |
| title_long | WARNING | Action titles exceeding 45 characters |
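
As an illustration of how a text_line_collision check might distinguish ERROR from WARN, the sketch below tests a text box against a horizontal separator line. The geometry model, the warn_margin value, and the function name are assumptions; qa.py's real thresholds are not given in this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    left: float
    top: float
    width: float
    height: float


@dataclass
class HLine:
    left: float
    y: float
    width: float


def text_line_collision(box: Box, line: HLine, warn_margin: float = 0.05) -> Optional[str]:
    """Return "ERROR" if the line crosses the box, "WARN" if it nearly touches, else None."""
    # Horizontal overlap validation: a line beside the box cannot collide with it.
    overlaps_x = box.left < line.left + line.width and line.left < box.left + box.width
    if not overlaps_x:
        return None
    bottom = box.top + box.height
    if box.top < line.y < bottom:
        return "ERROR"
    if min(abs(line.y - box.top), abs(line.y - bottom)) <= warn_margin:
        return "WARN"
    return None


text = Box(left=1.0, top=1.0, width=4.0, height=1.0)
print(text_line_collision(text, HLine(left=0.0, y=1.5, width=10.0)))
print(text_line_collision(text, HLine(left=0.0, y=2.03, width=10.0)))
```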

Key Bug Fix: _estimate_text_height

Fixed a bug (a 2-line change) that caused ~27% height overestimation. The function checked only run.font.size, which returns None when the size is inherited; it now falls back to para.font.size. This single fix eliminated a large number of false-positive overflow reports.
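
The fallback described here can be sketched with duck-typed stand-ins for runs and paragraphs. The helper name effective_font_size, the plain-number sizes, and the 18pt last-resort default are assumptions for this example; the actual _estimate_text_height code is not reproduced in this README.

```python
from types import SimpleNamespace

DEFAULT_PT = 18.0  # hypothetical last-resort default


def effective_font_size(run, paragraph, default: float = DEFAULT_PT) -> float:
    """Prefer the run's own size, then the paragraph's, then a default."""
    for size in (run.font.size, paragraph.font.size):
        if size is not None:
            return size
    return default


def make(size):
    # Stand-in for an object exposing `.font.size`, as runs and paragraphs do.
    return SimpleNamespace(font=SimpleNamespace(size=size))


inherited = effective_font_size(make(None), make(12.0))   # run inherits from paragraph
explicit = effective_font_size(make(14.0), make(12.0))    # run sets its own size
print(inherited, explicit)
```

Without the paragraph-level fallback, an inherited size would be treated as missing and replaced by a default, which is one way a height estimate can drift upward.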


🧬 v3.0 Vision — A Self-Evolving Design System

ETA: 1–2 weeks. Currently in active development. The goal for v3.0 is to transform this skill from a static tool into a self-evolving digital life form — one that learns, adapts, and improves its own design capabilities autonomously.

The Problem: The Human Bottleneck

Today, the skill author (human) is the bottleneck. Every new layout, every formatting fix, every edge case requires manual intervention:

  • A new PPT style appears → author must manually encode it into SKILL.md or engine.py
  • A layout has subtle spacing issues → author must eyeball, debug, adjust, re-test
  • Content doesn't match the right template → author must define matching rules

The author spends 80% of their time on fitting and fine-tuning — adjusting pixel coordinates, testing text overflow, iterating on formatting details. This is exactly the kind of work that should be automated.

The Solution: Adaptation Layer

v3.0 introduces a new Adaptation Layer that sits between the human author and the AI builder. This layer automates the repetitive design-refinement work that currently requires human iteration:

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Author (Human)                                                │
│   ├── Defines design philosophy and brand guidelines            │
│   ├── Sets quality standards and aesthetic direction            │
│   └── Reviews and approves new learned patterns                 │
│                                                                 │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                                 │
│   Adaptation Layer (NEW — v3.0)                    [COMING]     │
│   ├── Auto-learns new layout templates from examples            │
│   ├── Self-optimizes spacing, sizing, overflow handling         │
│   ├── Iteratively fits content to templates (no human loop)     │
│   └── Evolves design patterns based on production feedback      │
│                                                                 │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                                 │
│   AI Builder (MckEngine + SKILL.md)                             │
│   ├── Executes layout methods (70 patterns)                     │
│   ├── Renders charts, tables, diagrams                          │
│   └── Applies QA + auto-fix pipeline                            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

What the Adaptation Layer will do:

  1. Auto-learn templates — Feed it a reference PPT, it reverse-engineers the layout into a reusable engine method. No manual coordinate transcription.
  2. Self-optimize formatting — The repetitive cycle of "generate → eyeball → tweak spacing → regenerate" becomes an automated loop. The layer adjusts parameters until the output converges.
  3. Content-template matching — Given content and a library of templates, the layer intelligently selects and adapts the best layout — not just exact match, but creative recombination.
  4. Production feedback loop — Every human correction feeds back into the layer's understanding, making future generations more accurate.

With this capability, the precision and usability of existing templates should improve at a compounding rate rather than linearly, because each improvement carries over to all future generations.

Three Open Challenges

Building a self-evolving design system is hard. Three fundamental problems remain unsolved:

1. The PPTX Stack Was Built for Humans, Not AI

The constraint isn't PowerPoint the product — it's the entire stack underneath.

The python-pptx library is limited. It exposes a fraction of what PPTX XML can do. Many visual effects (morph transitions, SmartArt, advanced gradients) simply have no API. We constantly work around library limitations with raw XML manipulation — which is fragile and hard to maintain.

The PPTX XML format itself is human-oriented. Its architecture — deeply nested XML, theme inheritance, implicit style cascading — was designed for a human clicking buttons in a GUI, not for an AI generating programmatic output. Every "simple" operation (draw a line, set a font) requires navigating layers of XML namespaces, style overrides, and undocumented fallback behaviors. The 13 guard rails in this project exist precisely because the format fights AI at every turn.

The implication: PPTX may not be the long-term endgame for AI-generated presentations. We're actively iterating on HTML/CSS-based outputs (see Mck HTML Design) — a format that is natively programmatic, infinitely flexible, and designed for exactly this kind of generation. The future likely involves format migration, not format optimization.

2. Content × Template Matching

Given arbitrary content and a library of 70+ templates, how do you select the right layout? This is not a simple classification problem. The same bullet list might work as a vertical_steps, numbered_list_panel, or four_column depending on the content density, hierarchy, and visual intent. Deep matching between semantic structure and visual structure requires understanding that goes beyond pattern matching: it needs design intuition. This is an area we are actively exploring in depth.

3. Design Creativity

The current system is excellent at reproducing consultant-grade design within established patterns. But it cannot create — it cannot look at a brand's visual identity and invent a new layout that feels native to that brand. Humans have an aesthetic intuition that generates novel compositions from abstract style principles. Teaching a system to do this — to generate design, not just execute design specs — is the ultimate frontier. Style-conditioned generative design is where we believe this needs to go.


Quick Start

pip install python-pptx lxml

# Option 1: ClawHub (for AI agents)
npx clawhub@latest install mck-ppt-design

# Option 2: Manual
mkdir -p ~/.claude/skills/mck-ppt-design
cp SKILL.md ~/.claude/skills/mck-ppt-design/

# Option 3: Python engine directly
pip install -e .  # or copy mck_ppt/ to your project

Then, in Python:

from mck_ppt import MckEngine
from mck_ppt.constants import NAVY  # assumption: color constants are importable from constants.py

eng = MckEngine(total_slides=30)
eng.cover(title='Q1 2026 Strategy Review', subtitle='Board Presentation')
eng.toc(items=[('1', 'Market Overview', 'Current landscape'), ...])
eng.donut_chart(title='Revenue Mix', segments=[('Product A', 45, NAVY), ...])
eng.save('output/deck.pptx')

Compatibility

| AI Agent | Status | Install Method |
|---|---|---|
| Claude (Anthropic) | ✅ Supported | ClawHub or manual SKILL.md |
| Cursor | ✅ Supported | Add as project rule |
| Codebuddy | ✅ Supported | Load as skill |
| GPT / ChatGPT | ✅ Works | Paste SKILL.md as system prompt |
| Any LLM | ✅ Universal | Feed SKILL.md as context |

Sample Output

Sample slides (images not shown): Cover Page, Content Page, Table Page, 4-Column Layout, Color System, Summary Page.

📁 Project Structure

├── SKILL.md                 # Design specification (290KB, 6100 lines)
├── mck_ppt/                 # Python runtime engine (180KB)
│   ├── __init__.py          # Public API (v2.3.2)
│   ├── engine.py            # 70 layout methods (2,359 lines)
│   ├── core.py              # Drawing primitives + XML cleanup (295 lines)
│   ├── constants.py         # Colors, typography, grid (78 lines)
│   ├── qa.py                # Layout QA engine (820 lines)     [Enhanced v2.3.2]
│   ├── review.py            # Review + auto-fix pipeline       [NEW v2.3]
│   ├── deck_builder.py      # Storyline-driven deck generator  [NEW v2.3.2]
│   ├── cover_image.py       # AI cover image generation        [v2.2]
│   └── storylines/          # Pre-built storyline templates     [NEW v2.3.2]
│       ├── __init__.py
│       └── ai_enterprise.py # 33-slide AI enterprise demo (Chinese)
├── assets/
│   └── icons/               # Pre-built PNG icons (200×200px)  [v2.0.5]
│       ├── icon_person_bust.png
│       ├── icon_shield_check.png
│       ├── icon_people_group.png
│       ├── icon_factory_gear.png
│       ├── icon_circuit_chip.png
│       └── icon_ai_brain.png
├── CHANGELOG.md
├── examples/
│   ├── minimal_example.py
│   ├── staircase_civilization.py    [v2.0.5]
│   └── requirements.txt
├── scripts/
│   ├── minimal_example.py
│   ├── generate_icons.py            [v2.0.5]
│   └── requirements.txt
└── references/
    ├── color-palette.md
    └── layout-catalog.md

📊 Version History

| Version | Date | Highlights |
|---|---|---|
| v2.3.2 | 2026-03-25 | DeckBuilder: storyline-driven deck generator (deck_builder.py) that accepts a storyline list, auto-dispatches to MckEngine methods, and has built-in QA validation; stacked_bar fix: adaptive legend spacing prevents right-side overflow, chart area repositioned for visual balance; new QA rule chart_legend_overflow (detects legend/label exceeding content area, excludes page numbers); storylines/ai_enterprise.py: 33-slide Chinese AI enterprise applications demo using 20+ layout types |
| v2.3.1 | 2026-03-24 | Dynamic row height for numbered_list_panel (fills panel height evenly, eliminates blank space); new QA rule text_line_collision (detects text overlapping separator lines with horizontal overlap validation) |
| v2.3.0 | 2026-03-24 | Post-generation review + auto-fix pipeline: review.py with NarrativeReviewer, AutoFixPipeline (priority-chain: redundancy → compress → restructure → font adjust), peer font harmonization; fix _estimate_text_height paragraph-level font inheritance bug (27% overestimate); new QA rule peer_font_inconsistency; gate: 0 ERROR = PASS. Tested: 14 errors → 0, score 17 → 86 |
| v2.2.0 | 2026-03-23 | AI cover image pipeline via Tencent Hunyuan 2.0 + rembg; eng.cover(..., cover_image='auto'); cover text area widened for image mode; donut chart updated to a true thin-ring geometry with larger inner hole; matrix_2x2 bottom judgment bar spacing fixed to avoid axis overlap |
| v2.0.5 | 2026-03-21 | Unified release: #14→#71, v2.1 SKILL.md rewrite, PNG icon support for #15, icon library (6 icons), narrative detail_rows |
| v2.0.4 | 2026-03-19 | New table_insight() layout (#71), retire three_pillar (#14) |
| v2.0.2 | 2026-03-19 | Adaptive row height for data_table / vertical_steps (overflow prevention) |
| v2.0.1 | 2026-03-19 | before_after template rewrite: white editorial layout with structured data |
| v2.0 | 2026-03-19 | BLOCK_ARC chart engine, Python runtime engine, three-tier architecture |
| v1.10.x | 2026-03-15 | Channel delivery, dynamic sizing |
| v1.9 | 2026-03-12 | Production guard rails (9 rules) |
| v1.8 | 2026-03-10 | Layout expansion: 39 → 70 patterns |
| v1.7 | 2026-03-08 | Data charts: grouped/stacked/horizontal bar |
| v1.0 | 2026-03-02 | Initial release |
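
The v2.3.2 entry describes DeckBuilder as accepting a storyline list and dispatching each step to an engine method. One way such dispatch could work is sketched below with a stub engine. The storyline format (a list of layout-name and keyword-argument pairs), the build function, and StubEngine are all hypothetical; deck_builder.py's real interface is not documented in this README.

```python
from typing import Any, Dict, List, Tuple

Storyline = List[Tuple[str, Dict[str, Any]]]


class StubEngine:
    """Stand-in for MckEngine that only records which layouts were requested."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def cover(self, title: str, subtitle: str = "") -> None:
        self.calls.append("cover")

    def donut_chart(self, title: str, segments: list) -> None:
        self.calls.append("donut_chart")


def build(engine: Any, storyline: Storyline) -> Any:
    """Call engine.<layout>(**kwargs) for each storyline step, in order."""
    for layout, kwargs in storyline:
        method = getattr(engine, layout, None)
        if not callable(method):
            raise ValueError(f"unknown layout: {layout}")
        method(**kwargs)
    return engine


storyline: Storyline = [
    ("cover", {"title": "Q1 Review", "subtitle": "Board"}),
    ("donut_chart", {"title": "Revenue Mix", "segments": [("A", 60), ("B", 40)]}),
]
engine = build(StubEngine(), storyline)
print(engine.calls)
```

Looking methods up by name keeps the storyline as plain data, so a QA pass could run after the loop without the storyline knowing about it.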

Chinese Documentation

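v2.0 Release Notes

The core change in v2.0 is not merely lower token consumption. The real substance is moving roughly 80% of the compute from GPU (LLM inference) to CPU (deterministic Python execution).

In v1.x, every pixel, every coordinate, and every color value was emitted token by token by the AI model through GPU inference. v2.0 flips this model: the AI makes only high-level decisions ("use a donut chart, 3 segments"), and CPU-side Python functions handle the 80% of the work that never needed a language model.

How is this done? Functionalization. The many parameters, coordinates, and rendering rules that used to live in SKILL.md, and that the AI had to output line by line as tokens, became Python functions the CPU can execute directly. A donut chart went from 2,800 add_rect() calls to a single line, eng.donut_chart().

v2.0 is still being validated. If your production work depends on v1.x, upgrade cautiously. If you run into problems, please report them in the WeChat group or on Discord.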

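v3.0 Vision: A Self-Evolving Digital Design Life Form

Expected release within 1-2 weeks. In active development.

The goal of v3.0 is to evolve this Skill from a static tool into a self-evolving digital life form.

The current bottleneck: every new template and every formatting tweak requires manual intervention by the human author. The author spends 80% of their time repeatedly fitting PPTs and adjusting fine formatting details, which is exactly the work that should be automated.

v3.0 will add an Adaptation Layer between the author (human) and the AI Builder:

  • Auto-learn new templates: give it a reference PPT and it reverse-engineers the layout into a reusable engine method
  • Self-optimize formatting: the "generate → eyeball → tweak → regenerate" cycle becomes an automated loop
  • Intelligent content-template matching: given content and a template library, it selects and adapts the best layout
  • Production feedback loop: every human correction feeds back into the Adaptation Layer, making future generations more accurate

With this capability, the precision and usability of existing templates will improve exponentially.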

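Three Directions for Further Work

  1. The PPTX stack was designed for humans, not for AI: the problem is not PowerPoint the product itself but what lies underneath. The python-pptx library exposes only a small part of what PPTX XML can do, and many visual effects have no API. The PPTX XML format itself (deep nesting, theme inheritance, implicit style cascading) was designed for humans clicking GUI buttons and is ill-suited to programmatic output from AI. Our 13 guard rails exist because this format fights AI at every turn. Future iteration may happen more on HTML or other formats.
  2. Deep content × template matching: the same content may suit vertical_steps, numbered_list_panel, or four_column, depending on information density, hierarchy, and visual intent. Deep matching between semantic structure and visual structure needs a further deep dive.
  3. Design creativity: the current system is good at reproducing consulting-grade design within established patterns, but it cannot create a new layout on its own from a brand's style. Style-conditioned generative design is the ultimate frontier.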

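The Problem in v1.x

v1.x rendered donut, pie, and gauge charts by stacking hundreds of small add_rect() blocks. A complex donut could produce 2,800 shapes, a 5 MB file, and ~2 minutes of generation time. The AI had to output the coordinates of every block one by one.

How v2.0 Solves It

It replaces block stacking with PowerPoint's native BLOCK_ARC shape. Each chart segment is now 1 shape instead of hundreds.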

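Technical Comparison

| Aspect | v1.x | v2.0 |
|---|---|---|
| Chart rendering | add_rect() stacking (100–2,800 shapes/chart) | BLOCK_ARC native arcs (3–4 shapes/chart) |
| Code generation | AI hand-writes add_shape() + coordinate math | AI calls eng.donut_chart() and the other 70 high-level methods |
| Interaction rounds per 30-slide deck | 10–15 | 3–4 |
| Output tokens | 40,000–60,000 | 9,000–12,000 |
| Chart generation time | ~2 min | <1 sec |
| File size (chart-heavy) | 2–5 MB | 0.5–1 MB |
| Architecture | Single-tier (SKILL.md only) | Three-tier (SKILL.md + Python engine + post-processing) |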

Quick Start

pip install python-pptx lxml
npx clawhub@latest install mck-ppt-design

Apache 2.0 · © 2026 likaku · GitHub · Issues
