# Claude Code Insights
English | 繁體中文
A best-practices guide for Claude Code's CLAUDE.md, Skills, and Subagents.
This repository compiles and organizes content primarily from Anthropic's official documentation, community articles, and popular GitHub repositories — credit goes to the original authors. This is a curated collection of what I've learned along the way, not original research.
## Who Is This For
Developers ranging from Claude Code beginners to those looking to master Skills and Subagent design patterns.
- Beginners: Start with the CLAUDE.md guide to build a solid foundation
- Advanced: The Skills and Subagent guides cover design patterns and architectural strategies
## Contents
| File | Description | Audience |
|---|---|---|
| claude-md-best-practices.md | CLAUDE.md Best Practices — why not to use /init, the three-layer architecture model, Hooks over directives, anti-patterns | Beginners |
| skills-best-practices.md | Skills Best Practices — file format, loading mechanics, content writing principles, advanced patterns, design templates | Advanced |
| subagent-best-practices.md | Subagent Best Practices — the black-box problem and solutions, result persistence, tool scoping strategies, architectural patterns | Advanced |
The three guides cross-reference each other. Recommended reading order: CLAUDE.md → Skills → Subagent.
## Practical Examples

### Subagent Examples
| Example | Description | Difficulty |
|---|---|---|
| examples/security-reviewer | Dual-verification security audit — read-only /security:review (Semgrep + Codex cross-validation, confidence scoring) + opt-in /security:fix (convergence-hardened Fix-Verify Loop with falsifiable predictions, tiered rollback, hypothesis ledger). Strict review/fix boundary: review is always read-only; fix requires explicit user opt-in and runs in the main conversation. Implements Harness Engineering methodology | Advanced |
### Hook Examples
| Example | Description | Difficulty |
|---|---|---|
| examples/npm-supply-chain-defense | npm supply chain three-layer defense — .npmrc script blocking + PreToolUse Hook checks (registry, OSV.dev, version resolution, CLI syntax validation) + Semgrep supply chain scanning. Includes 42 regression tests | Advanced |
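The first of the three layers above is plain npm configuration. A minimal sketch (the Hook and Semgrep layers live in the example itself):

```ini
# .npmrc — stop lifecycle scripts (preinstall/postinstall) from running
# automatically on install, the most common npm supply-chain execution vector.
ignore-scripts=true
```

With scripts blocked globally, packages that genuinely need a build step must have their scripts run explicitly (e.g. via `npm rebuild`), which turns script execution into a deliberate, reviewable action.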
### Skills Examples
The following Skills are adapted from mattpocock/skills (implementing concepts like design trees and TDD vertical slices), reworked according to this repository's best practices guide: templates extracted to templates/, reference materials moved to reference/, and Gotchas sections added.
| Example | Description | Use Case |
|---|---|---|
| examples/grill-me | Design Interrogation — traverses every branch of a decision tree, resolving design decision dependencies one by one | Pre-coding design stress test |
| examples/write-prd | Write PRD — turns design decisions or rough ideas into a structured requirements document, bridging grill-me and prd-to-plan | Requirements authoring and structuring |
| examples/tdd | TDD Workflow — red-green-refactor vertical slices with test examples, mock guidelines, and deep module design reference | Feature development and bug fixes |
| examples/prd-to-plan | PRD to Implementation Plan — breaks requirements into tracer bullet vertical slices, outputs to ./plans/, with optional Codex review for high-risk plans | Requirements decomposition and phase planning |
| examples/write-a-skill | Skill Builder Meta-Skill — content type decisions, invocation control, security configuration, Gotchas iteration loop. Includes eval workflow reference | Creating new Skills |
| examples/skill-eval-toolkit | Skill Eval Toolkit — eval-driven testing, quantitative benchmarking, blind A/B comparison, description trigger optimization, and SKILL.md body autopilot keep/revert loop | Validating and optimizing existing Skills |
**Solo Development Workflow:** /grill-me (interrogate the design) → /write-prd (write the PRD) → /prd-to-plan (break into phases, optionally review with Codex) → /tdd (implement one by one)
**Skill Development Workflow:** /write-a-skill (author the skill) → /skill-eval-toolkit (evaluate and optimize)
### write-prd — PRD Authoring Skill
Fills the gap between /grill-me and /prd-to-plan — turns design decisions or rough ideas into a structured PRD:
- One-question-per-turn structured interview (target user → success state → non-goals)
- Enforces P0/P1/P2 priority tiers, References citing decision records, and the `docs/prds/` output convention
- Stays at the product / system-contract level — no implementation details leak (httpOnly cookies, SDK pinning, etc.)
- Delegated/Eval Mode — distinguishes draft artifacts from canonical PRD files
- Includes 4 functional eval cases + 12 trigger tests (validated through 2 iterations with skill-eval-toolkit, 90% pass rate)
```
You: "Write a PRD based on docs/decisions/login-redesign.md"
Claude: (loads write-prd, reads decision record, identifies gaps,
         produces PRD draft with user stories + traceability, asks to confirm)
```
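A hypothetical skeleton of the kind of PRD these conventions produce (the section names and bullets are illustrative, not mandated by the skill):

```markdown
<!-- docs/prds/login-redesign.md -->
# PRD: Login Redesign

## Target User
## Success State
## Non-Goals

## Requirements
- [P0] Existing users can authenticate without re-registering
- [P1] ...
- [P2] ...

## References
- docs/decisions/login-redesign.md  <!-- decision record this PRD traces to -->
```

Note that the requirements stay at the contract level ("users can authenticate"), leaving mechanisms like cookies or SDK versions to the implementation plan.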
### write-a-skill — Skill Authoring Guide
Use when you want to create a new skill from scratch. Covers the full authoring lifecycle:
- Content type decision (Reference vs Task) and invocation control (`disable-model-invocation`, `context: fork`, etc.)
- Frontmatter schema, progressive disclosure (metadata → body → bundled resources)
- Description writing — trigger-oriented keywords, not feature summaries
- Security checklist and review process
- Gotchas iteration loop — the feedback mechanism that makes skills more accurate over time
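A minimal sketch of the frontmatter such a skill might end up with (the `name` and `description` values are invented for illustration; note the trigger-oriented description rather than a feature summary):

```yaml
---
name: openapi-docs
# Trigger-oriented: keywords a user request would actually contain,
# not a summary of what the skill does internally.
description: Generate API reference documentation from an OpenAPI spec.
  Use when asked to document REST endpoints or turn openapi.yaml/swagger
  files into readable docs.
disable-model-invocation: false  # allow Claude to auto-trigger this skill
---
```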
```
You: "I want to create a skill that generates API documentation from OpenAPI specs"
Claude: (loads write-a-skill, interviews you, drafts SKILL.md, runs smoke tests)
```
### skill-eval-toolkit — Eval-Driven Testing and Optimization
Use when you have an existing skill and want to measure or improve it. Provides a structured eval loop with 4 specialized subagents:
| Subagent | Role |
|---|---|
| grader | Evaluate assertions against outputs, critique eval quality |
| comparator | Blind A/B comparison — scores two outputs without knowing which is which |
| comparison-analyzer | Post-hoc analysis — unblinds results, identifies why the winner won |
| benchmark-analyzer | Surface patterns in benchmark data that aggregate stats hide |
The workflow: create test prompts → run with-skill and baseline in parallel → grade → aggregate benchmarks → interactive viewer for human review → improve → repeat. Also includes automated description trigger optimization (train/test split, iterative improvement) and an eval-driven body autopilot loop for small SKILL.md mutations.
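The comparator's blinding step can be sketched in a few lines of Python (a hypothetical illustration of the idea, not the toolkit's actual code): randomize presentation order, keep a private key, and unblind only after grading.

```python
import random

def make_blind_pair(with_skill: str, baseline: str, rng: random.Random):
    """Present two outputs under neutral labels so the grader cannot
    tell which condition produced which."""
    flipped = rng.random() < 0.5
    outputs = {"candidate-1": baseline if flipped else with_skill,
               "candidate-2": with_skill if flipped else baseline}
    # Private key, revealed only during post-hoc analysis after scoring.
    key = {"candidate-1": "baseline" if flipped else "with-skill",
           "candidate-2": "with-skill" if flipped else "baseline"}
    return outputs, key

def unblind(scores: dict, key: dict) -> dict:
    """Map blind labels back to condition names once grading is done."""
    return {key[label]: score for label, score in scores.items()}
```

The grader only ever sees `candidate-1`/`candidate-2`; the analyzer calls `unblind` on the scores afterward, which is what keeps the comparison honest.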
```
You: "Evaluate my json-diff skill and tell me if it actually adds value"
Claude: (loads skill-eval-toolkit, creates test cases, spawns parallel runs,
         grades outputs, launches viewer, shows you the results)
```
**When to use which:** If the question is "how should I structure this skill?" → `write-a-skill`. If the question is "is this skill actually working well?" → `skill-eval-toolkit`. Most skills start with the former and graduate to the latter when you need quantitative rigor.
**Gotchas Are the Soul of a Skill:** The strongest signal in any Skill isn't the tutorial — it's the pitfalls the team has hit. Every time a Skill execution encounters an unexpected failure, write the failure pattern back into Gotchas — this feedback loop makes the Skill more accurate over time. See skills-best-practices.md § 4.3 for details.
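As a hypothetical example of what such a write-back might look like inside a SKILL.md (the failure pattern shown is invented):

```markdown
## Gotchas
<!-- Append a failure pattern here whenever an execution goes wrong. -->
- Specs with external $refs produced broken links in the generated docs;
  bundle refs into a single file before generating.
```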
**Note:** These examples can be copied directly into `.claude/skills/` for use. We recommend reading both guides first to understand the design rationale and adapt them to your needs.
## Quick Overview

### CLAUDE.md Guide Highlights
- Why you shouldn't use `/init` for auto-generation (with supporting research data)
- The right approach from scratch: start with nothing, add rules only when problems arise
- Three-layer architecture model: Enforcement Layer / High-Frequency Recall Layer / On-Demand Reference Layer
- Using Hooks instead of CLAUDE.md directives (determinism > suggestions)
- Five major anti-patterns and maintenance practices
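As a sketch of the "Hooks over directives" idea: instead of a CLAUDE.md line asking the model to be careful, a PreToolUse hook in `.claude/settings.json` enforces the rule deterministically (the script path is hypothetical; the overall shape follows Anthropic's hooks documentation):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "./scripts/block-dangerous-commands.sh"
          }
        ]
      }
    ]
  }
}
```

The hook script inspects the pending command and can reject it by exiting with a blocking status — a guarantee, rather than a suggestion the model may ignore.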
### Skills Guide Highlights
- What a Skill is and how it differs from CLAUDE.md
- Complete Frontmatter field reference
- Auto-trigger vs manual invocation control methods
- Skill vs Subagent vs CLAUDE.md decision matrix
- Security guidelines and iteration methodology
### Subagent Guide Highlights
- When to use a Subagent (Anthropic's official stance: most scenarios don't need one)
- Four solutions for black-box and one-shot problems
- Research-type and review-type Agent design templates
- Hub-and-Spoke architectural pattern
- The Early Victory Problem and mitigation strategies
## License