development-skills
Makes agents produce staff-engineer-quality code
A Claude Code plugin that turns your AI agent into a disciplined software engineer.
Installation · How It Works · 19 Skills · Philosophy · Blog Post
Installation
In Claude Code, register the marketplace first:
/plugin marketplace add reidemeister94/development-skills
Then install the plugin:
/plugin install development-skills@development-skills
The plugin activates automatically on any coding task. No configuration needed.
Verify Installation
Start a new session and give Claude a development task. The plugin should activate automatically — you'll see it follow a structured workflow (research, plan, implement, verify, review) instead of jumping straight to code.
You can also invoke skills directly:
/brainstorming — Evaluate approaches before committing to one
/debugging — Systematic root-cause analysis
/create-test — Design tests that find bugs, not just exist
/distill — Compress verbose text while preserving facts
/commit — Conventional commit from staged changes
Updating
/plugin update development-skills@development-skills
Optional: Regression Testing
The skill-creator plugin is required for running regression tests (/eval-regression):
/plugin install skill-creator@claude-plugins-official
Why This Exists
AI agents are fast but undisciplined. 67% of developers report spending more time debugging AI-generated code, which contains 1.7x more major issues and 2.74x more security vulnerabilities. Agents delete failing unit tests just to make the suite pass. They generate code at 140-200 lines/min while humans comprehend at roughly 20-40 — creating cognitive debt.
development-skills makes the good-day workflow the only workflow — research before planning, plan before coding, test before shipping, review before merging. Every time, without reminding.
As Anthropic's 2026 Agentic Coding Trends Report concluded: success comes from treating agentic development as a "workflow design problem, not a tool adoption problem."
How It Works
When you give Claude Code a development task with this plugin installed, it doesn't just start writing code. Instead, it follows a mandatory gated workflow:
Each phase is a gate — the agent cannot proceed until the gate conditions are met. No skipping. No combining. No "this is trivial, I'll just code it."
The 7 Phases
| # | Phase | What Happens | Gate |
|---|---|---|---|
| 1 | Research | Explore the codebase and gather context | "RESEARCH COMPLETE" |
| 2 | Plan | Write a plan to disk, enter plan mode | User approves the plan |
| 3 | Chronicle | Document the WHY — business context, requirements, decisions | "CHRONICLE INITIATED" |
| 4 | Implement | TDD cycles with dedicated implementer subagent | "SOLUTION COMPLETE" |
| 5 | Verify | Dedicated test-verifier runs the full test suite | Evidence of passing |
| 6 | Staff Review | Two-stage code review: spec compliance, then quality | "APPROVED" |
| 7 | Finalize | Update docs, chronicle, integration options | "WORKFLOW COMPLETE" |
Small tasks get a fast track. If a change touches 3 files or fewer with a single obvious approach, the plugin collapses into lightweight mode — same quality checks, no ceremony.
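The fast-track decision can be pictured as a tiny heuristic. This is an illustrative sketch only — the function name and parameters are assumptions, not the plugin's actual internals:

```python
def use_fast_track(touched_files: list[str], approach_count: int) -> bool:
    """Collapse to lightweight mode only when the change is small
    (3 files or fewer) AND there is a single obvious approach."""
    return len(touched_files) <= 3 and approach_count == 1

# A two-file change with one clear approach qualifies for the fast track...
print(use_fast_track(["api.py", "test_api.py"], approach_count=1))  # True
# ...but any ambiguity in approach forces the full gated workflow.
print(use_fast_track(["api.py"], approach_count=2))                 # False
```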
Key Features
Brainstorming Guard — Before coding, evaluates scope, reversibility, and approach clarity. If anything is ambiguous, spawns an isolated analysis agent. The default is to analyze; burden of proof is on skipping. Anti-rationalization tables counter the model's tendency to justify shortcuts. Without this guard, the agent skips analysis ~40% of the time on tasks that need it.
Subagent Architecture — Three specialized agents: Staff Reviewer (Opus, two-stage code review), Implementer (Sonnet, TDD execution), Test Verifier (Sonnet, structured pass/fail). Mirrors Anthropic's effective sub-agent patterns. Giving agents a way to verify their own work improves quality 2-3x.
Observation Masking — Verbose tool output (80%+ of context tokens) stays on disk. Implementation logs, test output, and review criteria live in files — your main conversation stays clean for decision-making.
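The masking idea itself is simple to sketch: persist the full output, hand the conversation only a condensed view. A minimal illustration (file names and the summarization rule are assumptions, not the plugin's implementation):

```python
import tempfile
from pathlib import Path

def mask_observation(tool_output: str, log_dir: Path, max_chars: int = 200) -> str:
    """Write the full tool output to disk; return only a short summary
    plus a pointer to the file for the conversation context."""
    log_file = log_dir / "tool_output.log"
    log_file.write_text(tool_output)
    if len(tool_output) <= max_chars:
        return tool_output
    head = tool_output[:max_chars]
    return f"{head}\n... [{len(tool_output)} chars total, full output in {log_file}]"

with tempfile.TemporaryDirectory() as d:
    summary = mask_observation("PASS " * 5000, Path(d))
    # The conversation sees ~270 chars instead of 25,000.
    print(len(summary) < 300)  # True
```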
Filesystem Persistence — Plans, chronicles, and workflow state survive context compaction. The agent resumes from any phase, even after a full context clear. Projects with persistent memory show 40% fewer errors and 55% faster completion.
Smart Parallel Implementation — For 4+ independent tasks, analyzes file-touch maps and spawns parallel agents in git worktrees — but only when proven safe via dependency analysis. Naive parallelization produced 100% unusable code; single-agent is the safe default.
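The safety check reduces to a disjointness test over file-touch maps — two tasks may run in parallel only if they touch no file in common. A minimal sketch (names and data shapes are assumptions, not the plugin's actual analysis):

```python
from itertools import combinations

def safe_to_parallelize(file_touch_maps: dict[str, set[str]]) -> bool:
    """Parallel agents are safe only when every pair of tasks touches
    disjoint file sets; any overlap forces single-agent execution."""
    for (_, files_a), (_, files_b) in combinations(file_touch_maps.items(), 2):
        if files_a & files_b:
            return False
    return True

tasks = {
    "add-auth":    {"auth.py", "test_auth.py"},
    "fix-logging": {"log.py"},
    "bump-deps":   {"requirements.in"},
}
print(safe_to_parallelize(tasks))   # True: fully disjoint
tasks["fix-logging"].add("auth.py")  # now overlaps with add-auth
print(safe_to_parallelize(tasks))   # False: fall back to single agent
```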
Chronicles — The missing documentation layer. Code says WHAT, plans say HOW, chronicles capture WHY. Business context, decisions, and failed approaches — timestamped and browseable.
19 Skills, 5 Languages
Development Skills
| Skill | Trigger | What It Does |
|---|---|---|
| core-dev | Auto (any coding task) | Workflow router — detects language, enforces brainstorming guard, dispatches |
| brainstorming | /brainstorming | Critical evaluation with isolated analysis agent. Two modes: full analysis, focused evaluation |
| python-dev | /python-dev | Python patterns — Pydantic, FastAPI, asyncpg, pytest |
| java-dev | /java-dev | Java patterns — Records, Streams, Spring Boot, JPA |
| typescript-dev | /typescript-dev | TypeScript patterns — Zod, Express, Fastify, vitest (backend/CLI only) |
| frontend-dev | /frontend-dev | Auto-detects React, Next.js, Raycast, Vite. Loads framework-specific patterns |
| swift-dev | /swift-dev | Swift patterns — SwiftUI, UIKit, Vapor, SPM |
| debugging | /debugging | Systematic root-cause debugging: investigate, analyze, hypothesize, fix |
Specialized Skills
| Skill | Trigger | What It Does |
|---|---|---|
| create-test | /create-test | Risk-scored test design. Explorer mode audits your codebase for dangerous untested code; targeted mode generates boundary, property-based, and invariant tests with strong assertions |
| distill | /distill | Hybrid semantic text compression: deterministic regex pre-processing + LLM compression + deterministic post-verification. Multilingual noise removal (EN/IT/FR/ES/DE). Measures entropy via gzip |
| commit | /commit | Conventional commits from staged changes |
| chronicles | Auto | Project snapshots capturing the WHY behind changes |
| align-docs | /align-docs | Align documentation with current project state |
| eval-regression | /eval-regression | Pre-commit regression testing — compares current version against last committed version |
| update-precommit | /update-precommit | Update .pre-commit-config.yaml hooks to latest versions |
| update-reqs | /update-reqs | Update requirements.in with latest PyPI versions |
| update-reqs-dev | /update-reqs-dev | Update requirements-dev.in with latest PyPI versions |
| resolve-merge | /resolve-merge | Systematic merge conflict resolution with numbered docs renumbering support |
| best-practices | /best-practices &lt;topic&gt; | Deep web research from authoritative sources (engineering blogs, official docs, books, GitHub projects >5k stars). Produces structured state-of-the-art report with trade-offs, decision frameworks, anti-patterns, and cited sources |
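The gzip entropy signal that /distill uses can be approximated in a few lines of standard-library Python. This is an illustrative sketch of the idea, not the skill's actual implementation:

```python
import gzip
import random

def compression_ratio(text: str) -> float:
    """Ratio of gzip-compressed size to raw size: repetitive, low-entropy
    text compresses far better than dense, information-rich text."""
    raw = text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)

random.seed(0)
boilerplate = "Please note that, as mentioned above, " * 100  # highly redundant
dense = "".join(random.choice("abcdefghijklmnopqrstuvwxyz0123456789 ")
                for _ in range(3800))

# Redundant filler compresses much better than dense text — one signal for
# how much compression headroom exists before facts start getting lost.
print(compression_ratio(boilerplate) < compression_ratio(dense))  # True
```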
Auto-Format on Save
A PostToolUse hook automatically formats files when Claude edits them:
| Language | Formatter | Fallback |
|---|---|---|
| Python | ruff (30x faster than Black) | — |
| JS/TS/CSS/JSON | biome (Rust-based, 7-100x faster) | prettier |
| Java | google-java-format | — |
| Kotlin | ktfmt | ktlint |
| Swift | swift-format | swiftformat |
| HTML/YAML | prettier | — |
Design Philosophy
Iron Rules — enforced at every phase, not suggested:
- No positive claims without fresh verification evidence
- Red/Green TDD — every implementation starts with a failing test (Kent Beck agrees)
- Comment the WHY, not the WHAT
- No commits without explicit user request
- Every gate must be explicitly passed
Model Behavior — maximum honesty (zero accommodation), always-on critical thinking, calibrated criticism (concrete and evidence-based), planning as 90% of the work, data-validated decisions, and persistent knowledge on disk.
Architecture
skills/ 19 skills (core-dev, 5 languages, brainstorming, debugging, testing, utilities)
agents/ 3 subagents (implementer, staff-reviewer, test-verifier)
hooks/ Auto-format on Edit/Write (multi-language) + session context
shared/ Workflow engine with just-in-time phase loading
commands/ Feedback production/ingestion
Context is loaded progressively following Anthropic's just-in-time pattern: workflow.md always loaded (~120 lines), phase instructions loaded per-phase (~300 words each), language patterns loaded on-demand.
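The just-in-time pattern can be sketched as a loader that keeps the router resident and pulls phase instructions from disk only when a phase starts. File names here are assumptions for illustration:

```python
import tempfile
from pathlib import Path

def load_phase_context(phase: str, shared_dir: Path) -> str:
    """Always include the small workflow router; read per-phase
    instructions from disk only when that phase actually begins."""
    workflow = (shared_dir / "workflow.md").read_text()   # always in context
    phase_file = shared_dir / "phases" / f"{phase}.md"    # loaded just-in-time
    return workflow + "\n\n" + phase_file.read_text()

with tempfile.TemporaryDirectory() as d:
    shared = Path(d)
    (shared / "phases").mkdir()
    (shared / "workflow.md").write_text("# Workflow router")
    (shared / "phases" / "research.md").write_text("# Phase 1: Research")
    ctx = load_phase_context("research", shared)
    print("Phase 1: Research" in ctx)  # True
```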
Context Engineering
Implements patterns from Anthropic's Context Engineering guide and validated by Manus across millions of production users:
- Progressive disclosure — phase instructions loaded just-in-time, not all at once
- Observation masking — verbose output on disk, condensed summaries in conversation
- Filesystem as extended context — plans, chronicles, workflow state, implementation logs
- Clean subagent windows — each agent gets only the context it needs
- Anti-rationalization tables — keep the model honest under pressure
Built from 60,000+ lines of production Python — FastAPI backends, legacy databases, shared environments. Every feature exists because its absence caused a real problem.
Further Reading
- How I Taught Agents to Follow a Process, Not Just Write Code — the full story behind this plugin
- Effective Context Engineering for AI Agents — Anthropic's guide to the patterns we implement
- Building Claude Code with Boris Cherny — how the creator thinks about agent workflows
- TDD, AI Agents and Coding with Kent Beck — why testing matters more with AI
- Agentic Engineering — Addy Osmani on structured workflows
- Context Engineering: Lessons from Manus — production-validated patterns
Regression Testing
30 evals, 98 assertions across 11 behavioral dimensions — a test suite for agent behavior. Powered by Anthropic's skill-creator plugin.
/eval-regression
Covers: brainstorming guard (7), smart isolation (6), anti-rationalization (4), performance review (3), workflow phases (3), implementer discipline (2), language detection, chronicle quality, turn boundaries, project directives, and AskUserQuestion avoidance. Each eval snapshots the committed version as baseline, runs the modified version, and produces a clear verdict: SAFE TO COMMIT or REGRESSIONS FOUND.
Contributing
Contributions welcome — especially new language skills (Rust, Go, Kotlin, Ruby, C#). See CONTRIBUTING.md.
Golden rule: no PR without a passing /eval-regression benchmark. Zero regressions = merge. Open an issue first to discuss.
License
MIT
If this plugin makes your AI agent more disciplined, consider giving it a star.
It helps others discover the project and motivates continued development.