cladding

mcp
Security Audit
Fail
Health Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Fail
  • execSync — Synchronous shell command execution in package.json
  • fs module — File system access in package.json
Permissions Pass
  • Permissions — No dangerous permissions requested

SUMMARY

Ending vibe coding. The integrity layer for AI-coded software — spec-driven, drift-aware, iron-clad. Reference implementation of the Ironclad standard, with a plugin and MCP server for Claude Code, Codex CLI, Gemini CLI, and Cursor.

README.md

cladding — Unified Governance for AI-Coupled Engineering

AI-generated code, held to the same bar as human code.

Reference implementation of the Ironclad standard. 27 detectors and a 13-stage gate verify, on every commit, that the code your AI assistant wrote still matches the spec.

Benchmark (event-sourcing store, same spec, same model): vanilla AI coding caught 2/8 traps (25%); cladding caught 8/8 (100%).

Why

The why fades after 3 months

The reason an AI assistant wrote code a certain way doesn't survive in the code alone.

→ spec/features/*.yaml becomes the permanent record of why.

AI context survives time — six months later, the AI reconstructs intent straight from the spec (new hires get the same entry point).

AI gives a different answer each time

The same spec produces code with inconsistent patterns and structure.

→ The spec becomes the fixed reference against which every commit is checked.

Enterprise-ready consistency — code style and patterns stay aligned across teams and PRs.

AI hallucination

Generated code calls APIs, functions, or options that don't exist.

→ 27 detectors and a 13-stage gate block hallucinated code on every commit.

Production incidents prevented up front — CI auto-rejects hallucinated code before it merges.

What you get

How a vanilla AI coding environment and a cladding environment behave when the same situation comes up.

| Situation | Vanilla AI coding | cladding |
| --- | --- | --- |
| Code drifts from spec | fixed if a reviewer notices | auto-blocked on every commit |
| Two devs build the same feature in parallel | merge conflicts | hash-based IDs route to separate files → 0 conflicts |
| Who verifies AI-written code? | the AI that wrote it (risky) | a separate reviewer agent (duties split) |
| Switching AI tools (Claude → Cursor) | reconfigure per tool | one spec → mirrored across 4 hosts |
| Spec authority | the AI reinterprets it each time | the sealed spec is the single source of truth |

The 8/8 vs 2/8 result at the top of this page is an early benchmark (details); larger-scale measurements are in progress.

How it works

Spec → Code → Tests runs as a single cycle — the spec captures the why, Iron Law verifies the implementation, and Drift Detection blocks anything that no longer matches.

Figure: Spec → Code → Tests as a single cycle (one feature's lifecycle).

1. Spec — SSoT, single source of intent

The spec is where the why (what we're building and why) lives. A 4-tier (A/B/C/D) Single Source of Truth — intent on top, implementation below.

| Tier | Role | Who edits | Authority |
| --- | --- | --- | --- |
| A (Spec) | intent (what to build) | humans only | sealed · LLMs cannot edit |
| B (Design) | design (how to build it) | humans, freely | checked against A |
| C (Derived) | implementation (code · tests) | LLMs and humans | regenerated by reading the code |
| D (Audit) | audit log (what actually happened) | append-only | immutable |

A outranks B — if code and spec disagree, the code is wrong. The spec is sealed because changing the why shakes everything downstream, so LLMs are kept out.
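The edit rules in the tier table can be written down as a small permission check. This is a minimal sketch, not cladding's implementation: the `Tier` enum, the `"human"` / `"llm"` actor labels, and `can_edit` are hypothetical names chosen for illustration, and appending to the audit log is deliberately left out because the README only says that tier D is append-only.

```python
from enum import Enum


class Tier(Enum):
    SPEC = "A"      # intent: humans only, sealed against LLM edits
    DESIGN = "B"    # design: humans edit freely, checked against A
    DERIVED = "C"   # implementation: LLMs and humans
    AUDIT = "D"     # audit log: append-only, never edited


# Hypothetical actor labels; the README only distinguishes humans from LLMs.
EDITORS = {
    Tier.SPEC: {"human"},
    Tier.DESIGN: {"human"},
    Tier.DERIVED: {"human", "llm"},
    Tier.AUDIT: set(),  # existing entries are immutable for everyone
}


def can_edit(tier: Tier, actor: str) -> bool:
    """Return True if this kind of actor may modify existing content in the tier."""
    return actor in EDITORS[tier]


print(can_edit(Tier.SPEC, "llm"), can_edit(Tier.DERIVED, "llm"))
```

Keeping the rules in one lookup table makes the "sealed" property easy to audit: an LLM gains write access to tier A only if someone changes that table.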

Sharded · multi-dev safe. spec/features/<slug>-<hash6>.yaml puts each feature in its own file with a 6-char hash ID (e.g. F-5f6b45). Two devs creating new features at the same time land in different files with different IDs, so there are zero merge conflicts. Details: Hash-based feature IDs.
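As a rough illustration of why hash-suffixed files avoid collisions, the sketch below derives an ID and a file path from a slug. How cladding actually computes the hash is not stated here; the `feature_id` function, the use of SHA-256, and the `seed` argument (a stand-in for whatever per-creation input makes the hash unique) are all assumptions.

```python
import hashlib


def feature_id(slug: str, seed: str) -> tuple[str, str]:
    """Return (feature ID, spec file path) for a new feature.

    `seed` is a hypothetical stand-in for whatever per-creation input
    (author, timestamp, random bytes) makes the hash unique.
    """
    hash6 = hashlib.sha256(f"{slug}:{seed}".encode("utf-8")).hexdigest()[:6]
    return f"F-{hash6}", f"spec/features/{slug}-{hash6}.yaml"


fid, path = feature_id("checkout-refunds", "alice@2026-05-01T10:00:00Z")
print(fid, path)
```

Because the ID depends on more than the slug, two developers who pick the same slug would very likely still get different file names, and git sees two new files rather than two edits to one file.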

Figure: 4-tier SSoT, A (Spec) → B (Design) → C (Derived) → D (Audit); A outranks B.

2. Code — Iron Law (required) gate

Every change has to clear all 13 stages. The gate runs wherever clad check is invoked: typically CI, a git pre-push hook, or a manual run. Each stage ships with its own unit tests.

Figure: 13-stage Iron Law gate. Every change must clear static (6) · test (2) · e2e (3) · evidence (2) wherever clad check runs (CI / git hook / manual).

| Stage | What it checks |
| --- | --- |
| 1.1 Type · 1.2 Lint | type errors · code style |
| 1.3 Drift | spec ↔ code mismatches across 27 detectors |
| 1.4 Commit · 1.5 Arch · 1.6 Secret | clean working tree · architecture invariants (forbidden imports, etc.) · leaked API keys |
| 2.1 Unit · 2.2 Cov | unit tests pass · project coverage threshold |
| 3.1 Smoke · 3.2 Perf · 3.3 Visual | end-to-end critical paths · performance budgets · visual regression |
| 4.1 Audit · 4.2 UAT | every AC (acceptance criteria) has at least one piece of evidence · every status=done feature has at least one piece of evidence |
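The "clear every stage or be blocked" rule can be sketched as a small runner. This is illustrative only: `Stage`, `run_gate`, and the stubbed checks are hypothetical, and whether the real gate stops at the first failure or runs all stages is not stated in this README (the sketch runs them all so the report is complete).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    stage_id: str               # e.g. "1.1", matching the table above
    name: str
    check: Callable[[], bool]   # hypothetical: returns True when the stage passes


def run_gate(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run every stage and return (passed, failed stage labels).

    Assumption: all stages run even after a failure so the report is complete;
    the real gate may stop at the first failure instead.
    """
    failed = [f"{s.stage_id} {s.name}" for s in stages if not s.check()]
    return (not failed, failed)


# Three of the 13 stages, with stubbed checks for illustration.
stages = [
    Stage("1.1", "Type", lambda: True),
    Stage("1.3", "Drift", lambda: False),  # pretend a detector found drift
    Stage("2.1", "Unit", lambda: True),
]
passed, failed = run_gate(stages)
print("merge" if passed else f"block: {failed}")
```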

3. Tests — 27 drift detectors

Seven categories of mismatch across spec · code · test, all caught automatically. Full catalog: src/stages/detectors/README.md.

| Category | What it catches | Count | Representative detectors |
| --- | --- | --- | --- |
| spec ↔ code drift | something in the spec missing from code, or in code with nothing in the spec | 6 | UNMAPPED_ARTIFACT, MISSING_IMPLEMENTATION, AC_DRIFT |
| code ↔ test | code without tests · coverage falling below threshold | 6 | MISSING_TESTS, COVERAGE_DROP, HARDCODED_SECRET |
| spec ↔ test | an AC in the spec that no test actually verifies | 4 | UNTESTED_AC, STATUS_DRIFT, STALE_EVIDENCE |
| spec maintenance | spec hygiene: slug collisions, ID duplicates | 4 | SLUG_CONFLICT, ID_COLLISION |
| environment integrity | build environment and meta-file integrity | 3 | HARNESS_INTEGRITY, META_INTEGRITY |
| architecture · capability | code that breaks the architecture or capability shape declared in the spec | 2 | ARCHITECTURE_FROM_SPEC, CAPABILITIES_FEATURE_MAPPING |
| governance · policy | code that breaks an `ai_hints` policy (e.g. forbidden patterns) | 2 | AI_HINTS_FORBIDDEN_PATTERN, ABSENCE_OF_GOVERNANCE |
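To make the spec ↔ code category concrete, the sketch below treats drift as a two-way set difference. The detector names come from the table; everything else is an assumption: the `Finding` type, the function name, the plain-set input shape, and the mapping of each name to a direction (my reading of the category description, not the detectors' documented behavior).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    detector: str   # detector name from the table above
    subject: str    # the file the finding is about


def detect_spec_code_drift(spec_artifacts: set[str], code_files: set[str]) -> list[Finding]:
    """Compare files declared in the spec with files present in the code.

    Hypothetical input shape: both sides are plain sets of relative paths.
    """
    missing = [Finding("MISSING_IMPLEMENTATION", p) for p in sorted(spec_artifacts - code_files)]
    unmapped = [Finding("UNMAPPED_ARTIFACT", p) for p in sorted(code_files - spec_artifacts)]
    return missing + unmapped


findings = detect_spec_code_drift(
    spec_artifacts={"src/billing.ts", "src/refunds.ts"},
    code_files={"src/billing.ts", "src/legacy.ts"},
)
for f in findings:
    print(f.detector, f.subject)
```

An empty result is the "drift = 0" condition; any finding would block the merge.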

4. Cycle — one feature's lifecycle

The 4 steps that wrap Spec → Code → Test into a single cycle. Merge if drift is 0, block otherwise.

Figure: one feature's lifecycle. Define → Sync → Implement → Verify; merge if drift=0, block otherwise.

Multi-Agent Workflow

cladding is a 5-agent system working in concert. Each agent has a clear role under CQS (Command-Query Separation — the agents that do are kept apart from the agents that verify), so no agent can sign off on its own work. This is the foundation that maps cleanly to compliance regimes (EU AI Act · K-AI Framework · SOX).

Figure: 5 personas with CQS. The orchestrator dispatches, librarian/specialist/reviewer act, and observability watches metrics.
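The rule that no agent signs off on its own work reduces to a simple invariant over change records. The sketch below is one way to express it; the `Change` record and `violates_separation` are hypothetical, and the agent names are used only as labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    feature_id: str
    author: str     # agent that produced the change
    approver: str   # agent that signed it off


def violates_separation(change: Change) -> bool:
    """True when the same agent both produced and approved the change."""
    return change.author == change.approver


ok = Change("F-5f6b45", author="specialist", approver="reviewer")
bad = Change("F-5f6b45", author="specialist", approver="specialist")
print(violates_separation(ok), violates_separation(bad))
```

A check like this is cheap to run over an append-only audit log, which is presumably what makes the split useful as compliance evidence.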

Ecosystem

cladding sits at the intersection of three existing categories.

Figure: Ecosystem Venn diagram. cladding sits at the intersection of SDD · Runners · Multi-agent Governance.

How cladding differs from the neighbors

  • Spec Kit · OpenSpec · Tessl · Kiro help you write a good spec. cladding goes further — it verifies on every commit that the code still matches that spec.
  • BMAD · ChatDev · Claude Code Agent Teams are about splitting work across multiple AI agents. cladding's 5 agents take that further by tying spec, code, and audit log into the same loop.
  • tdd-guard forces test-first development. That's roughly what the Unit · Coverage stages do inside cladding's 13-stage gate.
  • OpenHands · Cline · Aider · Goose are runners — they tell the AI to write code. cladding is the governance layer that verifies and controls what those runners produce.

cladding's edge is the combination: it folds the strongest parts of these categories into one verification loop.

Install

Two steps: install the infrastructure, then create the project spec.

Step 1 — Install the infrastructure

Pick the route that fits how you work — both land in the same place:

(a) npm — for terminal / CI users

npm install -g cladding   # install the cladding CLI
cd <project>              # go to your project
clad setup                # connect your AI tools (Claude / Codex / Gemini / Cursor)

(b) Marketplace — for AI-tool plugin users

  1. Open the plugin marketplace inside your AI tool (Claude Code · Codex CLI · Gemini CLI)
  2. Search for cladding and install it
  3. No clad setup needed — the plugin manifest wires everything

Where clad setup connects (5 host channels)

| Host (when detected) | Wired location | Auto-activation |
| --- | --- | --- |
| Claude Code (~/.claude/) | ~/.claude/plugins/cladding | claude plugin marketplace add + claude plugin install claude-code@cladding |
| Codex CLI skills (~/.agents/) | ~/.agents/skills/cladding-* | (auto on Codex restart) |
| Codex CLI MCP server (~/.codex/) | [mcp_servers.cladding] in ~/.codex/config.toml | (TOML entry itself) |
| Gemini CLI (~/.gemini/) | ~/.gemini/extensions/cladding | gemini extensions link |
| Cursor (~/.cursor/) | mcpServers.cladding in ~/.cursor/mcp.json | (JSON entry itself) |

clad setup invokes the per-host activation commands automatically when claude / gemini binaries are on PATH. Safe to re-run after a cladding upgrade or after installing another AI tool.

About the MCP server. Every host gets cladding wired as an MCP server — only the wire location differs. Claude Code and Gemini CLI auto-start it through the plugin/extension manifest's mcpServers field; Codex through ~/.codex/config.toml [mcp_servers.cladding]; Cursor through ~/.cursor/mcp.json. You never invoke MCP directly — no /mcp slash, no manual server-connect step. The AI in each host calls cladding's tools (clad_create_feature, etc.) in response to natural-language requests; you keep typing /cladding:init plus normal chat.
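For hosts wired through a JSON entry, "safe to re-run" suggests an idempotent merge into the existing config. The sketch below shows that idea on the `mcpServers.cladding` key. It is not what clad setup does internally: `wire_mcp_server` is a hypothetical helper, and the entry's command is a placeholder because the real command and arguments are not given in this README.

```python
import json


def wire_mcp_server(config_text: str, entry: dict) -> str:
    """Return mcp.json content with `mcpServers.cladding` set to `entry`.

    Other servers already in the file are preserved, and running the
    function twice yields the same result.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    config.setdefault("mcpServers", {})["cladding"] = entry
    return json.dumps(config, indent=2, sort_keys=True)


# Placeholder entry: the real command and arguments are not given in this README.
entry = {"command": "<cladding-mcp-command>", "args": []}
existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
once = wire_mcp_server(existing, entry)
twice = wire_mcp_server(once, entry)
print(once == twice)
```

Overwriting only the `cladding` key is the design choice that matters: an upgrade can refresh the entry without touching servers the user configured by hand.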

Benchmark. v0.4.0 measurements show ~60% consistency improvement and ~50% LOC reduction vs unguided AI coding on a fixed task, with 100% drift detection across a 5-iteration dev cycle. Full methodology and honest caveats (some of the consistency gain is the "more-specific-prompt" effect, not exclusively cladding) in docs/benchmarks/v0.4.0-consistency-bench.md.

Step 2 — Init (create the project spec)

Inside your project, run it once from your AI tool:

[inside your AI tool] /cladding:init "B2B payment SaaS"

This creates spec.yaml and the 4-tier docs. One-time per project.

Three init scenarios

/cladding:init takes a natural-language intent and picks the right path on its own. Same command, three starting points.

| Starting point | Command | What happens |
| --- | --- | --- |
| An idea, nothing else | /cladding:init "I want to build a B2B payment SaaS" | LLM infers the domain → spec · docs · policies generated, with 2–3 follow-up questions printed |
| A planning doc | /cladding:init docs/plan.md | cladding detects the file path, loads its contents, and uses them as the intent (absolute and relative paths both work) |
| Adopting into an existing project | /cladding:init "apply cladding to this project" | scans the existing code (≥3 source files trigger it) → observed patterns are merged with the intent |
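The three scenarios can be read as a small decision function. The sketch below is a guess at the routing, not the actual logic: `InitPath` and `pick_init_path` are hypothetical, and the precedence (file path first, then the three-source-file threshold, then the plain idea) is an assumption the table does not spell out.

```python
from enum import Enum


class InitPath(Enum):
    IDEA = "idea"
    PLAN_DOC = "planning doc"
    ADOPT = "existing project"


def pick_init_path(argument_is_file: bool, source_file_count: int) -> InitPath:
    """Choose an init scenario from two observations about the project.

    Assumed precedence: an explicit file path wins, then the
    three-source-file threshold from the table, then the plain-idea path.
    """
    if argument_is_file:
        return InitPath.PLAN_DOC
    if source_file_count >= 3:
        return InitPath.ADOPT
    return InitPath.IDEA


print(pick_init_path(False, 0))    # an idea, nothing else
print(pick_init_path(True, 0))     # /cladding:init docs/plan.md
print(pick_init_path(False, 12))   # existing code base
```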

Init once, then carry on

cladding's goal is to be the infrastructure that prevents spec ↔ code drift — after init, you just keep coding. The AI references the spec while it writes, and clad check runs automatically in CI or as a pre-commit hook to block anything that drifts. No extra commands to remember.

Status

  • Version: v0.4.0 (2026-05)
  • Conformance: L4 (top tier · self-declared)
  • Tests: 973/973, all pass
  • Coverage: 93.89%+ (enforced)
  • Features: 136 spec'd

100 test files · installable from the Claude Code · OpenAI Codex · Gemini CLI marketplaces.

Road to Ironclad 1.0 — 1.0 locks when two independent implementations pass the L4 conformance fixtures (GOVERNANCE § 1). cladding is the first one.

License

MIT. LICENSE · Related: Ironclad (the standard cladding implements) · harness-boot (the seed project).
