English | 한국어

Deep Suite

Name: claude-deep-suite
Author: Sungmin-Cho

Cross-runtime plugin marketplace bundling nine plugins for structured development, knowledge management, autonomous experimentation, independent code review, documentation gardening, harness diagnostics, cross-project memory, goal-driven long-running execution, and loop-engineered orchestration across Claude Code and Codex.

Built on the Harness Engineering framework (Böckeler/Fowler, 2026) — Agent = Model + Harness — mapping Guides (feedforward) × Sensors (feedback) across Computational and Inferential control.

Plugins

Plugin	Version	Description
deep-work	6.9.0	Evidence-Driven Development Protocol
deep-wiki	1.7.0	LLM-native knowledge wiki
deep-evolve	3.4.2	Autonomous Experimentation Protocol
deep-review	1.12.2	Independent Evaluator
deep-docs	1.4.0	Document gardening + authoring
deep-dashboard	1.3.7	Cross-plugin harness diagnostics + suite telemetry
deep-memory	0.3.2	Cross-project semantic memory
deep-goal	1.0.1	Goal condition compiler
deep-loop	1.0.0	Loop Engineering control plane over the deep-suite

The table is auto-generated by scripts/generate-reference-sections.js from the marketplace manifest + pinned plugin.json.version. Narrative content (including this paragraph) is hand-curated.

Each plugin lives in its own Git repository: github.com/Sungmin-Cho/claude-deep-{name}. The claude-deep-* repository and marketplace identifiers are retained for compatibility with existing users; Codex support is exposed through the Codex marketplace mirror and each plugin's .codex-plugin/plugin.json.

Installation

Claude Code

# Install all plugins
/plugin marketplace add Sungmin-Cho/claude-deep-suite
/plugin install deep-work@Sungmin-Cho-claude-deep-suite
/plugin install deep-wiki@Sungmin-Cho-claude-deep-suite
/plugin install deep-evolve@Sungmin-Cho-claude-deep-suite
/plugin install deep-review@Sungmin-Cho-claude-deep-suite
/plugin install deep-docs@Sungmin-Cho-claude-deep-suite
/plugin install deep-dashboard@Sungmin-Cho-claude-deep-suite

# Or install only what you need — they work independently
/plugin install deep-work@Sungmin-Cho-claude-deep-suite

Codex

codex plugin marketplace add Sungmin-Cho/claude-deep-suite

Codex reads the native marketplace mirror at .agents/plugins/marketplace.json; each pinned plugin then exposes its Codex skills through .codex-plugin/plugin.json and skills/<skill>/SKILL.md.

Claude Code examples in this README use slash commands. In Codex, invoke the corresponding skill alias, for example:

Plugin	Claude Code	Codex
deep-work	`/deep-work "task"`	`$deep-work:deep-work "task"`
deep-evolve	`/deep-evolve`	`$deep-evolve:deep-evolve`
deep-review	`/deep-review-loop`	`$deep-review:deep-review-loop`
deep-docs	`/deep-docs scan`	`$deep-docs:deep-docs scan`
deep-wiki	`/wiki-ingest <source>`	`$deep-wiki:wiki-ingest <source>`
deep-dashboard	`/deep-harness-dashboard`	`$deep-dashboard:deep-harness-dashboard`

Harness Engineering Architecture

Deep Suite implements the Harness Engineering framework: each plugin occupies a specific role in the 2×2 matrix of Guides × Sensors, across Computational and Inferential control.

2x2 Matrix: where each plugin lives

                +---------------------------+---------------------------+
                |  Computational            |  Inferential              |
+---------------+---------------------------+---------------------------+
|               |                           |                           |
|  Guides       |  deep-work                |  deep-work                |
|  (Feedforward |  +- Phase Guard hook      |  +- research/plan/brain   |
|   Control)    |  +- TDD state machine     |  +- Sprint Contract       |
|               |  +- Topology templates    |                           |
|               |                           |  deep-wiki                |
|               |                           |  +- Persistent knowledge  |
|               |                           |                           |
|               |                           |  deep-docs                |
|               |                           |  +- Document freshness    |
+---------------+---------------------------+---------------------------+
|               |                           |                           |
|  Sensors      |  deep-work                |  deep-review              |
|  (Feedback    |  +- Linters + typecheck   |  +- Opus code review      |
|   Control)    |  +- Coverage + mutation   |  +- 3-way cross-model     |
|               |  +- 4 drift sensors       |  +- SOLID + entropy       |
|               |  +- Fitness rules         |                           |
|               |  +- review-check sensor   |  deep-work                |
|               |                           |  +- Drift check           |
|               |  deep-docs                |                           |
|               |  +- Doc freshness scan    |                           |
|               |                           |                           |
|               |  deep-dashboard           |                           |
|               |  +- Harnessability        |                           |
|               |  +- Effectiveness         |                           |
+---------------+---------------------------+---------------------------+

Plugin data flow

 deep-work ------- receipts -------> deep-dashboard (collector)
    |                                    |
    +-- health_report ----------------> deep-review (fitness-aware review)
    |                                    |
    +-- fitness.json <----------------> deep-review (rule consumption)
    |                                    |
 deep-docs ---- last-scan.json ---> deep-dashboard (collector)
    |                                    |
 deep-evolve -- evolve-receipt ----> deep-dashboard (collector)
    |                                    |
 deep-dashboard                          v
    +-- harnessability ---------------> deep-work Phase 1 (research context)
    +-- effectiveness ----------------> user (CLI report + optional markdown)

 deep-review -- recurring-findings -> deep-evolve (experiment steering)
 deep-evolve -- evolve-insights ---> deep-work (research context)
 deep-evolve -- review trigger ----> deep-review (pre-merge verification)

Integrated workflow

Each plugin works independently, but the real power comes from using them together:

Plugin	Core question	When to use
deep-work	"How do I design and implement this?"	Every code task — features, bugs, refactors
deep-evolve	"Can I automatically make this better?"	Performance optimization, test improvement
deep-review	"Is this code actually good?"	Pre-PR independent verification
deep-docs	"Do docs match the code?"	Post-change documentation sync
deep-wiki	"How do I preserve what I learned?"	Knowledge accumulation across sessions
deep-dashboard	"Is the harness working well?"	Project health diagnosis

Examples by complexity:

# Quick fix (30 min) — deep-work alone, skip Phase 5
/deep-work --skip-integrate "fix login 500 error"

# Medium feature (2-4 hours) — Phase 5 orchestrates review/docs/wiki
/deep-work "add Stripe payment integration"
# → Phase 5 recommends /deep-review, /deep-docs scan, /wiki-ingest (top-3 loop)

# Large optimization (half-day+) — full plugin stack
/deep-harness-dashboard                                  # diagnose project health
/deep-evolve "achieve 90% test coverage"                 # autonomous experiments
/wiki-ingest .deep-evolve/<session-id>/                  # preserve learnings

For detailed scenarios see the Integrated Workflow Guide.

deep-work

Evidence-Driven Development Protocol — a single-command auto-flow orchestration that enforces structured, evidence-based software development.

/deep-work "task" in Claude Code, or $deep-work:deep-work "task" in Codex, runs the entire Brainstorm → Research → Plan → Implement → Test → Integrate pipeline automatically. Claude Code uses hooks for physical phase enforcement; Codex uses the same skill protocol and verification gates through the Codex skill surface.

Key commands

Command	Description
`/deep-work <task>`	Auto-flow orchestration — runs the full pipeline. Plan approval is the only required interaction.
`/deep-work --skip-integrate <task>`	Same, but skip Phase 5 and go straight to `/deep-finish`
`/deep-integrate`	Manual Phase 5 — top-3 next actions from installed plugin artifacts
`/deep-status`	Unified view — progress, report, receipts, history, assumptions
`/deep-debug`	Systematic debugging with root cause investigation
`/deep-research`	Manual Phase 1 — deep codebase analysis
`/deep-plan`	Manual Phase 2 — slice-based implementation planning
`/deep-implement`	Manual Phase 3 — TDD-enforced slice execution
`/deep-test`	Phase 4 — verification with quality gates
`/deep-sensor-scan`	Computational sensor scan — linter, type checker, coverage
`/deep-mutation-test`	Mutation testing — AI-generated test quality verification

Workflow phases

Phase 0  Brainstorm    Design exploration — "why before how" (skippable)
Phase 1  Research      Deep codebase analysis and documentation
Phase 2  Plan          Slice-based implementation plan (requires user approval)
Phase 3  Implement     TDD-enforced execution — failing test → code → receipt
Phase 4  Test          Receipt check, spec compliance, quality gates
Phase 5  Integrate     Reads installed plugin artifacts → LLM ranks next actions
                       → user picks from top-3 (≤5 rounds, skippable)

Key features

Phase-locked file editing — code changes blocked outside Phase 3
TDD enforcement — failing test first, then implementation
Receipt-based evidence — every slice collects proof of completion (M3 cross-plugin envelope)
Quality gates — drift check, SOLID review, insight analysis, Sensor Clean, Mutation Score
Computational sensors — auto-run linter / type checker / coverage with self-correction loop (SENSOR_RUN → SENSOR_FIX → SENSOR_CLEAN)
Mutation testing — auto-verify AI-generated test quality; survived mutants trigger automatic test regeneration (up to 3 rounds)
Slice review — per-slice two-stage independent review (spec compliance + code quality) immediately after the sensor pipeline
Phase 5 integrate — AI-recommended top-3 next actions (review / docs / wiki / dashboard / evolve) after Test, with an interactive loop
Team/solo delegation — Research and Implement always delegate to subagents; team mode runs 3-way parallel Research
Profile schema v3 — per-item ask each session; atomic write + flock + idempotent migration from v2

Full documentation →

deep-wiki

LLM-managed markdown wiki for persistent knowledge accumulation — an implementation of Karpathy's LLM Wiki philosophy.

Instead of re-discovering knowledge each time (RAG), the active agent incrementally builds and maintains a persistent wiki. When you add a new source, the LLM reads it, extracts key information, and integrates it into the existing wiki. The knowledge compounds over time.

Key commands

Command	Description
`/wiki-setup <path>`	Initialize wiki directory structure
`/wiki-ingest <source>`	Read a source (URL, file, text) and create/update wiki pages
`/wiki-query <question>`	Search the wiki and generate grounded answers with citations
`/wiki-lint`	Health check — schema violations, orphan pages, broken links
`/wiki-rebuild`	Regenerate index from page frontmatter

Architecture

Raw Sources  →  Wiki (markdown pages)  →  Schema (management rules)

Key features

Flat pages — tags replace categories; no broken links from moves
Auto-lint — runs after every ingest and rebuild
Auto-filing — query results that synthesize 2+ pages are filed back into the wiki
Obsidian-compatible — works as an Obsidian vault
Subagent delegation for page I/O — every ingest dispatches to a dedicated wiki-synthesizer-{analysis,worker} agent that owns source reading, create-vs-update judgment, and page writing; main session keeps only the small metadata footprint (index.json, log.jsonl, sources/*.yaml)
Trust-boundary closure — active synthesizer agents have Write/Edit physically removed from their tool manifests; main session is the sole writer under a single global lock
Auto-ingest hook — in Claude Code, the SessionStart hook detects modified .md files in the vault and triggers /wiki-ingest automatically; opt-in via auto_ingest: config block. Codex uses the explicit $deep-wiki:wiki-ingest skill entry.
M3 envelope adoption — index.json is wrapped in the cross-plugin envelope for traceability; legacy payload preserved verbatim for forward-compat

Full documentation →

deep-evolve

Autonomous Experimentation Protocol — specify a goal, and deep-evolve systematically improves your project through measured experiment loops.

Cross-plugin feedback is baked in: recurring findings from deep-review steer experiment direction, evolve-insights feed into deep-work research context, and deep-evolve triggers deep-review for pre-merge verification. AAR-inspired layers add entropy tracking, a legibility gate, a shortcut detector, and diagnose-retry. Virtual parallel exploration runs N=1..9 independent seed worktrees coordinated by an adaptive scheduler over a shared forum, with session-end synthesis merging per-seed results into a single best branch.

Key commands

Command	Description
`/deep-evolve init`	Initialize a new evolve session for a goal
`/deep-evolve`	Resume the active session
`/deep-evolve --review`	Trigger deep-review for pre-merge verification

Key features

Goal-driven experiment loops — each iteration measures fitness deltas
Cross-plugin feedback — consumes deep-review findings, emits insights to deep-work
Virtual parallel N-seed exploration — N=1..9 worktrees coordinated by an adaptive scheduler
AAR layers — entropy tracking, legibility gate, shortcut detector, diagnose-retry
M3 envelope adoption — evolve-receipt + evolve-insights as cross-plugin envelopes

Full documentation →

deep-review

Independent Evaluator — reviews AI coding agent output with a separate Claude reviewer. Claude Code uses the Agent tool path; Codex uses the Claude CLI reviewer bridge. Runs 3-way cross-model verification when the Codex plugin is installed.

Inspired by Anthropic's Harness Design — structurally eliminates self-approval bias through Generator-Evaluator separation.

Review pipeline

Collect → Contract Check → Deep Review → Verdict
                            ├─ Claude reviewer (always)
                            ├─ codex:review (if available)
                            └─ codex:adversarial-review (if available)

Key commands

Command	Description
`/deep-review`	Review current changes with an independent Opus evaluator
`/deep-review --contract`	Sprint Contract-based verification
`/deep-review --entropy`	Entropy scan (code drift, pattern mismatch)
`/deep-review --respond`	Evidence-based response to review findings (Phase 6 delegated to a Sonnet subagent)
`/deep-review init`	Initialize project review rules

Key features

Independent evaluator — separate Claude reviewer with no Generator context
Codex Claude bridge — Codex runs the Claude reviewer through the plugin's CLI bridge instead of substituting a Codex subagent
Cross-model verification — 3-way parallel review when Codex is installed
Phase 6 subagent delegation — /deep-review --respond IMPLEMENT phase runs in a dedicated phase6-implementer subagent per severity group; main session keeps Phase 1~5 judgment and verifies with fail-closed content-aware delta + pathspec-limited commit
Sprint Contract — structured success criteria verification
Environment adaptation — works in git / non-git, with / without Codex

Full documentation →

deep-docs

Document Gardening + Authoring Agent — validates freshness and auto-repairs agent instruction documents like CLAUDE.md and AGENTS.md, and (v1.4.0) generates or restructures missing/thin docs such as ARCHITECTURE.md.

Inspired by OpenAI's Harness Engineering — "a doc-gardening agent runs repeatedly, finds stale docs, and opens fix PRs."

Key commands

Command	Description
`/deep-docs scan`	Detect stale references, moved paths, outdated examples, and missing/thin doc gaps
`/deep-docs garden`	Auto-fix with user confirmation; generate·restructure docs for detected authoring gaps
`/deep-docs audit`	Quantitative document health report

Key features

Path-scoped freshness — checks if referenced code paths changed after the doc was last updated
Auto-fix / audit-only separation — mechanical fixes only; subjective checks are audit-only
Document authoring (v1.4.0) — detects missing/thin docs (missing-doc / thin-doc gaps) and generates or restructures CLAUDE.md / AGENTS.md / ARCHITECTURE.md
Durable scan artifact — .deep-docs/last-scan.json with provenance (HEAD SHA, branch)
Scoring — size, freshness, reference accuracy, duplication

Full documentation →

deep-dashboard

Cross-plugin Harness Diagnostics — assesses codebase harnessability and aggregates sensor results from deep-work, deep-review, and deep-docs into a unified view.

Built on the Harness Engineering framework — the dashboard closes the feedback loop by measuring harness effectiveness across the entire deep-suite ecosystem.

Key commands

Command	Description
`/deep-harnessability`	Assess codebase harnessability — 6 dimensions, 0-10 score with recommendations
`/deep-harness-dashboard`	Unified view — health, fitness, sessions, effectiveness, suggested actions

Harnessability dimensions

Dimension	Weight	What it measures
Type Safety	25%	tsconfig strict, mypy strict, type hints
Module Boundaries	20%	dep-cruiser config, organized src, entry points
Test Infrastructure	20%	test framework, test files, coverage config
Sensor Readiness	15%	linter, type checker, lock file
Linter & Formatter	10%	eslint/ruff config, prettier/biome
CI/CD	10%	CI config, CI runs tests

Key features

Health status — drift sensor results (dead-export, stale-config, dep-vuln, coverage-trend)
Fitness rules — architecture fitness function pass/fail
Session quality — recent 3 sessions average
Effectiveness score — weighted aggregate (0-10) with not_applicable redistribution
Action routing — suggested next action per finding
Markdown export — optional report file generation with user approval

Full documentation →

Suite Extensions (Sidecar Manifest)

Cross-plugin metadata that does not fit the official marketplace schemas lives in a sidecar at .claude-plugin/suite-extensions.json, validated by schemas/suite-extensions.schema.json (JSON Schema Draft 2020-12). Claude Code reads .claude-plugin/marketplace.json; Codex reads the mirrored .agents/plugins/marketplace.json.

Why a sidecar? marketplace.json is closed (additionalProperties: false) — adding suite-only fields like runtime, capabilities, artifacts would either be ignored or rejected by claude plugin validate. The sidecar pattern keeps the official manifest clean while suite tooling consumes the extra metadata.

What's in it:

suite.name (must equal marketplace.json.name), harness_taxonomy, telemetry_namespace
plugins.<name> — per-plugin runtime, capabilities, artifacts.{writes,reads}, hooks_active, optional hooks_intentionally_empty_reason, optional consumer_only
data_flow[] — producer → consumer edges with display-only via labels (machine-readable cross-plugin trace lives in the M3 artifact envelope)

Schema versioning is locked at 1.0. Forward-compat additions go through x-* patternProperties at root, suite, and plugin entry levels. Breaking changes require a new schema file (suite-extensions-v2.schema.json). See schemas/README.md §Schema versioning.

A second schema, artifact-envelope.schema.json, defines the common envelope plugins adopt for cross-plugin traceability and deep-dashboard aggregation.

Validation:

npm install                       # ajv + ajv-formats (devDeps only)
npm test                          # unit + spawnSync CLI tests
npm run validate                  # validates the real .claude-plugin/suite-extensions.json

The validator runs in two phases — JSON Schema (Phase 1) and post-schema referential integrity for data_flow.from/to (Phase 2). Phase is reported via stderr prefix; exit code is 0/1/2 (valid / validation fail / IO-usage-compile error).

License

MIT

Deep Suite

Plugins

Installation

Claude Code

Codex

Harness Engineering Architecture

2x2 Matrix: where each plugin lives

Plugin data flow

Integrated workflow

deep-work

Key commands

Workflow phases

Key features

deep-wiki

Key commands

Architecture

Key features

deep-evolve

Key commands

Key features

deep-review

Review pipeline

Key commands

Key features

deep-docs

Key commands

Key features

deep-dashboard

Key commands

Harnessability dimensions

Key features

Suite Extensions (Sidecar Manifest)

License

Reviews (0)