claude-deep-suite

agent
Security Audit
Fail
Health Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code Fail
  • rm -rf — Recursive force deletion command in examples/hooks-strict-mode/.claude/settings.json
Permissions Pass
  • Permissions — No dangerous permissions requested

No AI report is available for this listing yet.

SUMMARY

Harness layer for Claude Code & Codex: plan-first development, independent review, durable long-running orchestration, and knowledge capture — nine composable plugins.

README.md

English | 한국어

Deep Suite

License: MIT Plugins Runtimes

Cross-runtime plugin marketplace bundling nine plugins for structured development, knowledge management, autonomous experimentation, independent code review, documentation gardening, harness diagnostics, cross-project memory, goal-driven long-running execution, and loop-engineered orchestration across Claude Code and Codex.

Built on the Harness Engineering framework (Böckeler/Fowler, 2026) — Agent = Model + Harness — mapping Guides (feedforward) × Sensors (feedback) across Computational and Inferential control.

Plugins

Plugin Version Description
deep-work 6.9.0 Evidence-Driven Development Protocol
deep-wiki 1.7.0 LLM-native knowledge wiki
deep-evolve 3.4.2 Autonomous Experimentation Protocol
deep-review 1.12.2 Independent Evaluator
deep-docs 1.4.0 Document gardening + authoring
deep-dashboard 1.3.7 Cross-plugin harness diagnostics + suite telemetry
deep-memory 0.3.2 Cross-project semantic memory
deep-goal 1.0.1 Goal condition compiler
deep-loop 1.0.0 Loop Engineering control plane over the deep-suite

The table is auto-generated by scripts/generate-reference-sections.js from the marketplace manifest + pinned plugin.json.version. Narrative content (including this paragraph) is hand-curated.

Each plugin lives in its own Git repository: github.com/Sungmin-Cho/claude-deep-{name}. The claude-deep-* repository and marketplace identifiers are retained for compatibility with existing users; Codex support is exposed through the Codex marketplace mirror and each plugin's .codex-plugin/plugin.json.

Installation

Claude Code

# Install all plugins
/plugin marketplace add Sungmin-Cho/claude-deep-suite
/plugin install deep-work@Sungmin-Cho-claude-deep-suite
/plugin install deep-wiki@Sungmin-Cho-claude-deep-suite
/plugin install deep-evolve@Sungmin-Cho-claude-deep-suite
/plugin install deep-review@Sungmin-Cho-claude-deep-suite
/plugin install deep-docs@Sungmin-Cho-claude-deep-suite
/plugin install deep-dashboard@Sungmin-Cho-claude-deep-suite

# Or install only what you need — they work independently
/plugin install deep-work@Sungmin-Cho-claude-deep-suite

Codex

codex plugin marketplace add Sungmin-Cho/claude-deep-suite

Codex reads the native marketplace mirror at .agents/plugins/marketplace.json; each pinned plugin then exposes its Codex skills through .codex-plugin/plugin.json and skills/<skill>/SKILL.md.

Claude Code examples in this README use slash commands. In Codex, invoke the corresponding skill alias, for example:

Plugin Claude Code Codex
deep-work /deep-work "task" $deep-work:deep-work "task"
deep-evolve /deep-evolve $deep-evolve:deep-evolve
deep-review /deep-review-loop $deep-review:deep-review-loop
deep-docs /deep-docs scan $deep-docs:deep-docs scan
deep-wiki /wiki-ingest <source> $deep-wiki:wiki-ingest <source>
deep-dashboard /deep-harness-dashboard $deep-dashboard:deep-harness-dashboard

Harness Engineering Architecture

Deep Suite implements the Harness Engineering framework: each plugin occupies a specific role in the 2×2 matrix of Guides × Sensors, across Computational and Inferential control.

2x2 Matrix: where each plugin lives

                +---------------------------+---------------------------+
                |  Computational            |  Inferential              |
+---------------+---------------------------+---------------------------+
|               |                           |                           |
|  Guides       |  deep-work                |  deep-work                |
|  (Feedforward |  +- Phase Guard hook      |  +- research/plan/brain   |
|   Control)    |  +- TDD state machine     |  +- Sprint Contract       |
|               |  +- Topology templates    |                           |
|               |                           |  deep-wiki                |
|               |                           |  +- Persistent knowledge  |
|               |                           |                           |
|               |                           |  deep-docs                |
|               |                           |  +- Document freshness    |
+---------------+---------------------------+---------------------------+
|               |                           |                           |
|  Sensors      |  deep-work                |  deep-review              |
|  (Feedback    |  +- Linters + typecheck   |  +- Opus code review      |
|   Control)    |  +- Coverage + mutation   |  +- 3-way cross-model     |
|               |  +- 4 drift sensors       |  +- SOLID + entropy       |
|               |  +- Fitness rules         |                           |
|               |  +- review-check sensor   |  deep-work                |
|               |                           |  +- Drift check           |
|               |  deep-docs                |                           |
|               |  +- Doc freshness scan    |                           |
|               |                           |                           |
|               |  deep-dashboard           |                           |
|               |  +- Harnessability        |                           |
|               |  +- Effectiveness         |                           |
+---------------+---------------------------+---------------------------+

Plugin data flow

 deep-work ------- receipts -------> deep-dashboard (collector)
    |                                    |
    +-- health_report ----------------> deep-review (fitness-aware review)
    |                                    |
    +-- fitness.json <----------------> deep-review (rule consumption)
    |                                    |
 deep-docs ---- last-scan.json ---> deep-dashboard (collector)
    |                                    |
 deep-evolve -- evolve-receipt ----> deep-dashboard (collector)
    |                                    |
 deep-dashboard                          v
    +-- harnessability ---------------> deep-work Phase 1 (research context)
    +-- effectiveness ----------------> user (CLI report + optional markdown)

 deep-review -- recurring-findings -> deep-evolve (experiment steering)
 deep-evolve -- evolve-insights ---> deep-work (research context)
 deep-evolve -- review trigger ----> deep-review (pre-merge verification)

Integrated workflow

Each plugin works independently, but the real power comes from using them together:

Plugin Core question When to use
deep-work "How do I design and implement this?" Every code task — features, bugs, refactors
deep-evolve "Can I automatically make this better?" Performance optimization, test improvement
deep-review "Is this code actually good?" Pre-PR independent verification
deep-docs "Do docs match the code?" Post-change documentation sync
deep-wiki "How do I preserve what I learned?" Knowledge accumulation across sessions
deep-dashboard "Is the harness working well?" Project health diagnosis

Examples by complexity:

# Quick fix (30 min) — deep-work alone, skip Phase 5
/deep-work --skip-integrate "fix login 500 error"

# Medium feature (2-4 hours) — Phase 5 orchestrates review/docs/wiki
/deep-work "add Stripe payment integration"
# → Phase 5 recommends /deep-review, /deep-docs scan, /wiki-ingest (top-3 loop)

# Large optimization (half-day+) — full plugin stack
/deep-harness-dashboard                                  # diagnose project health
/deep-evolve "achieve 90% test coverage"                 # autonomous experiments
/wiki-ingest .deep-evolve/<session-id>/                  # preserve learnings

For detailed scenarios see the Integrated Workflow Guide.


deep-work

Evidence-Driven Development Protocol — a single-command auto-flow orchestration that enforces structured, evidence-based software development.

/deep-work "task" in Claude Code, or $deep-work:deep-work "task" in Codex, runs the entire Brainstorm → Research → Plan → Implement → Test → Integrate pipeline automatically. Claude Code uses hooks for physical phase enforcement; Codex uses the same skill protocol and verification gates through the Codex skill surface.

Key commands

Command Description
/deep-work <task> Auto-flow orchestration — runs the full pipeline. Plan approval is the only required interaction.
/deep-work --skip-integrate <task> Same, but skip Phase 5 and go straight to /deep-finish
/deep-integrate Manual Phase 5 — top-3 next actions from installed plugin artifacts
/deep-status Unified view — progress, report, receipts, history, assumptions
/deep-debug Systematic debugging with root cause investigation
/deep-research Manual Phase 1 — deep codebase analysis
/deep-plan Manual Phase 2 — slice-based implementation planning
/deep-implement Manual Phase 3 — TDD-enforced slice execution
/deep-test Phase 4 — verification with quality gates
/deep-sensor-scan Computational sensor scan — linter, type checker, coverage
/deep-mutation-test Mutation testing — AI-generated test quality verification

Workflow phases

Phase 0  Brainstorm    Design exploration — "why before how" (skippable)
Phase 1  Research      Deep codebase analysis and documentation
Phase 2  Plan          Slice-based implementation plan (requires user approval)
Phase 3  Implement     TDD-enforced execution — failing test → code → receipt
Phase 4  Test          Receipt check, spec compliance, quality gates
Phase 5  Integrate     Reads installed plugin artifacts → LLM ranks next actions
                       → user picks from top-3 (≤5 rounds, skippable)

Key features

  • Phase-locked file editing — code changes blocked outside Phase 3
  • TDD enforcement — failing test first, then implementation
  • Receipt-based evidence — every slice collects proof of completion (M3 cross-plugin envelope)
  • Quality gates — drift check, SOLID review, insight analysis, Sensor Clean, Mutation Score
  • Computational sensors — auto-run linter / type checker / coverage with self-correction loop (SENSOR_RUN → SENSOR_FIX → SENSOR_CLEAN)
  • Mutation testing — auto-verify AI-generated test quality; survived mutants trigger automatic test regeneration (up to 3 rounds)
  • Slice review — per-slice two-stage independent review (spec compliance + code quality) immediately after the sensor pipeline
  • Phase 5 integrate — AI-recommended top-3 next actions (review / docs / wiki / dashboard / evolve) after Test, with an interactive loop
  • Team/solo delegation — Research and Implement always delegate to subagents; team mode runs 3-way parallel Research
  • Profile schema v3 — per-item ask each session; atomic write + flock + idempotent migration from v2

Full documentation →


deep-wiki

LLM-managed markdown wiki for persistent knowledge accumulation — an implementation of Karpathy's LLM Wiki philosophy.

Instead of re-discovering knowledge each time (RAG), the active agent incrementally builds and maintains a persistent wiki. When you add a new source, the LLM reads it, extracts key information, and integrates it into the existing wiki. The knowledge compounds over time.

Key commands

Command Description
/wiki-setup <path> Initialize wiki directory structure
/wiki-ingest <source> Read a source (URL, file, text) and create/update wiki pages
/wiki-query <question> Search the wiki and generate grounded answers with citations
/wiki-lint Health check — schema violations, orphan pages, broken links
/wiki-rebuild Regenerate index from page frontmatter

Architecture

Raw Sources  →  Wiki (markdown pages)  →  Schema (management rules)

Key features

  • Flat pages — tags replace categories; no broken links from moves
  • Auto-lint — runs after every ingest and rebuild
  • Auto-filing — query results that synthesize 2+ pages are filed back into the wiki
  • Obsidian-compatible — works as an Obsidian vault
  • Subagent delegation for page I/O — every ingest dispatches to a dedicated wiki-synthesizer-{analysis,worker} agent that owns source reading, create-vs-update judgment, and page writing; main session keeps only the small metadata footprint (index.json, log.jsonl, sources/*.yaml)
  • Trust-boundary closure — active synthesizer agents have Write/Edit physically removed from their tool manifests; main session is the sole writer under a single global lock
  • Auto-ingest hook — in Claude Code, the SessionStart hook detects modified .md files in the vault and triggers /wiki-ingest automatically; opt-in via auto_ingest: config block. Codex uses the explicit $deep-wiki:wiki-ingest skill entry.
  • M3 envelope adoptionindex.json is wrapped in the cross-plugin envelope for traceability; legacy payload preserved verbatim for forward-compat

Full documentation →


deep-evolve

Autonomous Experimentation Protocol — specify a goal, and deep-evolve systematically improves your project through measured experiment loops.

Cross-plugin feedback is baked in: recurring findings from deep-review steer experiment direction, evolve-insights feed into deep-work research context, and deep-evolve triggers deep-review for pre-merge verification. AAR-inspired layers add entropy tracking, a legibility gate, a shortcut detector, and diagnose-retry. Virtual parallel exploration runs N=1..9 independent seed worktrees coordinated by an adaptive scheduler over a shared forum, with session-end synthesis merging per-seed results into a single best branch.

Key commands

Command Description
/deep-evolve init Initialize a new evolve session for a goal
/deep-evolve Resume the active session
/deep-evolve --review Trigger deep-review for pre-merge verification

Key features

  • Goal-driven experiment loops — each iteration measures fitness deltas
  • Cross-plugin feedback — consumes deep-review findings, emits insights to deep-work
  • Virtual parallel N-seed exploration — N=1..9 worktrees coordinated by an adaptive scheduler
  • AAR layers — entropy tracking, legibility gate, shortcut detector, diagnose-retry
  • M3 envelope adoption — evolve-receipt + evolve-insights as cross-plugin envelopes

Full documentation →


deep-review

Independent Evaluator — reviews AI coding agent output with a separate Claude reviewer. Claude Code uses the Agent tool path; Codex uses the Claude CLI reviewer bridge. Runs 3-way cross-model verification when the Codex plugin is installed.

Inspired by Anthropic's Harness Design — structurally eliminates self-approval bias through Generator-Evaluator separation.

Review pipeline

Collect → Contract Check → Deep Review → Verdict
                            ├─ Claude reviewer (always)
                            ├─ codex:review (if available)
                            └─ codex:adversarial-review (if available)

Key commands

Command Description
/deep-review Review current changes with an independent Opus evaluator
/deep-review --contract Sprint Contract-based verification
/deep-review --entropy Entropy scan (code drift, pattern mismatch)
/deep-review --respond Evidence-based response to review findings (Phase 6 delegated to a Sonnet subagent)
/deep-review init Initialize project review rules

Key features

  • Independent evaluator — separate Claude reviewer with no Generator context
  • Codex Claude bridge — Codex runs the Claude reviewer through the plugin's CLI bridge instead of substituting a Codex subagent
  • Cross-model verification — 3-way parallel review when Codex is installed
  • Phase 6 subagent delegation/deep-review --respond IMPLEMENT phase runs in a dedicated phase6-implementer subagent per severity group; main session keeps Phase 1~5 judgment and verifies with fail-closed content-aware delta + pathspec-limited commit
  • Sprint Contract — structured success criteria verification
  • Environment adaptation — works in git / non-git, with / without Codex

Full documentation →


deep-docs

Document Gardening + Authoring Agent — validates freshness and auto-repairs agent instruction documents like CLAUDE.md and AGENTS.md, and (v1.4.0) generates or restructures missing/thin docs such as ARCHITECTURE.md.

Inspired by OpenAI's Harness Engineering — "a doc-gardening agent runs repeatedly, finds stale docs, and opens fix PRs."

Key commands

Command Description
/deep-docs scan Detect stale references, moved paths, outdated examples, and missing/thin doc gaps
/deep-docs garden Auto-fix with user confirmation; generate·restructure docs for detected authoring gaps
/deep-docs audit Quantitative document health report

Key features

  • Path-scoped freshness — checks if referenced code paths changed after the doc was last updated
  • Auto-fix / audit-only separation — mechanical fixes only; subjective checks are audit-only
  • Document authoring (v1.4.0) — detects missing/thin docs (missing-doc / thin-doc gaps) and generates or restructures CLAUDE.md / AGENTS.md / ARCHITECTURE.md
  • Durable scan artifact.deep-docs/last-scan.json with provenance (HEAD SHA, branch)
  • Scoring — size, freshness, reference accuracy, duplication

Full documentation →


deep-dashboard

Cross-plugin Harness Diagnostics — assesses codebase harnessability and aggregates sensor results from deep-work, deep-review, and deep-docs into a unified view.

Built on the Harness Engineering framework — the dashboard closes the feedback loop by measuring harness effectiveness across the entire deep-suite ecosystem.

Key commands

Command Description
/deep-harnessability Assess codebase harnessability — 6 dimensions, 0-10 score with recommendations
/deep-harness-dashboard Unified view — health, fitness, sessions, effectiveness, suggested actions

Harnessability dimensions

Dimension Weight What it measures
Type Safety 25% tsconfig strict, mypy strict, type hints
Module Boundaries 20% dep-cruiser config, organized src, entry points
Test Infrastructure 20% test framework, test files, coverage config
Sensor Readiness 15% linter, type checker, lock file
Linter & Formatter 10% eslint/ruff config, prettier/biome
CI/CD 10% CI config, CI runs tests

Key features

  • Health status — drift sensor results (dead-export, stale-config, dep-vuln, coverage-trend)
  • Fitness rules — architecture fitness function pass/fail
  • Session quality — recent 3 sessions average
  • Effectiveness score — weighted aggregate (0-10) with not_applicable redistribution
  • Action routing — suggested next action per finding
  • Markdown export — optional report file generation with user approval

Full documentation →


Suite Extensions (Sidecar Manifest)

Cross-plugin metadata that does not fit the official marketplace schemas lives in a sidecar at .claude-plugin/suite-extensions.json, validated by schemas/suite-extensions.schema.json (JSON Schema Draft 2020-12). Claude Code reads .claude-plugin/marketplace.json; Codex reads the mirrored .agents/plugins/marketplace.json.

Why a sidecar? marketplace.json is closed (additionalProperties: false) — adding suite-only fields like runtime, capabilities, artifacts would either be ignored or rejected by claude plugin validate. The sidecar pattern keeps the official manifest clean while suite tooling consumes the extra metadata.

What's in it:

  • suite.name (must equal marketplace.json.name), harness_taxonomy, telemetry_namespace
  • plugins.<name> — per-plugin runtime, capabilities, artifacts.{writes,reads}, hooks_active, optional hooks_intentionally_empty_reason, optional consumer_only
  • data_flow[] — producer → consumer edges with display-only via labels (machine-readable cross-plugin trace lives in the M3 artifact envelope)

Schema versioning is locked at 1.0. Forward-compat additions go through x-* patternProperties at root, suite, and plugin entry levels. Breaking changes require a new schema file (suite-extensions-v2.schema.json). See schemas/README.md §Schema versioning.

A second schema, artifact-envelope.schema.json, defines the common envelope plugins adopt for cross-plugin traceability and deep-dashboard aggregation.

Validation:

npm install                       # ajv + ajv-formats (devDeps only)
npm test                          # unit + spawnSync CLI tests
npm run validate                  # validates the real .claude-plugin/suite-extensions.json

The validator runs in two phases — JSON Schema (Phase 1) and post-schema referential integrity for data_flow.from/to (Phase 2). Phase is reported via stderr prefix; exit code is 0/1/2 (valid / validation fail / IO-usage-compile error).


License

MIT

Reviews (0)

No results found