claude-markdown-health-check
Health Gecti
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 25 GitHub stars
Code Basarisiz
- rm -rf — Recursive force deletion command in commands/claude-markdown-health-check/evals/05-unflagged-destructive.json
Permissions Gecti
- Permissions — No dangerous permissions requested
Bu listing icin henuz AI raporu yok.
Audit your Claude Code .claude/ setup — scans skills, commands, hooks, agents, and settings for dead refs, weak triggers, token bloat, and skill-listing-budget overflow.
/claude-markdown-health-check
Tracks the live Claude Code spec — thresholds are fetched from the official docs and cached weekly, so the audit never drifts from the product.
A .claude/ ecosystem auditor for Claude Code. One slash command scans your skills, commands, hooks, agents, settings, plugins, and auto-memory — across both the user tree (~/.claude) and the project tree (./.claude) — and prints one flat, prioritized health report.
It finds dead references, weak or mismatched triggers, token bloat, skill-listing-budget overflow, frontmatter violations, dormant skills, failing hooks, drifted permissions, dead memory links, and per-session context bloat.
It reports first and waits. Nothing is edited, moved, or deleted until you reply naming which findings to fix — that autonomy gate is built into the command.
Demo
Demo coming soon. Run
/claude-markdown-health-checkto see the scorecard and area-grouped findings render in your own terminal.
What it checks
| Area | Examples of findings |
|---|---|
| Skills | missing or oversized description, frontmatter name ≠ directory, triggers that don't match real usage, oversized SKILL.md with no references/, missing Examples / Troubleshooting sections, dead internal paths |
| Skill-listing budget | cumulative description + when_to_use block exceeding Claude Code's 1%-of-context budget; low-relevance and duplicate-domain skills |
| Skill usage | dormant skills (no fires in 30d), never-fired skills, misfiring skills (loaded but no follow-through), orphan ledger entries |
| Skill–tool contract | tools declared in allowed-tools but never called, tools called but not declared |
| Frontmatter schema | description too short, model not in whitelist (opus/sonnet/haiku/fable families), allowed-tools malformed, unknown fields |
| Hooks | files on disk not registered in settings.json, duplicate logic, suspicious timeouts, matchers that match no real tool |
| Hook reliability | high failure-rate hooks, hooks registered but never fired, event-type mismatches |
| Hook safety | hook script with no #! shebang, a script that emits a block/deny decision but exits 1 (non-blocking) instead of 2, eval of dynamic input, an http hook leaking the whole environment (auth header, no allowedEnvVars) |
| Agents | triggers unreachable from CLAUDE.md, overlapping agents, agents on disk never spawned in 30d |
| Agent frontmatter | subagent schema violations — bad model/color/permissionMode/tools value, missing description, duplicate agent name, permissionMode: bypassPermissions, plugin agents declaring silently-ignored hooks/mcpServers/permissionMode |
| Settings | malformed JSON, duplicate JSON keys, duplicate array entries, MCP servers missing from preApprovedTools, over-broad Bash patterns, stale reminders, risky security keys (defaultMode: bypassPermissions, enableAllProjectMcpServers: true) |
| Permission hygiene | dead allowlist entries (zero grants), over-broad :* patterns, name collisions between commands/ and skills/ |
| Plugins & MCP | installed_plugins.json entries with missing installPath, missing plugin.json manifest, version drift between manifest and on-disk, deprecated sse MCP transport in .mcp.json |
| Plugin structure | (when auditing a plugin root) component dir misplaced inside .claude-plugin/, missing or non-semver version, component path not relative-with-./, dangling marketplace.json source |
| Output styles | outputStyle setting naming a non-existent (and non-built-in) style |
| Cross-references | dead paths in settings.json / CLAUDE.md / skill references/, orphaned guides and patterns, missing triggers |
| Reference graph | cycles in references/*.md, depth exceeding MAX_REF_DEPTH, orphan reference files (in-degree 0) |
| Memory | MEMORY.md over the loaded-slice line/byte budget |
| Memory hygiene | dead - [Title](file.md) links, orphan files in memory dir, duplicate entries, stale dates (>365d) |
| CLAUDE.md imports | dead @path imports, @import chains past the 4-hop limit, a CLAUDE.local.md not covered by .gitignore |
| Context trend (Deep depth) | low cache-hit sessions, output bloat per session |
| Cross-session patterns (Deep depth) | recurring tool denials, recurring user corrections, missing skill gaps (subagents repeatedly spawned with no matching skill) |
Thresholds — line counts, description caps, budget fractions, hook timeouts — are pulled live from the official Anthropic docs and cached for a week, so the audit tracks the spec instead of hardcoding it.
Severity tiers
- Critical — broken; blocks correct behavior (dead refs, unregistered hooks, budget overflow, dormant-and-expensive skills, broken plugin refs, dead memory links)
- Structural — works but should be reorganized (weak descriptions, orphans, trigger mismatches, never-fired skills, failing hooks, recurring denials)
- Hygiene — cosmetic / token efficiency (over-broad patterns, stale reminders, low cache-hit, unused declared tools)
- Discovery — additive suggestions surfaced from the current session (new rules, patterns, triggers)
The chat report groups findings by area (Skills, Hooks, Settings & Permissions, Memory, References, Plugins, CLAUDE.md, …) under a scorecard, each rendered as a plain-language line with a [must-fix] / [should] / [polish] chip; the canonical tag stays as a trailing machine code (e.g. · DEAD-REF). See references/report-format.md.
Before the report prints, every judgment finding (the heuristic calls — weak descriptions, orphaned guides, stale CLAUDE.md commands, …) passes an evidence-grounding gate:
it survives only if it can be grounded in a quoted artifact on disk (a line, a resolved-or-missing path, a metric), and [must-fix] / [should] findings then carry that proof as an Evidence: locator.
A finding that can't be grounded is downgraded to a non-actionable [OBSERVATION] or dropped — so a naive false positive (flagging a guide that is referenced, or a command that does exist) never reaches the report.
Deterministic scanner findings skip the gate — the script is already the proof. See references/finding-verification.md.
Install
Plugin (recommended)
In Claude Code, add the marketplace and install:
/plugin marketplace add ncoevoet/claude-markdown-health-check
/plugin install claude-markdown-health-check@ncoevoet-health-check
/claude-markdown-health-check is available right away.
Update with /plugin update claude-markdown-health-check@ncoevoet-health-check, remove with /plugin uninstall claude-markdown-health-check@ncoevoet-health-check.
CLI equivalents: claude plugin marketplace add ncoevoet/claude-markdown-health-check then claude plugin install claude-markdown-health-check@ncoevoet-health-check.
The plugin is self-contained — the command resolves its scripts and reference docs from ${CLAUDE_PLUGIN_ROOT}, so make install is not needed.
Manual (make install)
For hacking on the command itself, install it straight into ~/.claude/:
git clone https://github.com/ncoevoet/claude-markdown-health-check.git
cd claude-markdown-health-check
make install
make install copies five things into ~/.claude/:
commands/claude-markdown-health-check.md→~/.claude/commands/- the reference docs →
~/.claude/claude-markdown-health-check/references/ validate-skills.sh→~/.claude/commands/scripts/scan-graph.sh→~/.claude/commands/scripts/scan-history.sh→~/.claude/commands/scripts/
make uninstall removes the command and its reference tree. The bundled scripts are left in place — they live in a shared directory and other commands may depend on them.
make smoke-scan refreshes both scan caches and prints their meta blocks — useful for verifying install before running the full audit.
Either install works. As a plugin, the command resolves its scripts and references from
${CLAUDE_PLUGIN_ROOT}(the installed plugin directory), so it is self-contained — nomake installstep is needed. Withmake installit falls back to the~/.claude/paths the Makefile creates. Pick the plugin for a one-step install,make installfor local development on the command.
This command works in Claude Code only — it depends on filesystem access and bash.
Use
Inside Claude Code:
/claude-markdown-health-check
| Argument | Effect |
|---|---|
| (empty) | Audits both ~/.claude and any ./.claude; depth auto-selected from ecosystem size |
quick |
Fast pass — validator + budget audit + frontmatter / name-collision checks + spot-check 3 highest-risk skills |
deep |
Full audit including cross-session pattern mining and per-session token trend |
--refresh |
Re-fetch threshold values from the Anthropic docs instead of using the week-long cache |
--compress-bodies |
Opt-in caveman:lite rewrite of skill / rule / reference bodies that pass the filler-density gate; requires the caveman plugin |
--window-days=N |
Override the 30-day window used by history-driven phases (7, 9, 15, 16, 19, 22, 23) |
| any other text | Treated as a focus message — that topic becomes the #1 priority, and the session is scanned for violations of it |
Examples:
/claude-markdown-health-check
/claude-markdown-health-check quick
/claude-markdown-health-check deep
/claude-markdown-health-check --refresh
/claude-markdown-health-check --compress-bodies
/claude-markdown-health-check --window-days=7
/claude-markdown-health-check check that every skill has a Troubleshooting section
The report prints in chat. Reply naming the findings to fix and the command applies them; until then it touches nothing.
Configuration (optional)
Drop a markdown-health-check.json in ~/.claude/ (user defaults) and/or ./.claude/ (project overrides) to make tuning persistent. All keys are optional; a CLI argument always wins over the file. Precedence is CLI > project > user > default.
| Key | Default | Effect |
|---|---|---|
windowDays |
30 |
History window for the telemetry phases (same as --window-days=N) |
depth |
"auto" |
Depth floor — "auto", "quick", "standard", "deep" (a quick/deep arg overrides it) |
verifyFindings |
true |
Run the evidence-grounding gate over judgment findings; false emits them unverified (debug) |
skipPhases |
[] |
Phase numbers to skip (Phase 5, the deterministic spine, never skips) |
compressBodies |
false |
Persistent equivalent of --compress-bodies |
severityFloor |
"polish" |
Lowest chip to report — "should" hides [polish]; "must-fix" hides [polish] + [should] |
maxFindingsPerDomain |
0 |
Per-domain finding cap (0 = unlimited); excess is summarised, never silently dropped |
guidanceCacheTtlDays |
7 |
TTL before the threshold cache re-fetches the Anthropic docs |
{ "windowDays": 14, "severityFloor": "should", "skipPhases": [23] }
Full per-key rationale: references/config-keys.md.
How it works — phases
The phase sequence runs flat from 1 to 25, renumbered from the previous 5a / 5b / 5.5 / 7a scheme — see the Migration note below. Phase 26 (Output Styles) was added later; it runs in the scan band and feeds the Phase 24 report like the other scanners.
| Phase | What it does | Depth |
|---|---|---|
| 1 — Load Config + Thresholds | Reads optional markdown-health-check.json, then fetches skill / memory / settings / hooks limits from the Anthropic docs; caches at ~/.claude/.cache/claude-markdown-health-check-guidance.json |
All |
| 2 — Plugin + MCP Integrity | installed_plugins.json vs on-disk cache: broken refs, missing manifests, version drift; deprecated sse MCP transport in .mcp.json |
Standard + Deep |
| 3 — Select Depth | Picks Quick / Standard / Deep from the argument and the size of your ecosystem | All |
| 4 — Focus + History | Reads the focus message (if any) and mines the current session for recurring bugs, corrections, uncovered patterns | Standard + Deep |
| 5 — Run validate-skills.sh | Deterministic layer: name regex, line counts, voice, TOC, description sizes, frontmatter schema, name collisions | All |
| 6 — Skill Listing Budget | Cumulative skill-listing block vs Claude Code's runtime budget | Standard + Deep |
| 7 — Skill Usage Metrics | Per-skill 30-day invocation, dormancy, misfiring, orphan detection from ~/.claude/projects/**/*.jsonl + ~/.claude.json#skillUsage |
Standard + Deep |
| 8 — Skill Semantic Audit | Judgment-call checks the validator can't do — trigger quality, structure, resolvability | Standard + Deep |
| 9 — Skill–Tool Contract | Declared allowed-tools vs actually-called tools, per skill |
Standard + Deep |
| 10 — Frontmatter Strict Schema | Description min length, model whitelist, allowed-tools syntax, unknown fields (runs inside Phase 5) |
All |
| 11 — Reference Graph Health | Cycles, depth violations, orphan ref files | Standard + Deep |
| 12 — CLAUDE.md Content Quality | Whether each CLAUDE.md actually helps — stale commands, generic boilerplate, thin coverage | Standard + Deep |
| 13 — Body Compression | Detects high-filler bodies; --compress-bodies opens the opt-in caveman:lite rewrite path |
Standard + Deep |
| 14 — Hooks, Agents, Settings | Registration, duplication, timeouts, broad patterns, stale reminders, risky security keys (bypassPermissions, enableAllProjectMcpServers) |
Standard + Deep |
| 15 — Permission Allowlist Hygiene | Dead entries, over-broad :* patterns |
Standard + Deep |
| 16 — Hook Latency + Reliability | Per-hook failure-rate, never-fired hooks, event-type mismatches | Standard + Deep |
| 17 — Cross-references + Orphans | Dead paths, orphaned guides/patterns, missing triggers, memory-index overflow | Standard + Deep |
| 18 — Orphan Repurposing | For each orphan, propose repurposing into an existing skill before deletion | Standard + Deep |
| 19 — Cross-Session Pattern Mining | Recurring denials, correction clusters, missing-skill gaps | Deep |
| 20 — Auto-memory Hygiene | Dead - [Title](file.md) links, orphan files, duplicates, stale dates |
Standard + Deep |
| 21 — Name Collisions | Same basename in commands/ and skills/ (runs inside Phase 5) |
All |
| 22 — Agents Never-Spawned | Agents on disk never invoked in window | Standard + Deep |
| 23 — Token Trend | Per-session input/output/cache tokens — low cache-hit, output bloat | Deep |
| 24 — Report | A mandatory pre-print pass first grounds every judgment finding (drop / downgrade / keep-with-Evidence:), then renders a scorecard + findings grouped by area, each a plain-language line with a must-fix / should / polish chip and a trailing tag code; optional summary blocks per active phase |
All |
| 25 — Post-Report Menu | Pick a fix scope, apply, re-validate, loop until done | All |
| 26 — Output Styles | .claude/output-styles/*.md vs the selected outputStyle: flags a selection with no matching style file (runs in the scan band, feeds the Phase 24 report) |
Standard + Deep |
Migration note
If you previously referred to phases by the old letter scheme, here is the mapping to the new flat 1..25 numbering:
| Old | New |
|---|---|
| 1 (Load Thresholds) | 1 |
| 2 (Select Depth) | 3 |
| 3 (Focus + History) | 4 |
| 4 (validate-skills.sh) | 5 |
| 5a (Listing Budget) | 6 |
| 5 (Skill Semantic) | 8 |
| 5b (CLAUDE.md Quality) | 12 |
| 5.5 (Body Compression) | 13 |
| 6 (Hooks/Agents/Settings) | 14 |
| 7 (Cross-references) | 17 |
| 7a (Orphan Repurposing) | 18 |
| 8 (Report) | 24 |
| 9 (Post-Report Menu) | 25 |
Testing
Two layers, following Anthropic's develop-tests methodology (code-grading is the fastest, most reliable tier — so the bulk is code-graded, and LLM-grading is reserved for the judgment phases):
- Deterministic (code-graded, CI-safe, no API key). Synthetic
.claude/fixture trees undertests/fixtures/<case>/each plant one defect; the suite runsvalidate-skills.sh/scan-graph.shagainst them and asserts the exact[TAG]set. Aclean/fixture asserts zero findings — the false-positive guard.make test # bash tests/run.sh — anonymization + eval-schema gates, then 54 code-graded cases (177 assertions) bash tests/run.sh 02 # run one case / id-prefix (deterministic suite only)tests/run.shalso runs two release gates first: an anonymization check (no real scanned-project names in the publishedcommands/,tests/fixtures/,README.md— the real blocklist is gitignored, a placeholder ships) and eval-schema validation (validate-evals.shasserts every case matches the contract before an expensive run is wasted on a malformed one). - Behavioural (LLM-graded, opt-in, costs tokens). Runs the full
/claude-markdown-health-checkheadless against a fixture to exercise the judgment phases (weak descriptions, thin CLAUDE.md, autonomy-gate compliance) and the evidence-grounding gate — including paired guards (cases 36–41: a referenced guide and a live CLAUDE.md command must not be flagged, while a genuinely-orphaned guide and a missing-script command must still be), graded by an LLM rubric and scored by majority over N runs.make evals # needs the authenticated `claude` CLI HEALTH_CHECK_EVAL_RUNS=3 make evals # majority vote to smooth LLM noise
Cases live in commands/claude-markdown-health-check/evals/*.json (63 cases: 54 grader.method: code + 9 llm-rubric); fixtures in tests/fixtures/.
Tags are the stable machine contract, so the code-graded cases are immune to report-format changes. CI (.github/workflows/ci.yml) runs shellcheck + bash -n + the anonymization gate + eval-schema validation + the deterministic suite on every push; it does not run the token-spending LLM evals. Every real-world miss or false positive should become a new case.
Requirements
- Claude Code CLI
bash,awk,grep,find,date(defaults on macOS/Linux)jq— required for the history-aware and graph phases (plugin integrity, skill usage, hook reliability, etc.); strongly recommended for the existing skill-listing-budget check- Optional:
nprocfor parallel JSONL scanning (falls back to 4 workers if absent) - For development:
jqformake test; the authenticatedclaudeCLI formake evals;shellcheck(CI uses it at-S warning)
Layout
commands/
├── claude-markdown-health-check.md # the slash command (~440 lines, orchestrator)
├── claude-markdown-health-check/
│ ├── references/
│ │ ├── config-keys.md # Phase 1 config schema (markdown-health-check.json)
│ │ ├── skill-listing-budget.md # Phase 6 audit logic
│ │ ├── skill-usage-metrics.md # Phase 7
│ │ ├── skill-tool-contract.md # Phase 9
│ │ ├── frontmatter-schema.md # Phase 10
│ │ ├── agent-frontmatter.md # Phase 14 — subagent schema
│ │ ├── hook-safety.md # Phase 14 — hook script safety
│ │ ├── reference-graph.md # Phase 11
│ │ ├── claude-md-quality.md # Phase 12 rubric
│ │ ├── body-compression.md # Phase 13 logic
│ │ ├── permission-hygiene.md # Phase 15
│ │ ├── hook-reliability.md # Phase 16
│ │ ├── cross-session-patterns.md # Phase 19 + 22
│ │ ├── memory-hygiene.md # Phase 20
│ │ ├── plugin-integrity.md # Phase 2
│ │ ├── token-trend.md # Phase 23
│ │ ├── output-styles.md # Phase 26
│ │ ├── finding-verification.md # Pre-print evidence-grounding gate (judgment findings)
│ │ ├── report-format.md # Phase 24 report rendering — domain map + scorecard
│ │ └── post-report-menu.md # Phase 25 menu
│ └── evals/ # data-driven eval cases
│ ├── 01-clean-zero-findings.json … 65-ref-bare-sibling.json (54 code + 9 LLM)
│ └── README.md # eval schema + how to run
└── scripts/
├── validate-skills.sh # deterministic compliance validator (Phase 5)
├── scan-graph.sh # static graph scanner (Phases 2, 11, 20, 26)
├── scan-history.sh # session-log miner (Phases 7, 9, 15, 16, 19, 22, 23)
├── validate-evals.sh # eval-case schema/contract gate (CI)
├── run-evals-headless.sh # opt-in LLM-graded eval runner
└── run-evals.sh # manual eval runner
tests/ # deterministic test suite (dev-only, CI)
├── run.sh # entrypoint → anonymization + eval-schema + scanners
├── check-anonymization.sh # release gate: no real names in published artifacts
├── anonymization-blocklist.example.txt # placeholder blocklist (real one is gitignored)
├── lib.sh # assert helpers + tag extractors
├── test_scripts.sh # data-driven runner over evals/*.json
└── fixtures/<case>/.claude/… # synthetic trees, one planted defect each
.github/workflows/ci.yml # shellcheck + bash -n + tests/run.sh (anon + eval-schema + scanners)
All plain Markdown and shell — read, fork, extend.
License
MIT — see LICENSE.
Yorumlar (0)
Yorum birakmak icin giris yap.
Yorum birakSonuc bulunamadi