clybor-claude-tooling
Standardized Claude Code bootstrap — drop-in .claude/ tree with a 19-agent adversarial review team, Ralph Loop, dev-flow skills, and an init script. Agnostic, project-ready.
Standardized Claude Code bootstrap for new projects. Drop-in .claude/ tree + CLAUDE.md template + adversarial review team + ralph-loop skill validator.
Install / use
- Clone this repo:
  git clone https://github.com/shawnclybor/clybor-claude-tooling.git ~/gits/clybor-claude-tooling
- Create the target project dir (if it doesn't exist):
  mkdir -p ~/gits/my-new-project
- Run init:
  bash ~/gits/clybor-claude-tooling/scripts/init.sh ~/gits/my-new-project "My New Project"
- cd ~/gits/my-new-project and start Claude Code
The project-facing router template lives at templates/CLAUDE.md.template. scripts/init.sh copies it into the target project and substitutes {{PROJECT_NAME}}.
What you get
Lane 1 — Quality review team (3 agents)
| Agent | Model | Lens | Asks |
|---|---|---|---|
| adversarial-reviewer | Opus | Evidence and reasoning | Is this the right thing? |
| simplifier | Sonnet | KISS / YAGNI | Is this the simplest way? |
| chaos-engineer | Sonnet | Robustness and edge cases | What breaks this? |
The three lenses are non-overlapping. quality-review runs them in parallel and synthesizes findings.
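The fan-out and merge can be pictured with a minimal sketch. The three functions below are hypothetical stand-ins for the agents (the real ones are Claude subagents, not Python callables), and the merge step is an assumption about what "synthesizes" means at its simplest:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-ins for the three agents; each returns a list of findings.
def adversarial_reviewer(artifact: str) -> list[str]:
    return [f"adversarial: is '{artifact}' the right thing?"]

def simplifier(artifact: str) -> list[str]:
    return [f"simplifier: is '{artifact}' the simplest way?"]

def chaos_engineer(artifact: str) -> list[str]:
    return [f"chaos: what breaks '{artifact}'?"]

LENSES: list[Callable[[str], list[str]]] = [
    adversarial_reviewer,
    simplifier,
    chaos_engineer,
]

def quality_review(artifact: str) -> list[str]:
    """Run every lens concurrently, then merge findings in lens order."""
    with ThreadPoolExecutor(max_workers=len(LENSES)) as pool:
        # pool.map yields results in the order of LENSES, not completion order.
        reports = list(pool.map(lambda lens: lens(artifact), LENSES))
    return [finding for report in reports for finding in report]
```

Calling quality_review("docs/PRDs/rate-limiting-plan.md") returns one merged list with the adversarial findings first and the chaos findings last.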
Lane 2 — Code (4 agents)
| Agent | Model | Role |
|---|---|---|
| code-reviewer | Sonnet | Read-only review of a diff or file set |
| debugger | Sonnet | Reproduction-first bug diagnosis |
| code-analyzer | Sonnet | Cross-file logic-flow analysis |
| security-auditor | Opus | OWASP-style threat audit |
Lane 3 — Orchestration (8 agents)
| Agent | Model | Role |
|---|---|---|
| workflow-orchestrator | Sonnet | Sequences multi-stage workflows |
| multi-agent-coordinator | Sonnet | Tracks parallel agent state; handles partial failures |
| agent-organizer | Sonnet | Picks which agents fit a task |
| task-distributor | Sonnet | Splits work across parallel agents safely |
| context-manager | Sonnet | Manages shared context across handoffs |
| error-coordinator | Sonnet | Correlates failures to find shared root causes |
| knowledge-synthesizer | Sonnet | Combines multi-agent outputs |
| performance-monitor | Sonnet | Surfaces hotspots in agent runtime |
Lane 4 — Research (4 agents)
| Agent | Model | Role |
|---|---|---|
| research-analyst | Sonnet | Multi-source synthesis with cited claims |
| search-specialist | Haiku | Quick precision lookups |
| evidence-auditor | Sonnet | Verifies quotes and citations |
| metadata-fetcher | Haiku | Mechanical metadata lookups |
Skills — adversarial review
- quality-review: orchestrates the 3-agent team against a plan, PRD, file, or proposal
- five-whys: root-cause analysis when something breaks unexpectedly
Skills — ralph loops
- ralph-loop: canonical Stop-hook ralph. Single prompt + completion-promise. Use for autonomous walk-away iteration. /ralph-loop command.
- ralph-implement: task-driven sibling. Reads a task plan, supports parallel groups, structured escalation.
- skill-validator: ralph specialized for validating SKILL.md files.
Skills — development flow (run in sequence; each stage standalone)
| Stage | Skill |
|---|---|
| 1. PRD | prd-writer (uses templates/prd-template.md) |
| 2. Plan | task-plan (uses templates/plan-template.md) |
| 3. Validate plan | quality-review |
| 4. Implement | ralph-implement |
| 5. Verify | verify |
| 6. Evaluate impl | quality-review |
| 7. Iterate or close | (decision, no skill) |
Skills — other
- writing-quality: strip AI-isms from client-facing prose
- insight-crystallizer: captures valuable analyses into docs/insights/*.md so they survive past the chat session
- insight-promotion: promotes a crystallized insight into always-on governance
Slash commands
- /quality-review: full 3-agent review
- /adversarial, /simplify, /chaos: single-lens reviews
- /five-whys: debug protocol
- /ralph-loop: start a Stop-hook-driven autonomous ralph loop
Rules (auto-loaded)
- routing-protocol.md: 5-step Classify→Load→Think→Pre-flight→Validate ritual
- kiss-yagni.md: principles, Two Strikes, Blocker Protocol, Cascade Re-Scope
Hooks
- compact-recovery.sh: re-injects ROADMAP and recent commits after context-window compaction
- kiss-yagni-reminder.py: prints a one-line KISS / YAGNI checkpoint to stderr when writing code files (reminder, not block)
- ralph-stop.sh: Stop hook for the Ralph Loop. Reads .ralph-loop/state.json, scans transcript for completion-promise, blocks exit + re-feeds prompt or allows exit
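As a rough illustration of the "reminder, not block" behavior, here is a sketch of what a reminder hook of this kind could look like. It is not the contents of kiss-yagni-reminder.py: the suffix list, the message text, and the way the file path reaches the script are all assumptions.

```python
import sys
from pathlib import Path
from typing import Optional

# Assumed set of "code file" suffixes; the real hook may use a different list.
CODE_SUFFIXES = {".py", ".ts", ".js", ".go", ".rs", ".sh"}

def reminder_for(file_path: str) -> Optional[str]:
    """Return a one-line checkpoint for code files, None for everything else."""
    if Path(file_path).suffix.lower() in CODE_SUFFIXES:
        return "KISS / YAGNI checkpoint: is this the smallest change that solves the task?"
    return None

def main(file_path: str) -> int:
    message = reminder_for(file_path)
    if message:
        print(message, file=sys.stderr)
    # Always exit 0: the hook reminds, it never blocks the write.
    return 0
```

The point of the design is the constant zero exit code: the message goes to stderr and the write proceeds either way.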
User journeys
Five scenarios showing how the pieces compose.
1. New feature, full dev cycle
"I need to add rate limiting to the public API. Walk me through the right way."
Step 1 — Write the PRD. You invoke prd-writer. It asks for the problem statement, then walks you through writing success criteria that are binary — "P95 latency on /api/* stays under 200ms across 100 test requests" rather than "make it fast." If the original problem cites external claims (a vendor's rate-limit doc, a research paper), research-analyst and evidence-auditor run first so the PRD has verified citations. Output lands in docs/PRDs/rate-limiting.md.
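A binary criterion is one a script can answer with True or False. A minimal sketch of the latency example, assuming a nearest-rank P95 and hypothetical function names (the PRD template does not prescribe either):

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def latency_criterion_met(
    samples_ms: list[float],
    threshold_ms: float = 200.0,
    required_samples: int = 100,
) -> bool:
    """True only if enough requests were measured and P95 is under the threshold."""
    return len(samples_ms) >= required_samples and p95(samples_ms) < threshold_ms
```

With 95 requests at 100 ms and 5 at 500 ms the criterion passes; move one more request into the slow bucket and it fails. "Make it fast" has no equivalent function.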
Step 2 — Plan the work. task-plan reads the PRD and decomposes it into one-pass-completable tasks: schema/config first, then middleware, then tests, then docs. task-distributor analyzes the list and identifies which tasks can run safely in parallel (the tests for endpoint A and endpoint B can; the schema and the middleware that reads the schema cannot). The plan lands in docs/PRDs/rate-limiting-plan.md.
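One way to think about "safe to run in parallel" is a layered topological sort over task dependencies. The sketch below is an illustration of that idea, not task-distributor's actual method; the task names and the dependency map are made up for the rate-limiting example.

```python
def parallel_groups(deps: dict[str, set[str]]) -> list[list[str]]:
    """Split tasks into ordered groups; tasks within a group share no dependencies.

    deps maps each task to the set of tasks it must wait for.
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: set[str] = set()
    groups: list[list[str]] = []
    while remaining:
        # A task is ready once everything it needs has already finished.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        groups.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return groups
```

For a plan where the middleware needs the schema and both endpoint tests need the middleware, this yields the schema alone, then the middleware alone, then the two tests together, then the docs.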
Step 3 — Validate the plan. /quality-review spawns three agents in parallel via multi-agent-coordinator. simplifier asks "is this the simplest decomposition?" — maybe a config flag is overkill for the first version. adversarial-reviewer challenges the plan's assumptions — "the PRD says burst tolerance is needed; the plan doesn't address it." chaos-engineer asks "what breaks this?" — clock skew, distributed counter races, header spoofing. knowledge-synthesizer combines the three reports into one. Critical findings go back to Step 2. High findings get addressed in the plan or explicitly accepted with rationale.
Step 4 — Implement task-by-task. ralph-implement works through the plan. For each task: it implements the smallest change that addresses the task, runs the per-task check (unit test, type check, lint), and on green invokes code-reviewer to catch issues the check itself doesn't — type-safety gaps, observability holes, structural debt. If a task fails twice with the same error, the loop stops and calls debugger, which reproduces the failure and narrows the cause with evidence before patching. If debugger can't find a root cause, five-whys halts the task and surfaces it to you. Parallel task groups get spawned via multi-agent-coordinator; error-coordinator correlates failures across them when they happen.
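The escalation rule in this step can be sketched as a small retry loop. This assumes "fails twice with the same error" means two consecutive failures with an identical error signature; the function name and return values are hypothetical, not part of ralph-implement.

```python
from typing import Callable, Optional

def run_with_two_strikes(
    check: Callable[[], Optional[str]],
    max_attempts: int = 5,
) -> str:
    """Run a per-task check until green, a repeated error, or the attempt budget.

    check() returns None on green, or an error signature string on failure.
    Returns "green", "escalate" (same error twice in a row), or "exhausted".
    """
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        error = check()
        if error is None:
            return "green"
        if error == last_error:
            # Second strike with the same signal: stop retrying, hand off to debugger.
            return "escalate"
        last_error = error
    return "exhausted"
```

A different error on the second attempt resets the comparison, so only a repeated signal triggers the handoff.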
Step 5 — Verify against PRD criteria. verify runs every PRD success criterion through its corresponding check command. The latency criterion runs the perf test. The "API returns 429 on threshold breach" criterion runs the integration test. The security criterion fires security-auditor against the rate-limit middleware. If a PRD criterion has no check, the verdict is PARTIAL — you either write the check or strike the criterion before continuing.
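The PARTIAL rule is easy to state in code. In this sketch only the PARTIAL verdict comes from the text above; the PASS and FAIL labels, the criteria mapping, and the function names are assumptions for illustration.

```python
import subprocess
from typing import Callable, Optional

def run_check(cmd: list[str]) -> bool:
    """Run a check command; a zero exit code counts as a pass."""
    return subprocess.run(cmd).returncode == 0

def verify(
    criteria: dict[str, Optional[list[str]]],
    run: Callable[[list[str]], bool] = run_check,
) -> str:
    """criteria maps each PRD criterion to its check command, or None if it has none."""
    # A criterion without a check cannot be verified, so the verdict is PARTIAL
    # before any command runs.
    if any(cmd is None for cmd in criteria.values()):
        return "PARTIAL"
    for cmd in criteria.values():
        if not run(cmd):
            return "FAIL"
    return "PASS"
```

Checking for missing commands first means an unverifiable criterion is never hidden behind a green test run.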
Step 6 — Evaluate the implementation. /quality-review runs again, this time against the built code rather than the plan. Same three agents, different question: did we build the right thing well? Critical findings send a narrower task list back into ralph-implement. Clean → close with a short note in docs/PRDs/rate-limiting-closure.md: what shipped, what was deferred, follow-ups.
2. Autonomous overnight task
"Build the user-profile CRUD endpoints with tests. I'll check it in the morning."
Step 1 — Frame the prompt. The Ralph loop only knows when to stop if you tell it. Write the prompt with a clear completion criterion: every endpoint implemented, every test passing, then emit the exact string COMPLETE. Include what to do if stuck — "after 25 iterations document what's blocking and emit BLOCKED."
Step 2 — Launch the loop.
/ralph-loop "Implement the four CRUD endpoints in routes/profile/.
Write integration tests for each. Run the suite each iteration
and fix what fails. Output <promise>COMPLETE</promise> when all
tests pass." --completion-promise "COMPLETE" --max-iterations 30
This writes .ralph-loop/state.json and starts Claude working on the task.
Step 3 — The loop runs itself. Each time Claude tries to exit the session, the ralph-stop.sh Stop hook fires. It reads the state file and scans the last 200 lines of the transcript for the exact string COMPLETE. If the string is not found, the hook increments the iteration counter, blocks the exit, and re-feeds the original prompt. Claude wakes up to the same prompt, sees its previous work in files, and continues. Iteration after iteration.
Step 4 — The loop exits. Three paths: Claude emits COMPLETE (hook clears state, allows exit), the counter hits 30 (hook clears state, allows exit with a halt message), or you delete .ralph-loop/state.json mid-run to cancel. No external bash loop. No tabs to monitor.
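The decision the hook makes on each exit attempt can be sketched as a pure function. This is an illustration of the logic described above, not a port of ralph-stop.sh: the state field names (prompt, completion_promise, iteration, max_iterations) and the returned dictionary shape are assumptions.

```python
from typing import Optional

def decide(state: Optional[dict], transcript_lines: list[str], tail: int = 200) -> dict:
    """Return the hook's decision plus the state to persist (None means clear it)."""
    # Cancel path: the state file was deleted mid-run, so nothing is enforced.
    if state is None:
        return {"action": "allow", "note": "no active loop", "state": None}
    # Completion path: the promise string appears in the transcript tail.
    if any(state["completion_promise"] in line for line in transcript_lines[-tail:]):
        return {"action": "allow", "note": "completion promise found", "state": None}
    # Budget path: the iteration counter has reached the limit.
    iteration = state["iteration"] + 1
    if iteration >= state["max_iterations"]:
        return {"action": "allow", "note": "halted at max iterations", "state": None}
    # Otherwise block the exit and re-feed the original prompt.
    return {
        "action": "block",
        "note": state["prompt"],
        "state": {**state, "iteration": iteration},
    }
```

Keeping the decision separate from file and transcript I/O makes each of the three exit paths testable without a running session.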
Step 5 — Morning checkout. You open the repo, read the closing transcript, scan the git history (one commit per substantive iteration), run the test suite to confirm it's green. If something went sideways, the transcript is the audit trail.
3. Production bug investigation
"The dedup job is silently skipping records. I've retried it twice with the same failure."
Step 1 — Two strikes triggers the protocol. Same operation failing twice with the same signal is the Two Strikes rule from kiss-yagni.md. You stop retrying and invoke /five-whys. It refuses to advance until you write a precise problem statement: "the dedup job processed 1,000 input rows, wrote 982 output rows, logged no errors, and there's no record of which 18 were skipped."
Step 2 — Walk the why-chain with Sequential Thinking. Each "why" is one thought. Why are 18 rows missing? → Because the dedup predicate returned the same hash for them. Why? → Because the hash uses a field that's nullable. Why? → Because the schema migration didn't reject null on that field. Why? → Because the validation step was disabled for the migration window and never re-enabled. Root cause: validation flag left disabled.
Step 3 — Reproduce before patching. debugger takes the root-cause hypothesis and reproduces it: synthesize 1,000 input rows with the null field on row 17, run the job, and confirm that row 17 (and 17 more like it) silently drop. Cited evidence: the specific line, dedup/hash.py:42, where the null hashes to the same value as an empty string, and the migration log showing the validation flag never flipped back.
Step 4 — Trace the surface. code-analyzer maps the cross-file logic flow to confirm no other consumer relies on the disabled flag. It surfaces one more silent path: a separate report job uses the same field for filtering. So the fix needs to cover both consumers.
Step 5 — Fix and promote. The minimal fix is two lines: re-enable the validation flag and guard the hash function against null. But the lesson is bigger: validation flags should never be disabled without an automated re-enable check. insight-promotion runs that proposed rule through the 3-agent quality team, then adds it to .claude/rules/ so the next session that touches a migration sees the rule before disabling anything.
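The collision in this scenario, and the shape of the guard, can be shown in a few lines. Both functions are hypothetical reconstructions for illustration; the scenario does not show the contents of dedup/hash.py, and SHA-256 is an arbitrary choice.

```python
import hashlib
from typing import Optional

def naive_key(field: Optional[str]) -> str:
    # Bug pattern: None is coerced to "", so a null and an empty string collide.
    return hashlib.sha256((field or "").encode("utf-8")).hexdigest()

def guarded_key(field: Optional[str]) -> str:
    # Guard: refuse to hash a null instead of silently folding it into "".
    if field is None:
        raise ValueError("dedup field is null; reject the row instead of hashing it")
    return hashlib.sha256(field.encode("utf-8")).hexdigest()
```

With the naive version, rows with a null field are indistinguishable from rows with an empty one and get deduplicated away without a log line. The guarded version turns the silent drop into a loud failure.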
4. Pre-merge audit on a sensitive change
"This PR touches the auth middleware. I don't want to ship without a hard look."
Step 1 — Code-level review first. You point code-reviewer at the diff. It walks the checklist — correctness, type-safety, error handling, resource management, observability. Specific findings come back numbered, with file:line citations: "line 47 catches the JWT decode error but never logs the cause; failure mode is silent denial-of-access," "line 89 reads req.user before the auth check completes."
Step 2 — Security review on the same diff. security-auditor (Opus tier) walks a different axis: input validation, injection risk, secret handling, auth flow integrity, dependency surface. It models the attack scenario for each finding — "an attacker submits a forged JWT with an empty signature; the verify call short-circuits to true; full account takeover."
Step 3 — Design review on the bigger choice. Even if the diff is clean, the approach might be wrong. /quality-review runs the 3-agent team on the design itself. simplifier asks whether a built-in library would replace the custom middleware. adversarial-reviewer asks whether the chosen JWT lifetime contradicts the threat model the PRD assumed. chaos-engineer asks what happens under clock drift, key rotation, replay attacks. knowledge-synthesizer combines the three.
Step 4 — Merge gate. The merge rule is "no Critical findings; High findings either addressed in this PR or explicitly accepted with rationale logged in the PR description." Critical-level security findings always block. The findings table from each agent is the documentation of why this merged or didn't.
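The merge rule reduces to a short predicate. The Finding fields below are hypothetical, and the sketch assumes any Critical finding blocks regardless of whether someone marked it addressed:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    severity: str                 # "Critical", "High", or anything lower
    addressed: bool = False       # fixed in this PR
    accepted_rationale: str = ""  # logged in the PR description when accepted

def may_merge(findings: list[Finding]) -> bool:
    """Apply the gate: no Critical findings; every High addressed or accepted."""
    for finding in findings:
        if finding.severity == "Critical":
            return False
        if finding.severity == "High" and not (
            finding.addressed or finding.accepted_rationale.strip()
        ):
            return False
    return True
```

A High finding with a blank rationale still blocks, which matches the requirement that acceptance be explicit and logged.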
5. Research-backed decision doc
"Should we switch from polling to webhooks for the third-party integration? Write me the decision."
Step 1 — Cast the net. research-analyst reads the vendor's documentation, scans community threads about reliability, pulls public incident reports about webhook delivery guarantees, and synthesizes it into a structured claims table — every claim with a citation, every disagreement between sources preserved rather than smoothed over.
Step 2 — Precision lookups for the specific quirks. search-specialist handles the targeted questions: what's the exact retry policy in the vendor's webhook API? What's the maximum payload size? Is signature verification mandatory? Each lookup returns the exact quote plus the URL plus a retrieval date.
Step 3 — Audit the citations. Before any of this lands in a decision doc, evidence-auditor walks the claims table and verifies each citation actually says what it's claimed to say. Fabricated URLs, paraphrased "quotes," and out-of-context excerpts get flagged. Anything FLAGGED gets fixed or struck before the doc moves forward.
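The simplest mechanical part of that audit is checking that a quoted passage actually appears in the retrieved source. The sketch below covers only that check and is not how evidence-auditor works internally; the VERIFIED label and the whitespace and case normalization are assumptions, while FLAGGED is the term used above.

```python
import re

def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps do not cause false flags."""
    return re.sub(r"\s+", " ", text).strip().lower()

def audit_quote(quote: str, source_text: str) -> str:
    """Return "VERIFIED" if the quote appears verbatim in the source, else "FLAGGED"."""
    needle = normalize(quote)
    if not needle:
        # An empty quote proves nothing, so it is flagged rather than passed.
        return "FLAGGED"
    return "VERIFIED" if needle in normalize(source_text) else "FLAGGED"
```

A paraphrase presented as a quote fails this check by construction. Out-of-context excerpts pass it, so they still need a reader.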
Step 4 — Draft the decision. With the cited claims locked in, you draft the recommendation. writing-quality audits the draft for AI-isms — drops "leverage," "robust," "comprehensive"; flattens the rule-of-three patterns; rewrites the formulaic conclusion. The output sounds like a competent human wrote it.
Step 5 — Make it persistent. A decision that lives only in chat history will be re-litigated in three months when someone forgets why it was made. insight-crystallizer files the decision to docs/insights/<date>-webhooks-vs-polling.md with the rationale, the sources, the trade-offs that were considered, and the conditions under which the conclusion would change. Future sessions searching this directory find the answer instead of re-running the research.
Init a new project
bash /path/to/clybor-claude-tooling/scripts/init.sh /path/to/new-project "My Project Name"
That copies the .claude/ tree, fills the {{PROJECT_NAME}} placeholder in CLAUDE.md, makes hooks executable, and prints next steps.
If you re-run init after updating the template, it overwrites .claude/ files but leaves a project's own CLAUDE.md in place (creates CLAUDE.md.new instead so you can diff).
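The placeholder fill and the re-run behavior amount to two small decisions. This is a Python sketch of the behavior described above, not a translation of scripts/init.sh; the function names are hypothetical.

```python
from pathlib import Path

def render_claude_md(template_text: str, project_name: str) -> str:
    """Fill the {{PROJECT_NAME}} placeholder in the template text."""
    return template_text.replace("{{PROJECT_NAME}}", project_name)

def claude_md_target(project_dir: Path, already_present: bool) -> Path:
    """Pick the output path: never overwrite a project's own CLAUDE.md."""
    name = "CLAUDE.md.new" if already_present else "CLAUDE.md"
    return project_dir / name

def install_claude_md(template: Path, project_dir: Path, project_name: str) -> Path:
    """Render the template and write it to the path chosen above."""
    rendered = render_claude_md(template.read_text(encoding="utf-8"), project_name)
    target = claude_md_target(project_dir, (project_dir / "CLAUDE.md").exists())
    target.write_text(rendered, encoding="utf-8")
    return target
```

After a re-run, diffing CLAUDE.md against CLAUDE.md.new shows what changed in the template without touching local edits.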
Not in the template
- Domain governance (database, messaging, storage, calendar) — lives per project
- Document generators (docx, pptx, xlsx) — bundle per project
- Source-handling, research-integrity rules — bundle per project that needs them
Add anything project-specific to that project's own .claude/, not here.
Updating the template
This is git-controlled. Make changes here, commit, then re-run init.sh against any project that should pick up the change.
For long-running projects that have customized rules, prefer copying individual files (cp .claude/agents/quality/chaos-engineer.md /target/.claude/agents/quality/) so you don't clobber project-specific edits.
Layout
clybor-claude-tooling/
├── README.md
├── .claude/
│ ├── agents/
│ │ ├── README.md
│ │ ├── quality/ # adversarial-reviewer, simplifier, chaos-engineer
│ │ ├── code/ # code-reviewer, debugger, code-analyzer, security-auditor
│ │ ├── orchestration/ # 8 orchestration agents
│ │ └── research/ # research-analyst, search-specialist, evidence-auditor, metadata-fetcher
│ ├── skills/
│ │ ├── README.md
│ │ ├── quality-review/SKILL.md
│ │ ├── five-whys/SKILL.md
│ │ ├── writing-quality/SKILL.md
│ │ ├── ralph-loop/SKILL.md
│ │ ├── ralph-implement/SKILL.md
│ │ ├── skill-validator/SKILL.md
│ │ ├── prd-writer/SKILL.md
│ │ ├── task-plan/SKILL.md
│ │ ├── verify/SKILL.md
│ │ ├── insight-crystallizer/SKILL.md
│ │ └── insight-promotion/SKILL.md
│ ├── commands/
│ │ ├── README.md
│ │ ├── quality-review.md
│ │ ├── adversarial.md
│ │ ├── simplify.md
│ │ ├── chaos.md
│ │ ├── five-whys.md
│ │ └── ralph-loop.md
│ ├── hooks/
│ │ ├── README.md
│ │ ├── compact-recovery.sh
│ │ ├── kiss-yagni-reminder.py
│ │ └── ralph-stop.sh
│ ├── rules/
│ │ ├── README.md
│ │ ├── routing-protocol.md
│ │ └── kiss-yagni.md
│ └── settings.json.template
├── scripts/
│ └── init.sh
└── templates/
    ├── CLAUDE.md.template
    ├── prd-template.md
    └── plan-template.md
Catalog (assets/ + catalog.json)
Sanitized, reusable tooling harvested from real projects — indexed in catalog.json,
PII-gated by scripts/verify-clean.py (pre-commit). To set up a NEW project from the
catalog, invoke the project-bootstrap skill (assets/skills/project-bootstrap/) — it
profiles the project, proposes an install set, copies assets, fills adaptation tokens,
and writes a TOOLING.md manifest. Rebuilds of the catalog itself: scrub with scripts/scrub.py, verify with scripts/verify-clean.py (token mode enforces that
every {{TOKEN}} is declared in catalog.json adaptation_points).
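The token-mode check can be sketched as a set difference. This is an illustration of the idea rather than the logic of scripts/verify-clean.py: the token pattern is an assumption, and the declared names are passed in as a plain set because the shape of adaptation_points in catalog.json is not described here.

```python
import re

# Assumed token syntax: upper-case letters, digits, and underscores in double braces.
TOKEN_RE = re.compile(r"\{\{([A-Z0-9_]+)\}\}")

def undeclared_tokens(asset_text: str, declared: set[str]) -> set[str]:
    """Return every {{TOKEN}} used in the asset that the catalog does not declare."""
    return set(TOKEN_RE.findall(asset_text)) - declared
```

An empty result means the asset is safe to ship; anything else names the tokens a bootstrap run would leave unfilled.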