Security Audit — Warn

Health — Warn
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push today
  • Low visibility — Only 5 GitHub stars
Code — Pass
  • Code scan — Scanned 3 files during light audit, no dangerous patterns found
Permissions — Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool provides a structured, multi-phase workflow engine for Claude Code. When generating projects, it holds AI models to strict software engineering discipline: automated testing, closed-loop bug fixing, and crash recovery.

Security Assessment
The overall risk is rated Low. A light code scan of its 3 shell scripts found no dangerous patterns and no hardcoded secrets, and it requests no dangerous system permissions. However, because the repository is written entirely in shell, it acts as an execution wrapper for AI-generated commands. While the tool itself appears safe, standard precautions apply whenever any AI agent is allowed to execute code locally.

Quality Assessment
The project has a clean baseline for trust. It is actively maintained, with repository updates pushed as recently as today, and operates under the permissive and standard MIT license. The documentation is highly detailed and translated into multiple languages. The only significant drawback is its extremely low community visibility. It currently has only 5 GitHub stars, meaning it has not yet been widely peer-reviewed or battle-tested by a large user base.

Verdict
Use with caution — the underlying code is safe and licensed, but its lack of community adoption means it should be evaluated in a controlled environment before relying on it for production work.
SUMMARY

Forge — a solid harness-engineered workflow for Claude Code

README.md

English | 中文 | 日本語 | Español

Forge

Forge — a solid harness-engineered workflow for Claude Code. Quality enforced by structure, not prompt discipline.


┌─────────────────────────────────────────────────────────┐
│  $ /forge:forge Build a rate-limited API gateway with    │
│            JWT auth, Redis token revocation, request     │
│            dedup, and Prometheus metrics                 │
│                                                         │
│  🟦 Planning...      ✅ Plan complete                   │
│  🟩 Wave 1/3 Dev...  ✅ 12 unit tests passed            │
│  🟨 Wave 1/3 Test... ✅ PASS (unit: 12/12, int: 3/4)   │
│  🟩 Wave 2/3 Dev...  ✅ 8 unit tests passed             │
│  🟨 Wave 2/3 Test... ❌ 2 bugs → 🟩 Fix → ✅ PASS      │
│  🟩 Wave 3/3 Dev...  ✅ 6 unit tests passed             │
│  🟨 Wave 3/3 Test... ✅ PASS                            │
│  🟨 Integration...   ✅ Full integration PASS            │
│  🟪 Learning...      4 new lessons → knowledge.md       │
│                                                         │
│  Forge completed! 26/26 unit tests passed.              │
└─────────────────────────────────────────────────────────┘

Vanilla Claude Code Hurts — Forge Fixes It Structurally

You've been there:

  • Code without tests — or tests that exist only on paper. If humans skip tests, models will too
  • Context drift — halfway through a task, it forgets what it started with
  • Fix A, break B — no closed-loop verification, bugs cascade
  • Same pitfalls, every run — zero learning from past mistakes
  • Crash = start over — no checkpoint, no recovery, all work gone

Forge doesn't ask the model to "be careful." It welds engineering discipline into the workflow:

Mechanism — How It Works

  • Closed-Loop Dev→Test→Fix — Dev must pass Test. Fails? Fix. Fails again? Fix again, up to 3 rounds. Verdict is PASS/FAIL — no "looks fine"
  • Wave-Based Scaling — Large requirements auto-split into Waves. Each wave gets a fresh Dev+Test pair. Cross-wave handoff via handoff files
  • Experience Accumulation — Learner extracts lessons into knowledge.md. Planner references them next time. Smarter with every run
  • Crash Recovery — Every step writes a checkpoint to state.json. Re-run the same command to resume from where you left off. Zero lines lost
  • Think Before Code — Planner spots ambiguities and asks you first. No more coding in the wrong direction for an hour
  • Context Isolation — Coordinator tracks paths and status only, never reads intermediate content. Context stays lean
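The closed-loop mechanism above can be sketched in shell. This is an illustrative sketch only, not Forge's actual implementation: run_dev and run_test are hypothetical stand-ins for the Dev and Test agents, with run_test hard-coded to pass on the second round so the fix path is visible.

```shell
# Sketch of the Dev -> Test -> Fix loop (hypothetical, not Forge's real code).
run_dev()  { echo "dev: implementing wave $1"; }
run_test() { [ "$2" -ge 2 ] && return 0 || return 1; }  # pretend round 2 passes

wave=1
run_dev "$wave"
verdict=FAIL
for round in 1 2 3; do                 # at most 3 fix rounds
  if run_test "$wave" "$round"; then
    verdict=PASS
    break
  fi
  echo "round $round: FAIL -> Fix (same agent, context preserved)"
done
echo "wave $wave verdict: $verdict"    # PASS/FAIL only, no "looks fine"
```

The key property is that the verdict variable can only ever hold PASS or FAIL; there is no third state for the model to hide behind.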

Architecture

User ──"/forge requirement"──▶ Coordinator
                                  │
                       ┌──────────┤
                       ▼          │
                  🟦 Planner     │
                  plan + waves   │
                       │          │
                       ▼          │
               ┌─── Wave Loop ──────────────────┐
               │                                │
               │  🟩 Dev W1 ──▶ 🟨 Test W1     │
               │       ▲              │         │
               │       │           FAIL?        │
               │       └── Fix ──────┘          │
               │       (same agent,             │
               │        context preserved)      │
               │              │                 │
               │            PASS ──▶ next wave  │
               │                                │
               └────────────────────────────────┘
                                  │
                                  ▼
                       🟨 Full Integration Test
                                  │
                                  ▼
                       🟪 Learner ──▶ 📚 knowledge.md

Agent (Model) — Role

  • Coordinator — Dispatches agents, tracks paths and status, never reads intermediate content
  • Planner (sonnet) — Decomposes requirements, defines acceptance criteria, surfaces ambiguities
  • Dev (sonnet) — Implements features, writes unit tests, fixes bugs — only what's in the plan
  • Test (haiku) — Unit tests must pass, integration tests best-effort, bug reports precise
  • Learner (haiku) — Extracts lessons, deduplicates, max 3 per run

Design Philosophy: Harness Engineering

Quality through structural constraints, not model self-discipline:

  • Structure > Willpower — Rules live in the workflow, not in prompts. If a rule can't be structurally enforced, redesign the workflow
  • Closed Loop > Open Loop — Dev→Test→Fix is mandatory. Verdicts are PASS/FAIL only, no "should be fine"
  • Isolation > Bloat — Coordinator tracks paths and status, never content. Each wave gets independent context
  • Recoverable > Retryable — State checkpoint at every step. Resume from breakpoint after crash, don't start over
  • Only What's Asked — Every changed line traces back to the plan or a bug fix. No extras, no speculative enhancements
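The "Recoverable > Retryable" rule can be made concrete with a tiny sketch. The flat state.json schema and the demo-state.json filename below are assumptions for illustration; Forge's actual checkpoint format is not documented here.

```shell
# Illustrative checkpoint writer (hypothetical schema, not Forge's real format).
checkpoint() {  # usage: checkpoint <phase> <wave>
  mkdir -p .forge
  printf '{"phase":"%s","wave":%d}\n' "$1" "$2" > .forge/demo-state.json
}

checkpoint dev 2
cat .forge/demo-state.json
```

Because every step overwrites the checkpoint atomically with its own phase and wave, a crashed run leaves behind exactly the information needed to pick up where it stopped.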

5-Minute Quick Start

Prerequisite: Claude Code CLI installed.

Plugin install (recommended):

/plugin marketplace add zjio26/forge
/plugin install forge

Offline install:

git clone https://github.com/zjio26/forge.git && cd forge && bash install.sh

Fire:

/forge:forge Build a rate-limited API gateway with JWT auth, Redis token revocation, request dedup, and Prometheus metrics

Plugin mode uses /forge:forge, manual install uses /forge. That's the only difference.


Usage Examples

Build a non-trivial feature:

You: /forge:forge Implement a full user registration→login→order→payment flow with JWT auth, inventory locking, and timeout auto-release

Forge: Planner decomposes into 5 subtasks, 3 waves → Wave 1 builds data models and auth → Wave 2 handles orders and inventory → Wave 3 handles payments and timeouts → full integration test → Learner extracts 3 lessons

Refactor with a safety net:

You: /forge:forge Refactor the database layer to use connection pooling, compatible with all existing callers

Forge: Planner identifies affected modules → Dev makes surgical changes → Test runs full unit + integration suite → regression caught by auto-Fix → zero manual intervention

Crashed? Resume:

You: /forge:forge (just re-run the same command)

Forge: Reads state.json → resumes from last checkpoint → not a single line lost
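A resume check could look roughly like the sketch below. The state.json schema shown is assumed purely for illustration and may not match Forge's actual format; the extraction uses sed so the sketch has no dependencies beyond a POSIX shell.

```shell
# Illustrative resume check (hypothetical state.json schema, not Forge's real format).
mkdir -p .forge
echo '{"phase":"test","wave":2}' > .forge/demo-state.json

if [ -f .forge/demo-state.json ]; then
  # crude field extraction without jq; good enough for this flat sketch
  wave=$(sed -n 's/.*"wave":\([0-9]*\).*/\1/p' .forge/demo-state.json)
  phase=$(sed -n 's/.*"phase":"\([a-z]*\)".*/\1/p' .forge/demo-state.json)
  echo "resuming at phase=$phase wave=$wave"
else
  echo "no checkpoint, starting fresh"
fi
```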


Project Structure

forge/
├── agents/                  # Specialized agent definitions
│   ├── planner.md           # Requirement decomposition — sonnet, defines acceptance criteria
│   ├── dev.md               # Implementation + unit tests + bug fixes — sonnet, only planned work
│   ├── test.md              # Test verification — haiku, unit test fail = FAIL
│   └── learner.md           # Experience extraction — haiku, dedup, max 5 per run
├── skills/forge/
│   ├── SKILL.md             # Coordinator orchestrator — tracks paths and status only
│   └── knowledge.md         # Experience knowledge base — auto-updated by Learner
├── install.sh               # Offline installer
└── CLAUDE.md                # Project instructions

Runtime artifacts (in target project's .forge/ directory, gitignored):

.forge/
├── {slug}-plan.md              # Development plan
├── {slug}-waves.json           # Wave grouping
├── {slug}-dev-W{n}.md          # Wave dev record
├── {slug}-test-W{n}.md         # Wave test report
├── {slug}-handoff-W{n}.md      # Wave handoff file
├── {slug}-test-integration.md  # Full integration test report
├── {slug}-state.json           # State checkpoint (crash recovery)
└── {slug}-metrics.json         # Runtime metrics
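Since .forge/ lives inside the target project, it should stay out of version control. If your project does not already ignore it, an idempotent one-liner (an assumption about your setup, adjust as needed) keeps it that way:

```shell
# Append .forge/ to .gitignore only if an exact-match line is not already present.
grep -qxF '.forge/' .gitignore 2>/dev/null || echo '.forge/' >> .gitignore
```

Running the line repeatedly never adds duplicate entries, because -x matches whole lines and -F disables pattern interpretation.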

Contributing & License

PRs welcome. Agent definitions and orchestration logic are the core — read CLAUDE.md's design principles before modifying.

  • Modifying agent definitions: each file must be self-contained, path references use .forge/
  • Modifying SKILL.md: Coordinator tracks paths and status only, never reads content
  • Modifying install.sh: keep source-to-target path mappings in sync

MIT License
