🔨 Forge

From raw idea to shipped MVP — with a team of specialist AI agents inside Claude Code.


Describe what you want. Forge's orchestrator decomposes it, routes each subtask to the right specialist (Product, Designer, Architect, Analyst, Security, iOS, Frontend, Backend, ML/CV, DevOps, QA, Docs), reviews every output, and hands you a working MVP.

Quickstart · How it works · The agents · Examples · FAQ


✨ Why Forge

  • 🧠 One generalist prompt ≠ a team. Forge gives you twelve deep specialists instead of one shallow everything-agent.
  • 🗂️ Shared memory that survives handoffs. Every agent reads project_context/ first — nothing gets lost between steps.
  • 🛡️ QA is mandatory, not optional. No output ships until reviewed against your project's explicit conventions.
  • ⌨️ The loop is one keystroke away. /forge-status, /forge-task, /forge-qa, /forge-ship, /forge-retro — the whole workflow ships as slash commands.
  • 🔒 Guardrails checked in. Shared .claude/settings.json denies agents access to .env, keys, and secrets; a session hook auto-loads task status so no session starts blind.
  • 🔌 Zero install. No framework, no runtime — just Markdown prompts and a folder convention. Drop it into any repo.
  • 🧩 Fork-friendly. Every agent is one file. Swap, tweak, or add specialists in minutes. AGENTS.md makes it portable to Cursor, Codex & friends.

🧭 How it works

The big picture

flowchart LR
    You([💡 You: an idea])
    You --> O{{🎛️ Orchestrator}}

    O --> P[📋 Product]
    P -->|MVP spec| O

    O --> DS[🎨 Designer]
    DS -->|Design system| O

    O --> A[🏛️ Architect]
    A -->|Stack + interfaces| O

    O --> AN[🧪 Analyst]
    AN -->|Event schema| O

    O --> SEC[🔒 Security]
    SEC -->|Threat model| O

    O --> iOS[📱 iOS]
    O --> FE[🌐 Frontend]
    O --> BE[⚙️ Backend]
    O --> ML[🧠 ML/CV]

    iOS --> QA
    FE --> QA
    BE --> QA
    ML --> QA

    QA{{🛡️ QA}} -->|pass| O
    QA -->|fail + feedback| O

    O --> DO[🚀 DevOps]
    O --> D[📖 Docs]
    DO --> Done([✅ Shipped MVP])
    D --> Done

    classDef phase fill:#1a1a1a,stroke:#555,color:#fff
    classDef agent fill:#2d4a2b,stroke:#4caf50,color:#fff
    classDef review fill:#5a3a1a,stroke:#ff9800,color:#fff
    class O,QA,SEC review
    class P,DS,A,AN,iOS,FE,BE,ML,DO,D agent

Why it works: shared context

Every agent reads the same project_context/ files before it starts. That's why information survives handoffs.

flowchart TB
    subgraph ctx["🗂️ project_context/ — shared memory"]
        PRD[PRODUCT.md<br/>what & for whom]
        DSG[DESIGN.md<br/>tokens & flows]
        ARCH[ARCHITECTURE.md<br/>stack & structure]
        CONV[CONVENTIONS.md<br/>style & patterns]
        INT[INTERFACES.md<br/>module contracts]
        ANA[ANALYTICS.md<br/>events & KPIs]
        SECM[SECURITY.md<br/>threat model]
        ERR[ERRORS_LOG.md<br/>past failures]
        PRG[PROGRESS.md<br/>task status]
    end

    subgraph agents["🤖 Specialist agents"]
        direction LR
        P[Product]
        DS[Designer]
        A[Architect]
        AN[Analyst]
        SEC[Security]
        SPEC[iOS • Frontend<br/>Backend • ML/CV]
        Q[QA]
        DOC[Docs]
    end

    P -.writes.-> PRD
    DS -.writes.-> DSG
    A -.writes.-> ARCH & CONV & INT
    AN -.writes.-> ANA
    SEC -.writes.-> SECM
    Q -.writes.-> ERR
    SPEC -.reads.-> ctx
    DOC -.reads.-> ctx
    Q -.reads.-> ctx

    classDef ctxFile fill:#0d2b4a,stroke:#2196f3,color:#fff
    classDef agentNode fill:#2d4a2b,stroke:#4caf50,color:#fff
    class PRD,DSG,ARCH,CONV,INT,ANA,SECM,ERR,PRG ctxFile
    class P,DS,A,AN,SEC,SPEC,Q,DOC agentNode
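
Since every agent depends on these nine files, it can help to check them before dispatching anything. The sketch below is not part of Forge: `check_context` is a hypothetical helper, and "missing or empty" is only a rough proxy for "not filled in yet". The file names come from the diagram above.

```shell
#!/usr/bin/env bash
# Sketch: report which shared-memory files are missing or empty.
# File names come from the diagram; the check itself is hypothetical.

context_files="PRODUCT.md DESIGN.md ARCHITECTURE.md CONVENTIONS.md INTERFACES.md ANALYTICS.md SECURITY.md ERRORS_LOG.md PROGRESS.md"

# Prints one line per missing/empty file; returns non-zero if any were found.
check_context() {
  local dir="$1" missing=0 f
  for f in $context_files; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Demo in a throwaway directory so no real project is touched.
demo="$(mktemp -d)"
mkdir -p "$demo/project_context"
for f in $context_files; do
  echo "# placeholder" > "$demo/project_context/$f"
done
: > "$demo/project_context/CONVENTIONS.md"   # simulate an unfilled file

check_context "$demo/project_context" || echo "fill these in before dispatching"
```

In a real project you would point `check_context` at `./project_context` instead of the demo directory.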

What happens to a single task

sequenceDiagram
    participant U as 👤 You
    participant O as 🎛️ Orchestrator
    participant S as 🤖 Specialist
    participant Q as 🛡️ QA
    participant E as 📝 ERRORS_LOG

    U->>O: "Build me X"
    O->>O: Decompose into atomic subtasks
    loop for each subtask
        O->>S: TASK + context files
        S->>O: Result + self-assessment
        O->>Q: Review against CONVENTIONS.md
        alt QA passes
            Q-->>O: ✅ PASS
        else QA fails (up to 3×)
            Q-->>O: ❌ FAIL + specific feedback
            Q->>E: Log failure + fix
            O->>S: Retry with feedback
        end
    end
    O->>U: Delivered MVP + summary
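
The control flow in the diagram can be sketched as a plain loop. Forge drives this through prompts, not a script, so treat the following as an illustration only: `dispatch_specialist` and `run_qa` are hypothetical stubs, and the limit is read as three failed reviews, which is one interpretation of "up to 3×".

```shell
#!/usr/bin/env bash
# Illustration only: Forge runs this loop via prompts, not a script.
# dispatch_specialist and run_qa are hypothetical stand-ins.

MAX_QA_FAILS=3
ERRORS_LOG="$(mktemp)"   # stands in for project_context/ERRORS_LOG.md

dispatch_specialist() {  # args: task, feedback from the previous review
  echo "result for '$1' (feedback: ${2:-none})"
}

qa_calls=0
run_qa() {               # stub: fail the first two reviews, then pass
  qa_calls=$((qa_calls + 1))
  if [ "$qa_calls" -lt 3 ]; then
    qa_feedback="missing accessibility labels (review $qa_calls)"
    return 1
  fi
  return 0
}

run_subtask() {
  local task="$1" attempt=1 feedback="" result
  while [ "$attempt" -le "$MAX_QA_FAILS" ]; do
    result="$(dispatch_specialist "$task" "$feedback")"
    if run_qa "$result"; then
      echo "PASS after $attempt attempt(s): $task"
      return 0
    fi
    # QA failed: log it, then retry with the specific feedback.
    feedback="$qa_feedback"
    echo "attempt $attempt: $task: $feedback" >> "$ERRORS_LOG"
    attempt=$((attempt + 1))
  done
  echo "ESCALATE to user: $task"
  return 1
}

run_subtask "hydration tracking screen"
```

With the stub failing twice, the subtask passes on the third attempt and two entries land in the log file.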

⚡ Quickstart

Prerequisites: Claude Code installed and working.

# 1. Clone Forge
git clone https://github.com/lionshilov/Forge.git
cd Forge

# 2. Bootstrap a new project (copies agents + context into a fresh dir)
./scripts/forge-init.sh ~/code/my-app "My App"

# 3. Open it in Claude Code
cd ~/code/my-app
claude

Then in Claude Code, just describe what you want:

"Build me a small iOS app that reminds me to drink water based on how active I've been today."

Claude auto-loads the root CLAUDE.md and takes on the Orchestrator role. From here, you mostly answer product questions and approve QA verdicts — you're not writing boilerplate.

📖 Full walkthrough: examples/quickstart.md

Starter templates

Skip the scaffolding and start with a working stack:

./scripts/forge-init.sh ~/code/my-tool "My Tool" --template vanilla-static
./scripts/forge-init.sh ~/code/my-web  "My Web"  --template nextjs-supabase
./scripts/forge-init.sh ~/code/my-ios  "My iOS" --template swiftui-ios
./scripts/forge-init.sh ~/code/my-api  "My API" --template fastapi-postgres
| Template | What you get |
| --- | --- |
| vanilla-static | Single index.html, dark theme, GitHub Pages workflow, zero build step |
| nextjs-supabase | Next.js 15 App Router + Supabase (SSR), Tailwind, TypeScript strict |
| swiftui-ios | SwiftUI MVVM drop-in (@Observable), APIClient, Theme tokens |
| fastapi-postgres | Async FastAPI + SQLAlchemy 2.0 + Postgres (docker-compose), pytest, ruff |

🤖 The agents

| Agent | Role | When it runs |
| --- | --- | --- |
| 📋 Product | Idea → MVP spec (user stories, scope, metrics) | First. Always. |
| 🎨 Designer | Design system, flows, tokens, a11y floor | After Product, before any UI is built |
| 🏛️ Architect | Tech stack, directory structure, ADRs, interfaces | After Product, before any code |
| 🧪 Analyst | Event schema, KPIs, funnels, A/B tests | After Architect, before instrumentation lands |
| 🔒 Security | Threat model, auth, secrets, OWASP review | After Architect, re-reviews before ship |
| 📱 iOS / Swift | SwiftUI, UIKit, CoreML, HealthKit, async/await | Parallel with other specialists |
| 🌐 Frontend-Web | React, Next.js, Vue, Svelte, Tailwind, a11y | Parallel with other specialists |
| ⚙️ Backend | REST/GraphQL, Postgres/Redis, auth, Docker | Parallel with other specialists |
| 🧠 ML / CV | PyTorch, CoreML, real-time inference pipelines | When models are part of the product |
| 🛡️ QA | Reviews every output, logs failures, enforces conventions | After every specialist output |
| 🚀 DevOps | CI/CD, Fastlane, TestFlight, Docker deploy | Once there's working code + Security sign-off |
| 📖 Docs | README, API docs, changelog, diagrams | Last, after QA approves |

Every agent is one Markdown file under agents/<name>/CLAUDE.md. Read it, tweak it, PR it.

Dispatch: role-switch vs subagent

All 12 agents are registered as Claude Code subagents in .claude/agents/; the Orchestrator picks the mode per task:

  • role-switch (default for Product, Designer, Architect, Analyst, Security, Docs) — Orchestrator reads the agent's prompt into the current context. Cheap, visible, and keeps the live dialog: Product can ask you clarifying questions directly.
  • subagent (default for iOS, Frontend-Web, Backend, ML/CV, QA, DevOps) — Orchestrator calls Claude Code's Task tool with subagent_type: "<name>", registered in .claude/agents/<name>.md. Each subagent runs in an isolated context with scoped tools, which means:
    • Heavy code generation doesn't clog the main thread
    • Frontend + Backend can run in parallel
    • QA is read-only by design — it literally cannot patch the code under review
    • DevOps tool scope is constrained separately from Backend
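
"Read-only by design" comes down to which tools a subagent's frontmatter grants. A CI-style check for that could look like the sketch below. This is not Forge's actual CI job: `is_read_only` and the sample file are hypothetical. The tool names are Claude Code built-ins, and a missing `tools:` line is treated as a failure because a subagent without one inherits every tool.

```shell
#!/usr/bin/env bash
# Sketch of a CI-style check: fail if a subagent's frontmatter grants
# write-capable tools. The check and the sample file are hypothetical.

# Returns 0 only when the `tools:` line lists no write-capable tool.
is_read_only() {
  local file="$1" tools
  # Frontmatter sits between the first two '---' lines.
  tools="$(awk '/^---$/ { n++; next }
                n == 1 && /^tools:/ { sub(/^tools:[ \t]*/, ""); print }' "$file")"
  # No tools line means all tools are inherited: not read-only.
  [ -n "$tools" ] || return 1
  case ",$(echo "$tools" | tr -d ' ')," in
    *,Write,*|*,Edit,*|*,MultiEdit,*|*,NotebookEdit,*|*,Bash,*) return 1 ;;
  esac
  return 0
}

# Demo with a throwaway registration file.
demo="$(mktemp -d)"
printf '%s\n' '---' 'name: qa' \
  'description: Reviews output against CONVENTIONS.md' \
  'tools: Read, Grep, Glob' '---' 'Review prompt goes here.' > "$demo/qa.md"

is_read_only "$demo/qa.md" && echo "qa.md is read-only"
```

Counting `Bash` as write-capable is a deliberate choice here: a shell can modify files even when `Write` and `Edit` are withheld.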

The strategy agents can also run as subagents when no live dialog is needed (the brief is complete, or several strategy tasks run in parallel). Since an isolated subagent can't ask you anything, their registrations enforce one rule: return open questions instead of inventing answers — the Orchestrator relays them and re-dispatches.

Both modes read from the same project_context/ — handoffs are still consistent.

The loop as slash commands

Every phase of the loop is invocable directly — no need to re-explain the workflow to Claude:

| Command | What it does |
| --- | --- |
| `/forge-status` | Task board + recent failures + git reality check; recommends the next action |
| `/forge-task <desc>` | Decomposes a feature into atomic subtasks, records them in PROGRESS.md, dispatches |
| `/forge-qa [scope]` | Independent review via the read-only qa subagent; enforces the pass/fail loop |
| `/forge-ship` | Pre-ship gate: Security re-review → DevOps → Docs |
| `/forge-retro` | Distills session lessons into ERRORS_LOG.md / CONVENTIONS.md, so the system learns |

Guardrails & continuity, checked in

.claude/settings.json ships with the project (and into everything forge-init.sh bootstraps):

  • Secret hygiene — agents are denied Read access to .env, .env.local, *.pem, *.key, secrets/**. (.env.example stays readable on purpose.)
  • Session continuity — a SessionStart hook auto-loads the top of project_context/PROGRESS.md, so a fresh session resumes where the last one stopped instead of starting blind.
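
For a sense of what these two guardrails look like on disk, here is a minimal sketch that writes a comparable settings file into a throwaway directory. The `permissions.deny` and `hooks.SessionStart` keys are Claude Code's settings format; the exact glob patterns, the `head -n 40` cutoff, and the hook command are assumptions, so the file Forge actually ships may differ.

```shell
#!/usr/bin/env bash
# Sketch: a minimal .claude/settings.json with secret deny-rules and a
# SessionStart hook. Patterns and the 40-line cutoff are assumptions.

demo="$(mktemp -d)"      # throwaway project root for the demo
mkdir -p "$demo/.claude"

cat > "$demo/.claude/settings.json" <<'EOF'
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.local)",
      "Read(./**/*.pem)",
      "Read(./**/*.key)",
      "Read(./secrets/**)"
    ]
  },
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "head -n 40 project_context/PROGRESS.md"
          }
        ]
      }
    ]
  }
}
EOF

# Count the deny rules that were written.
grep -c 'Read(' "$demo/.claude/settings.json"
```

Listing `.env` and `.env.local` by name, rather than a broad `.env*` glob, is what leaves `.env.example` readable.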

🗂️ Project layout

forge/
├── CLAUDE.md                    ← Orchestrator prompt (auto-loaded by Claude Code)
├── AGENTS.md                    ← Same system for other AI tools (agents.md standard)
├── README.md
├── LICENSE                      ← MIT
├── CONTRIBUTING.md
├── scripts/
│   └── forge-init.sh            ← Bootstrap a new project
├── project_context/             ← Shared memory templates
│   ├── PRODUCT.md
│   ├── DESIGN.md
│   ├── ARCHITECTURE.md
│   ├── CONVENTIONS.md
│   ├── INTERFACES.md
│   ├── ANALYTICS.md
│   ├── SECURITY.md
│   ├── ERRORS_LOG.md
│   └── PROGRESS.md
├── .claude/
│   ├── settings.json            ← Shared guardrails: secret deny-rules + session hook
│   ├── skills/                  ← Slash commands: /forge-status, -task, -qa, -ship, -retro
│   └── agents/                  ← All 12 agents registered as Claude Code subagents
│       ├── product.md …         ← strategy agents (model: inherit) — role-switch
│       │                          by default, subagent when no dialog needed
│       ├── ios-swift.md …       ← implementation agents (model: sonnet) —
│       │                          isolated context via Task tool
│       └── qa.md                ← read-only by design (CI enforces it)
├── agents/                      ← One folder per specialist (full prompts)
│   ├── product/
│   ├── designer/
│   ├── architect/
│   ├── analyst/
│   ├── security/
│   ├── ios-swift/
│   ├── frontend-web/
│   ├── backend/
│   ├── ml-cv/
│   ├── devops/
│   ├── qa/
│   └── docs/
├── examples/
│   └── quickstart.md
└── templates/                   ← Stack-specific starters (see Starter templates)

🎬 What using Forge actually feels like

You're mostly making decisions, not typing code:

You:       "An app that reminds me to drink water based on activity."

Orchestrator → Product:    ✍️  Drafts PRODUCT.md, asks you 3 questions
You:                       Answer the questions
Product:                   PRODUCT.md finalized

Orchestrator → Architect:  ✍️  Chooses SwiftUI + HealthKit, writes ARCHITECTURE.md
Orchestrator → Architect:  ✍️  Defines INTERFACES.md (HealthKit reads, notification schedule)

Orchestrator → iOS:        🔨  Implements hydration tracking screen
Orchestrator → QA:         🔍  Flags: missing accessibility labels, line 42
Orchestrator → iOS:        🔨  Fixes issues
Orchestrator → QA:         ✅  Pass

Orchestrator → DevOps:     🚀  Adds Fastlane lane for TestFlight
Orchestrator → Docs:       📖  Writes README

You:                       ⌘B in Xcode — app builds, runs, works.

The compounding trick: every agent's output is input for the next, and ERRORS_LOG.md means the system learns across runs.


🎯 When to use Forge (and when not to)

✅ Great fit

  • Solo founders / indie devs going 0 → 1 on a new product
  • Hackathons — full stack in a weekend without burning out
  • Internal tools — a team of one needs to ship a dashboard Monday
  • Side projects where you'd otherwise procrastinate on scaffolding

⚠️ Probably not worth it

  • Tiny scripts (< 100 LOC) — just ask Claude directly
  • Large legacy codebases — the value is in clean-slate scaffolding
  • Hard-deadline production systems — you still need human code review for security/compliance

🛠️ Customizing

Tweak an agent's behavior

Edit agents/<name>/CLAUDE.md. The Anti-Patterns section has the highest leverage: every line there names a class of bug the agent is told to avoid from then on.

Add a new specialist

mkdir agents/android-kotlin
$EDITOR agents/android-kotlin/CLAUDE.md
# Follow the format in any existing agent

Then add a row to the table in root CLAUDE.md so the Orchestrator knows about it. See CONTRIBUTING.md.

Enforce your house style

Fill in project_context/CONVENTIONS.md before running Forge on a real project. QA will enforce every rule you put there.


❓ FAQ

Is this a framework? A CLI? What do I actually install?

Nothing. Forge is Markdown prompts and a folder convention. The runtime is Claude Code. forge-init.sh is a 30-line bash script that copies files.
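
To make "copies files" concrete, here is a guess at the shape of such a bootstrap. It is not the real forge-init.sh: `init_sketch` is a hypothetical function, the copied paths are taken from the project layout, and stamping the project name into PROGRESS.md is an invented detail.

```shell
#!/usr/bin/env bash
# NOT the real forge-init.sh: a guess at what a "copy files" bootstrap does.

# usage: init_sketch <forge-checkout> <target-dir> <project-name>
init_sketch() {
  local src="$1" dest="$2" name="$3"
  if [ -e "$dest" ] && [ -n "$(ls -A "$dest" 2>/dev/null)" ]; then
    echo "refusing to overwrite non-empty $dest" >&2
    return 1
  fi
  mkdir -p "$dest"
  # Paths taken from the project layout section.
  cp "$src/CLAUDE.md" "$src/AGENTS.md" "$dest/"
  cp -R "$src/agents" "$src/project_context" "$src/.claude" "$dest/"
  # Hypothetical: put the project name at the top of PROGRESS.md.
  printf '# %s\n\n' "$name" \
    | cat - "$src/project_context/PROGRESS.md" \
    > "$dest/project_context/PROGRESS.md"
  echo "bootstrapped $name in $dest"
}

# Demo against a fake checkout so nothing real is touched.
src="$(mktemp -d)"
dest="$(mktemp -d)/my-app"
mkdir -p "$src/agents/qa" "$src/project_context" "$src/.claude/agents"
touch "$src/CLAUDE.md" "$src/AGENTS.md" "$src/agents/qa/CLAUDE.md" \
      "$src/.claude/settings.json"
echo "No tasks yet." > "$src/project_context/PROGRESS.md"

init_sketch "$src" "$dest" "My App"
```

The sketch refuses to write into a non-empty target, which is a safety choice of this example and not a documented behavior of the real script.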

Does this work with Cursor / Windsurf / other AI IDEs?

Yes, in role-switch mode. Forge ships an AGENTS.md (the agents.md open standard read by Cursor, Codex, Zed and others) that tells any AI agent how to run the system: adopt the Orchestrator role from CLAUDE.md, dispatch by reading agents/<name>/CLAUDE.md into context, keep project_context/ honest. Claude Code additionally gets isolated subagents, slash commands, and hooks — those are runtime-specific.

How does the Orchestrator actually "delegate" if it's all one Claude session?

Two modes, picked per task:

  1. Role-switch (default for Product, Designer, Architect, Analyst, Security, Docs) — Orchestrator reads agents/<name>/CLAUDE.md into the current context and continues in that role. One session, visible in the main thread.
  2. Subagent (default for iOS, Frontend, Backend, ML, QA, DevOps) — Orchestrator calls Claude Code's Task tool with the matching subagent_type. Subagents are registered in .claude/agents/<name>.md with YAML frontmatter (name, tools, model). They run in isolated contexts with scoped tools and return a summary. QA is locked to read-only this way — it cannot silently modify the code it's reviewing.

Both modes read the same project_context/*.md files, so handoffs stay consistent regardless of which mode was used.
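
A subagent registration is a Markdown file with YAML frontmatter. The sketch below writes one for a hypothetical `backend` agent into a throwaway directory. The frontmatter fields (name, description, tools, model) are Claude Code's; the tool list and the prompt body are assumptions, and only `model: sonnet` for implementation agents comes from the project layout.

```shell
#!/usr/bin/env bash
# Sketch: register a hypothetical implementation subagent.
# Tool list and prompt body are assumptions, not Forge's actual file.

demo="$(mktemp -d)"      # throwaway project root for the demo
mkdir -p "$demo/.claude/agents"

cat > "$demo/.claude/agents/backend.md" <<'EOF'
---
name: backend
description: Implements backend subtasks. Reads project_context/ before writing code.
tools: Read, Write, Edit, Bash, Grep, Glob
model: sonnet
---
Read agents/backend/CLAUDE.md and project_context/*.md, then implement the
subtask you were given. Return a summary and a self-assessment.
EOF

# Print the registered name, i.e. the value passed as subagent_type.
sed -n 's/^name: //p' "$demo/.claude/agents/backend.md"
```

The `tools:` line is where scoping happens: a QA registration would list only read tools, while an implementation agent like this one also gets write access.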

Why Mermaid diagrams in the README?

GitHub renders them natively, they version-control cleanly, and they stay in sync with the prompts. No PNGs to rot.

Can I use this commercially?

Yes — MIT license. Use it, fork it, ship with it. Stars and PRs appreciated but not required.

What if QA keeps failing the same task?

After 3 retries the Orchestrator escalates to you. Usually it means PRODUCT.md or INTERFACES.md is too vague — tighten them and the agent unblocks.


🤝 Contributing

We'd love your help — especially:

  • 🆕 New agents (Android, Game dev, Rust/systems, SRE, Growth/ASO, Data Engineering)
  • 📦 Templates for common stacks (Next.js + Supabase, FastAPI + Postgres, etc.)
  • 🐛 Anti-pattern PRs — found a failure mode? Add it to an agent's blocklist.
  • 📖 Examples — real projects built with Forge

Read CONTRIBUTING.md first. Issues are open for discussion.


📜 License

MIT — use it, fork it, ship with it.


If Forge helps you ship something, star the repo ⭐ — it's the single biggest way to help the project grow.

Made for builders who'd rather decide than type.
