foundry

agent
Security Audit: Fail
Health: Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code: Fail
  • process.env — Environment variable access in .github/workflows/codex-review.yml
  • rm -rf — Recursive force deletion command in check-agents.sh
  • process.env — Environment variable access in ci-templates/codex-review.yml
Permissions: Pass
  • Permissions — No dangerous permissions requested


SUMMARY

Multi-agent code factory. GitHub Issues that write their own code. Claude Code + Codex + Gemini with 3 AI reviewers per PR.

README.md

Foundry

Multi-agent code factory. GitHub Issues that write their own code.
One label. Three AI reviewers. Zero human keystrokes.
Running on a 2019 MacBook Pro for $400/month.



The Pitch

Add a foundry label to a GitHub Issue. Go to sleep. Wake up to a pull request with three AI code reviews, all fixes applied, CI green.

GitHub Issue → PR in one command

No prompts. No terminals. No babysitting. The issue body IS the spec. The agent figures out the rest.


Real Numbers

Early production use across 6 private repos (numbers from internal testing, updated periodically):

Metric                  Value
Tasks spawned           47
Merged successfully     42 (89%)
Average time to merge   4.2 hours
Cost per task           $2-8
Avg review-fix cycles   3.7
Required human help     5 (11%)

Most failures trace back to vague specs, not agent limitations. Fix the spec, re-run, it works.

Hardware: 2019 MacBook Pro 16" (Intel i9, 64GB RAM). Two Claude Max subscriptions ($200/mo each), Codex, Gemini. ~$400/month total.


How It Works

Foundry Architecture

The Full Chain

GitHub Issue (with `foundry` label)
    ↓
Foundry Orchestrator reads the issue body as a spec
    ↓
Routes to best agent (Claude Code / Codex / Gemini)
    ↓
Creates git worktree + branch
    ↓
Spawns agent with the spec as context
    ↓
Agent writes code, opens PR (with `fixes #N` in body)
    ↓
3 AI reviewers review independently
    ↓
Agent reads ALL reviews, pushes fixes in one cycle
    ↓
CI passes + all reviewers approve = ready to merge
    ↓
PR merges → Issue auto-closes
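
The worktree and PR steps of this chain could be sketched roughly as below. The branch naming scheme (`foundry/issue-<N>`), the worktree location, and all three helper names are assumptions made for illustration, not Foundry's actual layout. The worktree helper only prints the git command rather than running it.

```shell
# Hypothetical helpers: naming scheme and paths are illustrative only.

# Derive a branch name from an issue number (assumed scheme).
branch_for_issue() {
  printf 'foundry/issue-%s\n' "$1"
}

# Print (do not run) the command that would create an isolated worktree
# on a fresh branch. $1 = repo directory, $2 = issue number.
worktree_cmd() {
  repo_dir=$1
  issue=$2
  branch=$(branch_for_issue "$issue")
  printf 'git -C %s worktree add -b %s %s\n' \
    "$repo_dir" "$branch" "$repo_dir/../worktrees/issue-$issue"
}

# Build a PR body. The "fixes #N" keyword is what lets GitHub close the
# issue automatically when the PR merges. $1 = issue number, $2 = summary.
pr_body() {
  printf 'fixes #%s\n\n%s\n' "$1" "$2"
}
```

Keeping each task in its own worktree means parallel agents never share a checkout, which is presumably why the chain creates one per issue.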

Spawning an Agent

foundry orchestrate in action

Agent Routing

Not every agent is good at everything:

  • Claude Code: Frontend, React, complex refactors, nuanced review feedback
  • Codex: Backend, APIs, infrastructure, bulk changes (the workhorse)
  • Gemini: Design systems, documentation, config files, creative structure

The router is a grep. When it picks wrong, you override with a hint in the issue.

# Keyword routing (yes, really)
"frontend|react|component|css|ui"  → Claude Code
"api|backend|database|migration"   → Codex
"design|theme|token|style"         → Gemini

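A grep-based router along these lines might look like the sketch below. The keyword patterns come from the list above. The `agent: <name>` hint syntax, the fallback to `DEFAULT_MODEL` (or `codex`), and the use of whole-word matching (`grep -w`, so that "ui" does not match "build") are assumptions, not confirmed Foundry behavior.

```shell
# Keyword router sketch. $1 = issue text (title and/or body).
route_agent() {
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Assumed hint syntax: "agent: claude|codex|gemini" overrides keywords.
  hint=$(printf '%s\n' "$text" \
    | grep -oE 'agent: *(claude|codex|gemini)' \
    | head -n 1 \
    | sed 's/^agent: *//')
  if [ -n "$hint" ]; then
    printf '%s\n' "$hint"
    return 0
  fi

  # First matching pattern wins; -w restricts matches to whole words.
  if printf '%s\n' "$text" | grep -qwE 'frontend|react|component|css|ui'; then
    echo claude
  elif printf '%s\n' "$text" | grep -qwE 'api|backend|database|migration'; then
    echo codex
  elif printf '%s\n' "$text" | grep -qwE 'design|theme|token|style'; then
    echo gemini
  else
    # Nothing matched: fall back to the configured default (assumed).
    echo "${DEFAULT_MODEL:-codex}"
  fi
}
```

For example, `route_agent "Add database migration for users"` selects codex, while appending `agent: claude` to the same text selects claude.
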
The Review Loop

Three AI Reviewers

Every PR gets reviewed by three independent AI reviewers:

  1. Claude (Opus 4.6) — Principal Engineer review. Architecture, security, correctness, maintainability. Blocking.
  2. Codex — architecture, API contracts, test coverage. Creates blocking status check.
  3. Gemini — design patterns, naming, documentation. Advisory.

Critical: all three must report before the agent starts fixing. This prevents wasted cycles where fixing one reviewer's feedback breaks another's.

Budget: 20 fix cycles per attempt. 5 attempts per task. If it can't converge, it notifies you.
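
The "wait for all three" rule and the budget arithmetic can be sketched as follows. The status words (`pending`, `approved`, `changes`) and both function names are hypothetical. Only the numbers 20 and 5 come from the text above.

```shell
# Succeeds once no reviewer is still pending, i.e. the fix cycle may start.
# Arguments: one verdict per reviewer (assumed vocabulary).
all_reviews_reported() {
  for verdict in "$@"; do
    if [ "$verdict" = "pending" ]; then
      return 1
    fi
  done
  return 0
}

# Upper bound on fix cycles for one task:
# cycles per attempt times attempts per task.
max_fix_cycles() {
  echo $(( $1 * $2 ))
}
```

With the stated budget, `max_fix_cycles 20 5` gives 100, so a task can go through at most 100 review-fix cycles before Foundry gives up and notifies you.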


The Dashboard

foundry status dashboard

foundry status shows all active tasks with BACKEND and CRKGS columns:

Letter   Gate
C        CI green
R        Claude approved
K        Codex approved
G        Gemini approved
S        Branch synced

A task showing CRKGS is ready to merge.
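
As an illustration of how such a gate column could be rendered, here is a minimal sketch. The `-` placeholder for an unmet gate and the 1/0 input flags are assumptions. The source only states what a fully green task looks like.

```shell
# Render the gate column: one letter per satisfied gate, "-" otherwise.
# Arguments (1 or 0 each): CI, Claude, Codex, Gemini, branch synced.
gate_string() {
  out=""
  for pair in "C:$1" "R:$2" "K:$3" "G:$4" "S:$5"; do
    letter=${pair%%:*}   # part before the colon
    ok=${pair#*:}        # part after the colon
    if [ "$ok" = "1" ]; then
      out="$out$letter"
    else
      out="$out-"
    fi
  done
  printf '%s\n' "$out"
}

# A task is ready to merge only when every gate is satisfied.
ready_to_merge() {
  [ "$(gate_string "$@")" = "CRKGS" ]
}
```

Under these assumptions, a task with CI, Codex, and sync green but both Claude and Gemini outstanding would render as `C-K-S`.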


The Respawn Engine

Agents crash. Rate limits. Token expiry. OOM kills. Network timeouts.

Foundry checks every 30 minutes:

  1. Agent alive? If not → respawn with same spec, branch, PR
  2. New reviews? → trigger fix cycle
  3. CI failed? → mark for investigation
  4. Budget exhausted? → archive, notify you

Most tasks complete on the first attempt. Persistent failures get escalated to you via Telegram.
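
One way to express the four checks above as a single decision function is sketched below. The flag inputs, the action names, and the choice to test the attempt budget before respawning are assumptions. The source lists the checks but not their exact precedence.

```shell
# Decide the next action for one task during a check cycle.
#   $1 agent alive (1/0)    $2 new reviews (1/0)   $3 CI failed (1/0)
#   $4 attempts used        $5 max attempts
next_action() {
  alive=$1
  new_reviews=$2
  ci_failed=$3
  attempts=$4
  max=$5

  if [ "$alive" = "0" ]; then
    # A dead agent is only respawned while attempts remain.
    if [ "$attempts" -ge "$max" ]; then
      echo archive-and-notify
    else
      echo respawn
    fi
  elif [ "$new_reviews" = "1" ]; then
    echo fix-cycle
  elif [ "$ci_failed" = "1" ]; then
    echo investigate
  else
    echo wait
  fi
}
```

For instance, `next_action 0 0 0 5 5` yields `archive-and-notify`, matching the rule that an exhausted budget ends in an archive plus a notification.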


Visual Evidence: Agents That Prove Their Work

For PRs with frontend changes, Foundry expects visual proof. Screenshots, videos, before/after comparisons. If the PR body has no images and the diff touches .tsx/.jsx/.vue/.css, Foundry flags it.
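
A check of this kind could be approximated as follows. The function name and the image heuristic (markdown image syntax or an HTML `<img` tag) are assumptions. The file extensions are the ones listed above.

```shell
# Succeeds (exit 0) when the PR should be flagged: frontend files changed
# but the PR body contains no image.
#   $1 = PR body, $2 = newline-separated list of changed files
needs_visual_evidence() {
  body=$1
  files=$2

  # No frontend file in the diff: nothing to flag.
  printf '%s\n' "$files" | grep -qE '\.(tsx|jsx|vue|css)$' || return 1

  # Assumed heuristic: a markdown image or an <img> tag counts as evidence.
  if printf '%s\n' "$body" | grep -qE '!\[[^]]*\]\(|<img '; then
    return 1
  fi
  return 0
}
```

A PR touching `src/Button.tsx` with a text-only body would be flagged, while the same PR with `![before](before.png)` in the body would pass.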

Telegram Topics: One Thread Per Agent

Every foundry spawn --topic creates a dedicated Telegram forum topic for that task. All status updates — CI results, review verdicts, respawns, merges — go to that thread. Your main chat gets a one-liner ("spawned TASK-123 — tracking in topic") and stays clean.

# Spawn with a new topic (auto-created)
foundry spawn my-org/my-repo specs/backlog/add-auth.md claude --topic

# Reuse an existing topic
foundry spawn my-org/my-repo specs/backlog/add-auth.md claude --topic-id 4821

Under the hood, tg_notify_task checks the SQLite registry for a tg_topic_id. If one exists, the message goes to that thread. If not, it falls back to your main chat. Zero config changes needed for existing tasks.
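
The fallback described here can be sketched without touching the registry. The helper name `tg_message_fields` is hypothetical and the SQLite lookup is omitted. `chat_id`, `message_thread_id`, and `text` are parameters of the Telegram Bot API's `sendMessage` method.

```shell
# Build the sendMessage fields for a task update.
#   $1 = chat id, $2 = topic id (empty when the task has none), $3 = text
tg_message_fields() {
  chat_id=$1
  topic_id=$2
  text=$3

  printf 'chat_id=%s\n' "$chat_id"
  # Address a forum topic only when the task has one; without this field
  # the message lands in the main chat.
  if [ -n "$topic_id" ]; then
    printf 'message_thread_id=%s\n' "$topic_id"
  fi
  printf 'text=%s\n' "$text"
}
```

Because the topic field is simply left out when no ID is stored, tasks created before topics existed keep working unchanged, which fits the "zero config changes" claim.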

Requirements:

  • Telegram supergroup with Topics enabled (group settings → Topics → On)
  • Your bot must be an admin in the group
  • Set TG_CHAT_ID to the supergroup ID and OPENCLAW_TG_BOT_TOKEN to your bot token

OpenClaw Integration

ACP Protocol Flow

What is ACP?

ACP (Agent Client Protocol) is like LSP (Language Server Protocol), but for AI coding agents. One standardized protocol that any agent can speak, any orchestrator can dispatch to.

OpenClaw as Orchestrator

OpenClaw is an AI gateway that speaks ACP natively. It turns Foundry from "scripts on a laptop" into "managed agent fleet you control from your phone":

# Spawn via your orchestrator agent (e.g. from Telegram, Slack, or CLI)
foundry spawn my-org/my-repo "Build the tracking integration per issue #6" claude --topic

What OpenClaw Adds

Push-based notifications: No more cron polling. When an agent finishes, OpenClaw pushes a notification to Telegram, Slack, Discord, or email. Instantly.

Remote control from your phone: Merge PRs, respawn agents, check status, all via Telegram message. You don't need to be at your laptop.

ACP Adapters: Each agent (Claude, Codex, Gemini) has an adapter that translates its native CLI into ACP. When a new agent drops, write one adapter. Instantly compatible.

Horizontal scaling: Run agents across multiple machines. Your Mac at home, a cloud instance, a colleague's server. OpenClaw distributes work based on capacity.

The notification chain:

Agent finishes work
    ↓ ACP result event
OpenClaw receives completion
    ↓ routes to your channel
📱 "PR #47 ready. CI green. 3 reviews pending."
    ↓ you tap "merge"
Done.

The Full Stack: Paperclip → OpenClaw → Foundry

Foundry is the execution layer in a three-tier autonomous development stack:

Paperclip (CEO/PM layer)
  Creates issues, assigns priorities, tracks progress
    ↓ wake event
OpenClaw (orchestrator layer)
  Receives wake, loads agent context, routes by label
    ↓ foundry spawn / foundry orchestrate
Foundry (execution layer)
  Spawns coding agents, manages worktrees, runs review loop
    ↓ PR with fixes
GitHub (delivery layer)
  CI, code review, merge, issue auto-close
    ↓ status update
Paperclip (closes the loop)
  Agent reports PR link back to the original issue

Paperclip acts as the product management layer. Its CEO agent creates prioritized issues with labels. OpenClaw wakes on those events and routes them through Foundry based on label:

  • engineering / foundry → foundry spawn (isolated worktree, own branch + PR)
  • ops → handled directly by the orchestrator agent (config, crons, research)
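
The label routing above amounts to a small dispatch table. In this sketch the handler strings are descriptive placeholders and the `unrouted` fallback for unknown labels is an assumption.

```shell
# Map an issue label to the component that handles it.
route_by_label() {
  case $1 in
    engineering|foundry) echo "foundry spawn" ;;   # isolated worktree + PR
    ops)                 echo "orchestrator" ;;    # handled directly
    *)                   echo "unrouted" ;;        # assumed fallback
  esac
}
```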

The full loop has been verified end-to-end: CEO creates issue → OpenClaw agent wakes → creates GitHub issue → spawns Foundry agent → agent codes + opens PR → reviews pass → agent reports PR link back to Paperclip. Zero human keystrokes.

Without OpenClaw

Foundry works perfectly standalone. Cron jobs + local agents + GitHub. OpenClaw is the upgrade path when you want remote control, push notifications, and multi-machine scaling. Add Paperclip on top when you want autonomous project management.



Quick Start

# Install (or update)
curl -fsSL https://raw.githubusercontent.com/merlinrabens/foundry/main/install.sh | bash

# Configure repos, agents, notifications
foundry setup

# Go
foundry status                     # Dashboard
foundry scan ~/projects/my-repo    # Find labeled issues
foundry orchestrate                # Full auto: scan → spawn → check

That's it. The installer handles cloning, PATH, prerequisites, and database setup. The setup wizard walks you through everything else.

Requirements

  • macOS or Linux
  • GitHub CLI (gh) — authenticated (gh auth login)
  • At least one AI agent (sign in via each CLI for OAuth, or set API key as fallback):
    • Claude Code: npm i -g @anthropic-ai/claude-code + claude /login
    • Codex: npm i -g @openai/codex + run codex to sign in
    • Gemini: npm i -g @google/gemini-cli + run gemini to sign in
  • SQLite3, jq (installer checks for these)

Setup Guide

foundry setup handles everything interactively:

  1. Repos — which repos Foundry should manage
  2. AGENTS.md — generates config file in repos that need one
  3. Agents — detects installed CLIs, checks OAuth sign-in status
  4. Telegram — optional notifications (bot token + chat ID)
  5. CI workflows — deploys review workflows to your repos
  6. Database — creates the SQLite registry

GitHub Repository Secrets

Each repo that uses Foundry's CI review workflows needs these secrets (Settings > Secrets > Actions):

Secret                    Required                Used By
CLAUDE_CODE_OAUTH_TOKEN   Yes                     Claude Code Review (CI)
OPENAI_API_KEY            If using Codex review   Codex Review (CI)
TELEGRAM_BOT_TOKEN        Optional                Notifications
TELEGRAM_CHAT_ID          Optional                Notifications

Uninstall

curl -fsSL https://raw.githubusercontent.com/merlinrabens/foundry/main/install.sh | bash -s -- --uninstall

Commands

foundry setup                         Interactive config wizard
foundry status                        Dashboard of all tasks
foundry scan <repo-path>              Find `foundry`-labeled issues
foundry spawn <repo> <spec> [agent] [--topic | --topic-id <id>]   Spawn agent (optionally with TG topic)
foundry check [task-id]               Monitor agents, trigger reviews
foundry respawn <task-id>             Retry a failed task
foundry orchestrate [repo]            Full auto: scan + spawn + check
foundry cleanup                       Archive completed tasks
foundry steer <task-id> <msg>         Redirect agent mid-flight via ACP
foundry ask <task-id> <question>      Ask agent a question, get reply
foundry diagnose <task-id>            Debug a stuck task
foundry peek <task-id>                Structured JSON status (registry + live)
foundry nudge <task-id>               Unstick a stalled agent
foundry design <repo> <spec> [agent]  Gemini-first design pipeline
foundry recommend <spec>              Suggest best agent for task
foundry update                        Self-update from upstream

Self-Hosted Runner (Event-Driven, The Real Killer)

By default, Foundry uses cron to poll for changes. But there's a faster path: a self-hosted GitHub Actions runner on your machine.

When any CI workflow finishes, any review is submitted, or a PR is closed, GitHub triggers a lightweight foundry-gate.yml workflow that runs foundry check <task-id> locally on your machine. Not in GitHub's cloud. On the same Mac where your agents live.

Result: Foundry reacts in seconds instead of waiting up to 30 minutes for the next cron run. Event-driven. No polling. $0/month.

# .github/workflows/foundry-gate.yml (included in ci-templates/), abridged
on:
  workflow_run:
    workflows: ["Tests & Lint", "Claude Code Review", "Codex Code Review", "Gemini Review Check"]
    types: [completed]
  pull_request_review:
    types: [submitted]
  pull_request:
    types: [closed]

jobs:
  gate:  # job id is illustrative
    runs-on: [self-hosted, foundry]  # ← your machine, not GitHub cloud
    steps:
      # How TASK_ID is derived from the event is omitted in this excerpt
      - run: foundry check "$TASK_ID"

Setup (5 minutes)

# Install self-hosted runner for your org (covers all repos)
bash scripts/setup-runner.sh your-org

# Or for a single repo
bash scripts/setup-runner.sh your-org/your-repo

# Deploy the foundry-gate.yml workflow to your repos
bash ci-templates/deploy-ci.sh your-org/your-repo

The cron jobs below are the fallback for edge cases. The self-hosted runner handles 95% of events in real-time.


Cron Setup (Fallback)

# Check loop: monitor agents, trigger reviews (every 30 min)
2,32 * * * *  cd ~/.foundry && bash foundry check

# Orchestrator: scan for new issues, spawn agents (every 3 hours)
5 */3 * * *   cd ~/.foundry && bash foundry orchestrate

# Cleanup: archive completed tasks (3 AM)
0 3 * * *     cd ~/.foundry && bash foundry cleanup
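
To make the polling latency concrete, this small sketch computes how long a change waits for the next `2,32 * * * *` check run. The function name is hypothetical. It assumes a change arriving at exactly minute 2 or 32 has just missed that run.

```shell
# Minutes until the next run of a "2,32 * * * *" schedule.
#   $1 = current minute of the hour (0-59, without a leading zero)
minutes_until_next_check() {
  now=$1
  if [ "$now" -lt 2 ]; then
    echo $(( 2 - now ))
  elif [ "$now" -lt 32 ]; then
    echo $(( 32 - now ))
  else
    # Next run is minute 2 of the following hour.
    echo $(( 62 - now ))
  fi
}
```

The worst case is a full 30 minutes, which is the gap the self-hosted runner is meant to close.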

Configuration

Foundry uses a two-file config system:

File               Tracked?          Purpose
config.env         Yes (in git)      Safe defaults, no secrets, no personal data
config.local.env   No (gitignored)   Your overrides: repos, Telegram IDs, personal paths

The dispatcher sources config.env first, then config.local.env on top. Any variable in config.local.env wins.

Important: Bash arrays like KNOWN_PROJECTS=() are replaced, not merged. Your config.local.env must contain the full list of projects, not just additions.

# config.env — safe defaults (committed)
DEFAULT_MODEL=codex        # Default backend (codex, claude, gemini)
ENABLED_BACKENDS=codex,claude,gemini  # Available backends
MAX_RETRIES=5              # Spawn attempts per task
MAX_REVIEW_FIXES=20        # Review-fix cycles per attempt
MAX_CONCURRENT=4           # Parallel agents
AGENT_TIMEOUT=1800         # 30 min per agent run
AUTO_MERGE_LOW_RISK=false  # Auto-merge docs/tests PRs
KNOWN_PROJECTS=()          # Empty — set in config.local.env

# config.local.env — your overrides (gitignored, never committed)
DEFAULT_MODEL=claude                  # Override default backend
ENABLED_BACKENDS=claude,gemini        # Disable codex (e.g., rate limited)
TG_CHAT_ID="-100xxxxxxxxxx"
KNOWN_PROJECTS=(
  "$HOME/projects/my-org/my-repo"
  "$HOME/projects/my-org/another-repo"
)
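
The layering rule, including the array caveat, can be checked with a small helper. This is a sketch, not Foundry's dispatcher: `effective_config` is a hypothetical name, and it invokes bash explicitly because the config files use bash arrays.

```shell
# Source a base config, then a local config on top, and print the
# resulting DEFAULT_MODEL and the number of KNOWN_PROJECTS entries.
#   $1 = path to base config, $2 = path to local config
effective_config() {
  bash -c '
    . "$1"    # base defaults first
    . "$2"    # local overrides second; later assignments win
    printf "%s %s\n" "$DEFAULT_MODEL" "${#KNOWN_PROJECTS[@]}"
  ' _ "$1" "$2"
}
```

If the base file lists two projects and the local file lists one, the result reports one project, not three: the second assignment replaces the whole array.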

foundry setup creates config.local.env for you during initial configuration.

Architecture

foundry (CLI entry point)
├── commands/           # User-facing commands
│   ├── spawn.bash      # Agent spawning + worktree setup
│   ├── check.bash      # Health monitoring + review triggers
│   ├── respawn.bash    # Failure recovery + context injection
│   ├── orchestrate.bash # Full automation loop
│   ├── lifecycle.bash  # attach, logs, kill, steer, open
│   └── status.bash     # Dashboard
├── core/               # Shared infrastructure
│   ├── registry_sqlite.bash  # SQLite state management
│   ├── gh.bash               # GitHub API (with retry)
│   └── logging.bash          # Structured logging + Telegram
├── lib/                # Business logic
│   ├── acp_orchestrator.py   # ACP protocol handler (all backends)
│   ├── runner_script.bash    # Agent runner generator
│   ├── session_bridge.bash   # OpenClaw native session bridge
│   ├── jerry_routing.bash    # Smart agent selection
│   ├── review_pipeline.bash  # 3-reviewer orchestration
│   ├── spawn_guards.bash     # Pre-spawn validation
│   ├── respawn_helpers.bash  # Failure context gathering
│   └── state_machine.bash    # Task lifecycle
├── ci-templates/       # GitHub Actions workflows
│   ├── claude-code-review.yml
│   ├── codex-review.yml
│   ├── gemini-check.yml
│   └── foundry-gate.yml      # Event-driven bridge
├── openclaw/           # OpenClaw skill (installed during setup)
│   └── SKILL.md
└── tests/              # 396 tests (bats)

License

MIT

Author

Merlin Rabens — AI Systems Architect

Built this because I was spending 6 hours a day being a human webhook between AI agents and GitHub. Now I add a label and go read bedtime stories.


The gap between "I use AI coding agents" and "AI coding agents work for me while I sleep" is surprisingly small.
It's a cron job, a label, and the willingness to close your laptop.
