headless-agentic-codebase
Health Uyari
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code Gecti
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Gecti
- Permissions — No dangerous permissions requested
This project provides a drop-in template and orchestration framework for running AI coding agents (like Claude, Gemini, or Codex) continuously and autonomously on a codebase. It handles automated branch management, testing, PR creation, and self-merging based on GitHub issues, all containerized within Docker.
Security Assessment
The tool operates by executing shell commands and interacting directly with GitHub repositories, meaning it inherently modifies code and makes external network requests. However, the architecture is specifically built around physical guardrails to mitigate the risks of unsupervised AI. It runs in isolated Docker containers and restricts the agent's command surface to a controlled Makefile, explicitly preventing destructive actions like force-pushes or unauthorized filesystem changes. No hardcoded secrets were detected in the codebase, and no dangerous permissions are requested. While granting an AI autonomous write access to your code always carries inherent risk, the robust built-in boundaries keep the tool's design risk Low.
Quality Assessment
The project is highly maintained, with its most recent push happening today. It is released under the standard, permissive MIT license. The primary drawback is its extremely low community visibility, having only 6 GitHub stars. Consequently, it lacks widespread peer review and production-tested validation from a broad user base. Despite this low community trust metric, the repository itself is remarkably well-documented, featuring clear governance rules, security scaffolding, and comprehensive setup guides.
Verdict
Use with caution — the framework is thoughtfully designed with strong safety guardrails, but its lack of widespread community adoption means you should audit the configuration files and governance rules thoroughly before trusting it with repository write access.
Run a coding agent (Claude/Gemini/Codex) 24/7 on your project. Container-isolated, self-merging, GitHub-driven workflow. Bootstrap any project from your phone.
Agent Project Pack
A drop-in template for running an autonomous coding agent (Claude Code, Gemini CLI, Codex, etc.) 24/7 on your project.
New here? Start with
GETTING_STARTED.md— a linear walkthrough from "I have an idea" to "agent is shipping code 24/7." Takes 45–90 minutes the first time.This README is the project overview and daily workflow reference. Follow
GETTING_STARTED.mdfirst.
You scope the project, file issues, walk away. The agent branches, plans, writes tests, implements, opens PRs, and self-merges on green CI. You review asynchronously through GitHub — even from your phone — and steer with @agent PR comments.
This repo is the infrastructure that makes that work safely: runtime-agnostic orchestration (Claude, Gemini, Codex), language-agnostic core, container isolation, governance docs, slash commands, GitHub workflow conventions, and a 24/7 launcher loop. Pick a language stack and feature addons (FastAPI, Next.js, mobile, desktop, CLI) only when you need them — the core is the same regardless.
Why this exists
Running an LLM agent against your codebase is easy. Running one unsupervised for hours without it making a mess is the hard part. The usual failure modes:
- Agent goes off on tangents, scope-creeps PRs into unreviewable diffs
- No clear handoff when you want to redirect it mid-task
- Loses context between sessions, re-litigates settled decisions
- No record of what got built or why
- Pushes broken code, force-pushes, deletes history, runs
rm -rf - You can't tell from your phone whether things are going well
This template fixes all of that with three layers of governance:
- Always-loaded instructions (
CLAUDE.md) the agent reads every session - Binding rules for autonomous mode (
docs/unattended-rules.md) covering the work loop, hard limits, and self-audit behaviour - Physical guardrails — Docker container isolation, Makefile-only command surface, CI gate, async GitHub-mediated communication
Plus the conventions that make it actually pleasant: ADRs for architectural decisions, a plain-English progress log, in-progress labels, a per-module documentation system enforced by CI (docs/codebase/), security governance scaffolding (SECURITY.md, CODEOWNERS, threat model), and a BOOTSTRAP_PROMPT.md you can paste into any AI chat to scaffold a new project from your phone.
What you get
.
├── CLAUDE.md # Always-loaded agent context (template)
├── GETTING_STARTED.md # ⭐ START HERE — linear walkthrough
├── BOOTSTRAP_PROMPT.md # Paste into AI chat to scaffold a new project
├── REMOTE_SETUP.md # Phone-only workflow guide
├── STACKS_AND_ADDONS.md # Catalogue of optional language stacks + feature addons
├── SECURITY.md # Vulnerability disclosure policy template
├── Makefile # The only command surface (customise per stack)
├── agent.config # Runtime + model selection
├── agents/ # Per-runtime adapters (claude, gemini, codex, custom)
├── docker/ # Minimal language-agnostic dev container
├── scripts/
│ ├── launch-agent.sh # The 24/7 launcher loop (runtime-agnostic)
│ └── docs-gate.sh # CI gate enforcing docs/codebase/ updates per module
├── stacks/ # OPTIONAL: language toolchains, pick one or more
│ ├── python/
│ ├── node/
│ ├── go/
│ └── rust/
├── addons/ # OPTIONAL: feature scaffolds, drop in as needed
│ ├── fastapi/ # Production backend
│ ├── nextjs/ # Web app
│ ├── mobile-rn/ # React Native cross-platform
│ ├── mobile-native/ # SwiftUI + Kotlin/Compose
│ ├── desktop-tauri/ # Native desktop wrapper
│ ├── cli-tool/ # CLI argument parsing + distribution
│ └── openapi-clients/ # Auto-gen mobile/web clients from API spec
├── .claude/commands/ # Slash commands the agent uses
│ ├── review-prs.md
│ ├── next-issue.md
│ ├── add-adr.md
│ ├── daily-log.md
│ ├── brief-refresh.md
│ ├── phase-status.md
│ └── new-module.md
├── docs/
│ ├── product.md # Product vision (template)
│ ├── architecture.md # Technical architecture (template)
│ ├── phases.md # Build phases (template)
│ ├── unattended-rules.md # The binding rulebook (runtime-agnostic)
│ ├── codebase.md # Per-module documentation index
│ ├── codebase/template.md # Seven-section module doc template
│ ├── decisions/ # ADRs go here
│ └── security/ # Threat model, incident response, supply chain
├── logs/
│ ├── progress.md # Plain-English changelog (agent maintains)
│ └── daily/ # Per-session detailed logs
├── plans/ # Per-deliverable plans (ephemeral)
└── .github/
├── CODEOWNERS # Security-sensitive paths
├── workflows/ci.yml # CI with docs-gate job
├── ISSUE_TEMPLATE/
└── pull_request_template.md
Quick start
Prerequisites
- A Linux/macOS host with Docker and Docker Compose
- GitHub CLI authenticated (
gh auth login) - One of the supported agent CLIs installed and authenticated:
- Claude Code —
npm i -g @anthropic-ai/claude-codethenclaude login(Pro/Max/API) - Gemini CLI —
npm i -g @google/gemini-clithengemini auth(free tier available) - Codex CLI —
npm i -g @openai/codexthencodex login(ChatGPT Plus/Pro/API) - Or wire your own via
agents/custom.sh
- Claude Code —
Setup
# 1. Use this template (or clone)
gh repo create my-project --template <user>/agent-project-pack --private --clone
cd my-project
# 2. Personalise the templates
# Replace {{PROJECT_NAME}} placeholders in CLAUDE.md
# Fill in docs/product.md, docs/architecture.md, docs/phases.md
# (or use BOOTSTRAP_PROMPT.md with any AI chat to do this from your phone)
# 3. Create the agent's labels
for l in "ready-for-agent:0e8a16" "agent-produced:1f77b4" "agent-please-fix:d93f0b" \
"agent-proposed:5319e7" "needs-decision:d93f0b" "in-progress:0075ca" \
"blocked:b60205" "human-only:000000" "human-takeover:000000" \
"tracking:fef2c0" "roadmap:fef2c0" \
"priority:high:b60205" "priority:med:fbca04" "priority:low:c2e0c6"; do
gh label create "${l%:*}" --color "${l##*:}" --force
done
# 4. Customise Makefile test/lint targets for your stack
# (Python: ruff/mypy/pytest; Node: eslint/tsc/jest; Go: vet/test; Rust: clippy/test)
# 5. Build the dev container
make build && make ci
# 6. Seed 5 ready-for-agent issues with clear acceptance criteria
# 7. Launch
make agent-start
The agent runs until you make agent-stop. Walk away.
Knowing what the agent is doing
Three layers of visibility, all readable from the GitHub mobile app:
1. logs/progress.md — plain-English changelog
The agent appends a human-readable entry here every time it ships something. No jargon, no file paths — written for a non-technical reader. Example:
## 2026-04-25 — Photos now get a quality score automatically
**What it does:** When a photo is imported, Frame scores how sharp and well-exposed it is.
**How:** Analyses pixel data mathematically — no AI model needed, runs instantly.
**Why:** Lets users sort by quality and surfaces the keepers automatically.
**Status:** Merged. PR #14.
Read it from the GitHub web/mobile UI — it's just a markdown file in the repo.
2. in-progress label — what's happening right now
When the agent picks up an issue, it adds the in-progress label and comments "Starting work. Branch: agent/14-quality-score." gh issue list --label in-progress (or filter in the mobile app) shows you exactly what's being worked on at any moment.
3. logs/daily/YYYY-MM-DD.md — detailed session log
Per-day technical log: every issue picked up, every PR opened, every CI failure. Useful for debugging when something goes wrong or auditing the agent's decisions later.
Keeping docs honest as code grows
The hardest problem with long-running agent work is docs drifting from code. Over weeks the agent ships features faster than its own context can stay accurate; eventually it starts making decisions based on stale assumptions.
This template solves it with a CI gate. Every PR that changes <source_root>/<module>/**/*.<ext> must also change docs/codebase/<module>.md. The gate runs on every PR; trivial PRs (typo, log string) bypass via the docs-exempt label.
Each per-module doc follows a strict seven-section template (What it does / Public API / Data tables / Pipeline steps / Routes / Configuration / Notes), so the agent always knows where to look up a module without re-reading its source. This is the single most impactful piece of this template — without it, agents lose coherence over time.
Customise scripts/docs-gate.sh and the DOCS_GATE_* env vars in .github/workflows/ci.yml for your source layout.
Daily workflow
# Day one — start the agent and walk away. It runs continuously.
git checkout main && git pull
make agent-start
# Anytime — check what's happening (works from phone via GitHub mobile app)
cat logs/progress.md # Plain-English: features shipped, in plain language
gh pr list --state merged # PRs the agent has shipped
gh issue list --label in-progress # What it's currently working on
make agent-logs # Live tail of today's session
# Steer the agent (works from phone)
# - File issue with `ready-for-agent` label = new task queued
# - Comment "@agent <fix>" + add `agent-please-fix` label = redirect on a PR
# - Add `human-takeover` label = "I'll take this one"
# When you actually want to stop (not at end of day — only when needed)
make agent-stop
The agent is designed to run continuously. Leave it going overnight, weekends, while you travel. It loops every 10 minutes; if the queue is empty it runs a self-audit, proposes new features, and waits. You only stop it when:
- You want to update its core docs (
CLAUDE.md,unattended-rules.md) - You're rebooting the host
- Something has gone wrong and you want to investigate
How it works
The work loop
Every 10 minutes the agent runs one cycle:
- Address any open PR with
@agentcomments oragent-please-fixlabel - Pick the highest-priority
ready-for-agentissue - Mark it
in-progress, branch, write a plan - Tests first, then implementation
- Run
make ciuntil green - Self-merge with squash + delete branch
- Update
logs/progress.md - Loop
Empty queue triggers a /brief-refresh self-audit — agent scans docs and code, files agent-proposed issues with feature ideas and refactoring opportunities, then waits 10 min before the next cycle.
Hard limits
The agent can self-merge, refactor, propose features, and challenge architectural decisions via new ADRs. It cannot:
- Push directly to
main - Force-push or rewrite history
- Read paths matching personal data patterns (configurable per project)
rm -rfoutsideplans/andlogs/- Commit binary user data
- Touch PRs labelled
human-takeover
CI is the gate that protects everything else — failing tests block self-merge.
Steering from anywhere
The agent reads GitHub fresh every cycle, so anything you do via the GitHub web UI or mobile app reaches it within 10 minutes:
| You want to | You do |
|---|---|
| Queue new work | File issue, label ready-for-agent + a priority |
| Fix something on a PR | Comment @agent <instruction>, label agent-please-fix |
| Resolve a blocked decision | Comment your answer, swap needs-decision → ready-for-agent |
| Take over a PR | Add human-takeover label |
| Pause everything | make agent-stop |
Phone-only project bootstrapping
The unique twist: you can scaffold a new project entirely from your phone.
- On phone, GitHub mobile app → "Use this template" → create new repo
- Open any AI chat (Claude, ChatGPT, etc.)
- Paste
BOOTSTRAP_PROMPT.mdcontent - Describe your project in plain English; AI asks focused questions
- AI produces 5 files (
CLAUDE.md,docs/product.md,docs/architecture.md,docs/phases.md,docs/decisions/0001-*.md) plus 5-8 starter issues - Commit each via GitHub mobile's edit view, paste the issues
- Later on a laptop:
make build && make agent-start
See REMOTE_SETUP.md for the full walkthrough.
Configuration
Running multiple projects in parallel on one machine
The Makefile and launcher derive a unique compose project name from the repo directory, so you can run multiple agents on the same host without collisions:
~/code/project-a $ make agent-start # container: project-a-agent-1
~/code/project-b $ make agent-start # container: project-b-agent-1
Each gets its own image (project-a-dev:latest, project-b-dev:latest), container, network, and volumes. They share your host's ~/.claude and ~/.config/gh for auth — that's intentional and fine.
If you also expose HTTP daemons (the optional daemon compose profile), set a different host port per project in each .env:
# project-a/.env
DAEMON_PORT=8765
# project-b/.env
DAEMON_PORT=8766
Override the project name explicitly with PROJECT_NAME=foo make agent-start if your directory names happen to clash.
Choosing the agent runtime
Edit agent.config (or set env vars):
AGENT_RUNTIME=claude # claude | gemini | codex | junie | custom
AGENT_MODEL=default # see agents/<runtime>.sh for valid values
AGENT_IDLE_SLEEP=600 # seconds between cycles when queue is empty
Each runtime has an adapter in agents/<name>.sh that handles model selection, CLI flags, and auth checks. The launcher loop is identical regardless of which runtime runs — same prompt, same governance docs, same workflow.
Switching runtimes is one config change: AGENT_RUNTIME=gemini make agent-start. The Docker image has all CLIs preinstalled (configurable via INSTALL_AGENTS build arg if you want a leaner image).
Model recommendations
| Runtime | Best model | Cheap+fast |
|---|---|---|
| Claude Code | claude-opus-4-7 (default) |
claude-sonnet-4-6 |
| Gemini CLI | gemini-2.5-pro |
gemini-2.5-flash |
| Codex CLI | gpt-5-codex |
gpt-5 |
Override per-launch:
AGENT_MODEL=sonnet make agent-start # Claude with Sonnet
AGENT_RUNTIME=gemini AGENT_MODEL=flash make agent-start
Adding a new runtime
- Copy
agents/custom.shtoagents/<your-runtime>.sh - Implement
run_agent_cycle,check_agent_installed,check_agent_authed - Add CLI install to
docker/Dockerfile.devif needed - Set
AGENT_RUNTIME=<your-runtime>inagent.config
The contract is small — see agents/claude.sh as the reference implementation.
Adapting to your stack
The core of this template is language-agnostic — the launcher, governance docs, slash commands, docs-gate, security suite, and 24/7 loop all work regardless of what language your project uses.
Language toolchains and feature scaffolds are optional add-ons in two directories:
stacks/— language toolchains (Python, Node, Go, Rust). Pick one or more.addons/— feature scaffolds (FastAPI backend, Next.js web, React Native mobile, Tauri desktop, OpenAPI clients, CLI tools). Drop in only what you need.
See STACKS_AND_ADDONS.md for the full catalogue and apply instructions.
Common combinations:
| Project type | Pick |
|---|---|
| Backend + admin UI | python + node + fastapi + nextjs |
| Mobile-first SaaS | python + node + fastapi + mobile-rn + openapi-clients |
| CLI tool | go (or rust) + cli-tool |
| Photo/video app (premium feel) | python + fastapi + mobile-native + desktop-tauri + openapi-clients |
| Static-typed web service | rust only |
You're not locked in — add an addon later when you need it. Remove one by deleting the directory and reverting its CLAUDE.md block.
The four files customised per project regardless of stack:
docker/Dockerfile.dev— language toolchainMakefile—test,lint,formatcommands.github/workflows/ci.yml— CI matching your Makefilepyproject.toml/package.json/ equivalent — project manifest
Everything else (slash commands, governance docs, launcher) stays as-is.
What this is not
- Not an AI orchestration framework. This is configuration + conventions, not code. There's no daemon, no scheduler beyond a bash
while/sleeploop. You can read every file in 30 minutes. - Not a Claude wrapper. It uses the official Claude Code CLI in non-interactive mode. Switching to a different agent runner is a one-file change.
- Not for production deployment of agents. This runs one agent against one repo on your machine. It's a developer tool, not a multi-tenant platform.
- Not magic. If your acceptance criteria are vague, the agent's PRs will be vague. Garbage in, garbage out applies.
Safety: private vs public project repos
The agent only acts on issues labelled ready-for-agent. In a private repo, only you can apply that label, so you're safe by default.
In a public repo, anyone can open issues. The label is still your gate, but you need a workflow that auto-strips ready-for-agent and agent-please-fix from any contribution that isn't yours, otherwise external contributors could queue work directly. See GETTING_STARTED.md for the workflow YAML and the full checklist (don't store production secrets in the agent container, monitor logs in week one, etc.).
The template repo itself (this one) is fine to keep public — there's no agent loop running on it; it's just files.
Cost controls
The agent runs on your subscription or API quota. Three built-in layers protect you from runaway costs:
- Daily cost cap —
AGENT_MAX_DAILY_USD=5inagent.config. Launcher pauses for an hour when reached. - Daily PR cap —
AGENT_MAX_PRS_PER_DAY=20for runaway loops. - Per-PR cost comments — every merged PR gets
Cost: $0.42 (approx)posted automatically.
Plus make agent-cost shows today's spend at a glance. See GETTING_STARTED.md for recommended values per subscription type and how to inspect spend.
The agent also can't auto-merge changes to its own controls (Makefile, launcher, agent.config, unattended-rules.md, workflows). It labels those PRs human-only-merge and leaves them for you to review.
Comparison
| This template | Devin / similar SaaS | Plain agent CLI | |
|---|---|---|---|
| Runs locally | Yes | No | Yes |
| Self-hosted, your data | Yes | No | Yes |
| 24/7 unattended | Yes | Yes | No (manual) |
| Container isolation | Yes | Yes | No by default |
| Phone-driven steering | Yes (via GitHub) | Limited | No |
| Self-audit + feature proposals | Yes | Varies | No |
| Customisable rules | Yes (just markdown) | No | Manually each session |
| Swap underlying model/agent | Yes — config change | No | No |
| Cost | Your subscription | $$$ subscription | Your subscription |
| Lock-in | None | High | Tied to one vendor |
Roadmap
This is a personal template I use myself, but I'll merge useful PRs. Things that would help:
- Pre-built
Dockerfile.devvariants for common stacks (Rust, Go, Elixir) - A second launcher mode that wakes only on GitHub webhook (cheaper than the 10-min loop)
- A
make doctorthat diagnoses common setup issues - Translations of
unattended-rules.mdif non-English-language agents become useful
Contributing
PRs welcome. Keep additions minimal and generic — anything project-specific belongs in your fork, not here.
Hard rule: don't add anything that requires a paid third-party service to use the template. The point is something you can run with just Docker, GitHub, and a Claude subscription.
License
MIT. See LICENSE.
Credits
Built while using Claude Code to build something else, then extracted as a template once it became clear the workflow itself was the more valuable artefact. The unattended rules and slash commands are the result of repeated iteration against a real project — not theoretical.
If you build something interesting with this, open an issue and tell me. I want to know what works and what doesn't.
Yorumlar (0)
Yorum birakmak icin giris yap.
Yorum birakSonuc bulunamadi