batty

agent
Security Audit
Warning
Health Warning
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Passed
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Passed
  • Permissions — No dangerous permissions requested
Purpose
This tool is a control plane for AI coding teams. It lets you define roles like architect, manager, and engineers, running them concurrently via a typed JSON protocol and isolated git worktrees while using tmux for session visibility.

Security Assessment
As an agent orchestration tool, its primary function is executing shell commands and managing subprocesses via tmux and git worktrees. The automated code scan (12 files reviewed) found no dangerous code patterns, no hardcoded secrets, and the tool does not request inherently dangerous system permissions. However, because it acts as a supervisor that launches and routes commands to other AI agents (like Claude Code or Codex), it inherently controls developer environments. Given the execution context, the overall risk is rated as Medium.

Quality Assessment
The project is highly transparent and well-structured. It is actively maintained (last push was today), includes a clear README with CI/CD badges, and is distributed under the permissive MIT license. The primary drawback is its extremely low community visibility; it currently only has 5 GitHub stars, meaning it has not yet undergone widespread peer review or battle-testing by the broader developer community.

Verdict
Use with caution — the code is clean and actively maintained, but low community adoption means you should review its agent configurations before letting it execute commands in your environment.
SUMMARY

Supervised agent execution for software teams. Kanban-driven, tmux-native, test-gated.

README.md

Batty

Hierarchical agent teams for software development.

Define a team of AI agents in YAML. Batty runs them through structured SDK protocols or a PTY-based shim, routes work and messages between roles, manages engineer worktrees, and keeps the team moving while tmux remains the display and persistence layer.


Batty is a control plane for AI coding teams. Instead of one agent doing everything badly, you define roles like architect, manager, and engineers; Batty launches each agent through typed SDK protocols (the default) or a PTY-owning shim fallback, isolates engineer work in git worktrees, routes messages, tracks the board, and uses tmux for visibility and session persistence.

Quick Start

cargo install batty-cli
cd my-project && batty init
batty start --attach
batty send architect "Build a REST API with JWT auth"

That gets you from zero to a live team session. For the full walkthrough, templates, and configuration details, see the Getting Started guide.

Quick Demo

Watch the full demo on YouTube

You
  |
  | batty send architect "Build a chess engine"
  v
Architect (Claude Code)
  | plans the approach
  v
Manager (Claude Code)
  | creates tasks, assigns work
  v
Engineers (Codex / Claude / Aider)
  eng-1-1   eng-1-2   eng-1-3   eng-1-4
   |          |          |          |
   +---- isolated git worktrees ----+

Batty keeps each role visible in its own tmux pane. In SDK mode (the default since v0.7.x), each agent communicates over a typed JSON protocol -- Claude Code via stream-json NDJSON, Codex via JSONL, Kiro via ACP JSON-RPC 2.0 -- giving structured message delivery, completion detection, and auto-approval of tool use. When SDK mode is off, the PTY-owning shim with screen classification provides a universal fallback. The daemon auto-dispatches board tasks, runs standups, and merges engineer branches back when they pass tests.
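
To make the SDK-mode idea concrete, here is a minimal Python sketch of reading a newline-delimited JSON stream and detecting completion. It is not Batty's implementation: the `type` field, the `"result"` completion marker, and the function names are assumptions chosen for illustration, and each backend's real message schema may differ.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines between records are skipped
        yield json.loads(line)


def collect_until_done(
    lines: Iterable[str], done_type: str = "result"
) -> tuple[list[dict], bool]:
    """Collect messages until one carries type == done_type.

    Both the "type" key and the "result" marker are hypothetical
    stand-ins for whatever a real backend emits.
    """
    seen: list[dict] = []
    for msg in iter_ndjson(lines):
        seen.append(msg)
        if msg.get("type") == done_type:
            return seen, True  # completion detected, stop reading
    return seen, False  # stream ended without a completion message


# Hypothetical stream: two records, a blank line, and a trailing record
# that is never read because completion was already detected.
stream = [
    '{"type": "assistant", "text": "planning"}',
    "",
    '{"type": "result", "ok": true}',
    '{"type": "assistant", "text": "never read"}',
]
messages, completed = collect_until_done(stream)
print(len(messages), completed)
```

The point of a typed line protocol is visible here: completion is an explicit message rather than something inferred from terminal output, which is what the PTY fallback has to do with screen classification.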

For unattended teams, leave auto_respawn_on_crash: true enabled. Turning it
off is mainly useful when you want to debug crashes manually or supervise pane
restarts yourself.

Install

1. Install kanban-md

kanban-md is a separate Go tool. Grab the latest binary from
GitHub releases:

# macOS (Apple Silicon)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_darwin_arm64.tar.gz | tar xz
mv kanban-md /usr/local/bin/

# macOS (Intel)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_darwin_amd64.tar.gz | tar xz
mv kanban-md /usr/local/bin/

# Linux (x86_64)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_linux_amd64.tar.gz | tar xz
mv kanban-md ~/.local/bin/

Or with Go: go install github.com/antopolskiy/kanban-md@latest

2. Install Batty

From crates.io:

cargo install batty-cli

From source:

git clone https://github.com/battysh/batty.git
cd batty
cargo install --path .

How It Works

team.yaml
   |
   v
batty start
   |
   +--> SDK mode (default): typed JSON protocol per agent backend
   |      Claude Code  -> stream-json NDJSON
   |      Codex CLI    -> JSONL (exec --json)
   |      Kiro CLI     -> ACP JSON-RPC 2.0
   +--> PTY fallback: shim process with screen classifier (use_sdk_mode: false)
   +--> tmux panes for operator visibility
   +--> engineer worktrees created when enabled
   +--> daemon loop watches agent state, inboxes, board, retries, standups
   |
   v
batty send / assign / board / status / merge

Batty does not embed a model. It orchestrates external agent CLIs, keeps state in files, uses SDK protocols (or PTY shims) as the execution boundary, and uses tmux plus git worktrees as the operator-facing runtime surface.

On restart, Batty resumes saved agent sessions when the launch identity still
matches and the saved session is still available. If the saved session is stale
or missing, Batty falls back to a cold respawn and rebuilds task context
automatically. Healthy live panes are left alone; startup preflight only
respawns panes that are already dead.
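
The restart rules above can be read as a small decision function. The following Python sketch encodes them under stated assumptions: `PaneState`, its fields, and the `Action` names are hypothetical and only mirror the prose, not Batty's internal types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    LEAVE_ALONE = "leave_alone"    # healthy live pane, do nothing
    RESUME = "resume"              # reattach to the saved agent session
    COLD_RESPAWN = "cold_respawn"  # start fresh and rebuild task context


@dataclass
class PaneState:
    alive: bool                    # is the pane still running?
    saved_identity: Optional[str]  # launch identity recorded with the session
    current_identity: str          # launch identity computed at restart
    session_available: bool        # is the saved session still present?


def restart_action(pane: PaneState) -> Action:
    """Pick a restart action following the rules described above."""
    if pane.alive:
        return Action.LEAVE_ALONE
    identity_matches = (
        pane.saved_identity is not None
        and pane.saved_identity == pane.current_identity
    )
    if identity_matches and pane.session_available:
        return Action.RESUME
    return Action.COLD_RESPAWN


dead_but_resumable = PaneState(False, "claude:eng-1-1", "claude:eng-1-1", True)
print(restart_action(dead_but_resumable).value)
```

The identity strings in the example are made up; what a "launch identity" contains is not specified in this README.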

Built-in Templates

batty init --template <name> scaffolds a ready-to-run team:

Template   Agents   Description
solo       1        Single engineer, no hierarchy
pair       2        Architect + 1 engineer
simple     6        Human + architect + manager + 3 engineers
squad      7        Architect + manager + 5 engineers
large      19       Human + architect + 3 managers + 15 engineers
research   10       PI + 3 sub-leads + 6 researchers
software   11       Human + tech lead + 2 eng managers + 8 developers
batty      6        Batty's own self-development team

Highlights

  • Hierarchical agent teams instead of one overloaded coding agent
  • SDK mode (default): structured JSON protocols for Claude Code (stream-json NDJSON), Codex CLI (JSONL), and Kiro CLI (ACP JSON-RPC 2.0) with completion detection and auto-approval
  • PTY shim fallback with screen classification and state detection when SDK mode is off
  • tmux-backed visibility with persistent panes and session resume
  • Agent-agnostic role assignment: Claude Code, Codex, Aider, Kiro, or similar — set the default with batty init --agent <backend>
  • Maildir inbox routing with explicit talks_to communication rules
  • Stable per-engineer worktrees with fresh task branches on each assignment
  • Kanban-driven task loop with auto-dispatch, retry tracking, and test gating
  • Scheduled tasks: scheduled_for delays dispatch until a future time, cron_schedule enables recurring tasks that auto-recycle from done back to todo (guide)
  • Intervention system: seven automated recovery mechanisms (triage, review, owned-task, dispatch-gap, utilization, board replenishment, idle nudge) with cooldowns, dedup, and escalation
  • Per-intervention runtime toggles via batty nudge to disable or re-enable specific daemon behaviors without restarting
  • Orchestrator automation for triage, review, owned-task recovery, dispatch-gap recovery, utilization recovery, standups, nudges, and retrospectives
  • Auto-merge policy engine with confidence scoring and configurable thresholds for safe unattended merges
  • Review timeout escalation: stale reviews are nudged and auto-escalated after configurable thresholds, with per-priority overrides
  • Failure pattern detection: rolling window analysis detects recurring failures and notifies when thresholds are exceeded
  • SQLite telemetry database: batty telemetry queries agent performance, task lifecycle, review pipeline metrics, and event history
  • Consolidated metrics dashboard: batty metrics shows tasks, cycle time, rates, and agent performance in one view
  • Run retrospectives: batty retro generates Markdown reports analyzing task throughput, review stall durations, rework rates, and failure patterns
  • Team template export/import: batty export-template saves your team config, batty init --from restores it
  • Bundled Grafana dashboard template with 21 panels and 6 alerts for monitoring agent sessions, pipeline health, and task lifecycle
  • Daemon restart recovery: dead agent panes are automatically respawned with task context and backoff
  • Crash auto-respawn defaults to on for unattended teams; disable it only for debugging or manual supervision
  • External senders: allow non-team sources (email routers, Slack bridges) to message any role
  • Graceful non-git-repo handling: git-dependent operations degrade cleanly when the project is not a repository
  • Session summary on batty stop: prints task counts, cycle times, and agent uptime before exiting
  • Daemon auto-archive: completed tasks are automatically archived when the board exceeds a threshold
  • Board health dashboard: batty board health shows per-status counts, stale tasks, and dependency issues
  • Team load and cost estimation: batty load shows team utilization, batty cost estimates session spending
  • Inbox purge with age filtering: batty inbox purge --older-than 7d cleans up delivered messages
  • batty validate --show-checks: individual pass/fail status for each config validation rule
  • batty doctor --fix: detect and clean up orphan worktrees and branches left by previous runs
  • batty board archive: move completed tasks to an archive directory to keep the active board fast
  • Error resilience: sentinel tests guard production error paths in daemon and task loop modules
  • Modular codebase: large modules (daemon, config, delivery, watcher, doctor, merge) are decomposed into focused submodules
  • Worktree reconciliation: auto-detect cherry-picked branches and reset stale worktrees so engineers always start clean
  • Pending delivery queue: messages sent to agents that are still starting are buffered and delivered automatically when the agent becomes ready
  • YAML config, Markdown boards, JSON/JSONL + SQLite logs: everything stays file-based
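
Two of the highlights above, talks_to communication rules and the pending delivery queue, combine naturally. The Python sketch below illustrates one possible reading: it assumes talks_to is directional (sender to recipient), and the `Router` class, its method names, and the return strings are all hypothetical rather than Batty's actual API.

```python
from collections import defaultdict


class Router:
    """Toy message router: talks_to check plus buffering for unready roles."""

    def __init__(self, talks_to):
        self.talks_to = talks_to            # sender -> list of allowed recipients
        self.ready = set()                  # roles whose agent has finished starting
        self.pending = defaultdict(list)    # buffered messages per recipient
        self.delivered = defaultdict(list)  # delivered messages per recipient

    def send(self, sender, recipient, body):
        # Reject anything the communication rules do not allow.
        if recipient not in self.talks_to.get(sender, []):
            return "rejected"
        # Buffer while the recipient is still starting.
        if recipient not in self.ready:
            self.pending[recipient].append((sender, body))
            return "queued"
        self.delivered[recipient].append((sender, body))
        return "delivered"

    def mark_ready(self, role):
        """Mark a role ready and flush its buffered messages in order."""
        self.ready.add(role)
        flushed = self.pending.pop(role, [])
        self.delivered[role].extend(flushed)
        return len(flushed)


router = Router({"human": ["architect"], "architect": ["manager"]})
print(router.send("human", "architect", "Build a REST API"))
print(router.send("human", "manager", "skip the architect"))
print(router.mark_ready("architect"))
```

Real delivery in Batty goes through Maildir inboxes on disk; the in-memory lists here only stand in for that so the ordering and rejection logic stay visible.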

CLI Quick Reference

Command                                                 Purpose
batty init [--template NAME] [--agent BACKEND]          Scaffold .batty/team_config/
batty start [--attach]                                  Launch the daemon and tmux session
batty stop / batty attach                               Stop or reattach to the team session
batty send <role> <message>                             Send a message to a role
batty assign <engineer> <task>                          Queue work for an engineer and report delivery result
batty inbox <member> / read / ack                       Inspect and manage inbox messages
batty board / board list / board summary                Open the kanban board or inspect it without a TTY
batty board health                                      Show board health dashboard (status counts, stale tasks, dep issues)
batty board archive [--older-than DATE]                 Move done tasks to archive directory
batty status [--json]                                   Show current team state
batty merge <engineer>                                  Merge an engineer worktree branch
batty review <id> <disposition> [feedback]              Record a review disposition (approve, request-changes, reject)
batty task review <id> --disposition <d>                Record a review disposition (workflow-level variant)
batty task schedule <id> [--at T] [--cron E] [--clear]  Set or clear scheduled dispatch time and cron recurrence
batty nudge disable/enable/status                       Toggle specific daemon interventions at runtime
batty telemetry summary/agents/tasks/reviews/events     Query SQLite telemetry for agent, task, and review metrics
batty retro                                             Generate a run retrospective analyzing throughput and failure patterns
batty load                                              Estimate team load and show recent load history
batty cost                                              Estimate current run cost from agent session files
batty metrics                                           Show consolidated telemetry dashboard (tasks, cycle time, rates, agents)
batty doctor [--fix]                                    Dump diagnostic state; --fix cleans up orphan worktrees/branches
batty pause / resume / queue                            Control automation and inspect queued dispatch work
batty inbox purge [--older-than DUR]                    Purge delivered inbox messages, optionally by age
batty validate [--show-checks]                          Validate config; --show-checks shows per-rule pass/fail
batty config / export-run                               Show resolved config and export runtime state
batty telegram                                          Configure Telegram for human communication
batty completions <shell>                               Generate shell completions

Requirements

  • Rust toolchain, stable >= 1.85
  • tmux >= 3.1 (recommended >= 3.2)
  • kanban-md CLI: see Install for setup
  • At least one coding agent CLI such as Claude Code, Codex, or Aider

Engineer Worktrees

When use_worktrees: true is enabled for engineers, Batty keeps one stable
worktree directory per engineer under .batty/worktrees/<engineer>.

Each new batty assign does not create a new worktree. Instead it:

  • reuses that engineer's existing worktree path
  • resets the engineer slot onto current main
  • creates a fresh task branch such as eng-1-2/task-123 or
    eng-1-2/task-say-hello-1633ae2d
  • launches the engineer in that branch

After merge, Batty resets the engineer back to the base branch
eng-main/<engineer> so the next assignment starts clean.
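
As a rough illustration of the assignment flow above, this Python sketch derives the worktree path and task branch name from the conventions in this section and builds (but does not run) a git command. The helper names are hypothetical, and the choice of `git switch -C` is an assumption about one way to get a fresh branch on current main, not a statement of what Batty executes.

```python
from pathlib import PurePosixPath


def worktree_path(engineer: str) -> str:
    """Stable per-engineer worktree directory, reused across assignments."""
    return str(PurePosixPath(".batty/worktrees") / engineer)


def task_branch(engineer: str, task_id: int) -> str:
    """Fresh task branch name, following the eng-1-2/task-123 pattern."""
    return f"{engineer}/task-{task_id}"


def plan_assignment(engineer: str, task_id: int, base: str = "main") -> list:
    """Return the git commands for one assignment without executing them.

    `git switch -C <branch> <base>` creates the branch at <base>, or resets
    it there if it already exists, inside the existing worktree.
    """
    path = worktree_path(engineer)
    branch = task_branch(engineer, task_id)
    return [["git", "-C", path, "switch", "-C", branch, base]]


for cmd in plan_assignment("eng-1-2", 123):
    print(" ".join(cmd))
```

Keeping the worktree path fixed and varying only the branch is what lets each assignment start clean without paying for a new checkout directory every time.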

Telegram Integration

Batty can expose a human endpoint over Telegram through a user role. This is
useful when you want the team to keep running in tmux while you send direction
or receive updates from your phone.

The fastest path is:

batty init --template simple
batty telegram
batty stop && batty start

batty telegram guides you through:

  • creating or reusing a bot token from @BotFather
  • discovering your numeric Telegram user ID
  • sending a verification message
  • updating .batty/team_config/team.yaml with the Telegram channel config

After setup, the user role in team.yaml will look like this:

- name: human
  role_type: user
  channel: telegram
  talks_to: [architect]
  channel_config:
    provider: telegram
    target: "123456789"
    bot_token: "<telegram-bot-token>"
    allowed_user_ids: [123456789]

Notes:

  • You must DM the bot first in Telegram before it can send you messages.
  • bot_token can also come from BATTY_TELEGRAM_BOT_TOKEN instead of being
    stored in team.yaml.
  • The built-in simple, large, software, and batty templates already
    include a Telegram-ready user role.
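
The notes above imply two small checks: where the bot token comes from, and whether a Telegram user may talk to the team. The Python sketch below shows one way to express them. The function names are hypothetical, and the precedence (environment variable first, then team.yaml) is an assumption; the README only says the token can come from either place.

```python
import os
from typing import Mapping, Optional


def resolve_bot_token(
    channel_config: Mapping, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the bot token, preferring BATTY_TELEGRAM_BOT_TOKEN (assumed order)."""
    if env is None:
        env = os.environ
    return env.get("BATTY_TELEGRAM_BOT_TOKEN") or channel_config.get("bot_token")


def is_allowed(channel_config: Mapping, user_id: int) -> bool:
    """True when user_id appears in the role's allowed_user_ids list."""
    return user_id in channel_config.get("allowed_user_ids", [])


# channel_config as it might look after parsing team.yaml, with no stored token.
config = {
    "provider": "telegram",
    "target": "123456789",
    "allowed_user_ids": [123456789],
}
print(resolve_bot_token(config, {"BATTY_TELEGRAM_BOT_TOKEN": "token-from-env"}))
print(is_allowed(config, 123456789))
```

Reading the token from the environment keeps it out of team.yaml, which matters if the team config is committed to the repository.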

Built with Batty

[Screenshot: Batty team session in tmux]

This session shows Batty coordinating a live team in ~/mafia_solver: the architect sets direction, black-lead and red-lead turn that into lane-specific work, and the black-eng-* / red-eng-* panes are individual engineer agents running in separate worktrees inside one shared tmux layout.

  • chess_test: a chess engine built by a Batty team (architect + manager + engineers)

Grafana Monitoring

Batty includes a bundled Grafana dashboard template with 21 panels across 6 rows and 6 pre-configured alerts. The dashboard covers session overview, pipeline health, agent performance, delivery and communication, task lifecycle, and recent activity.

The dashboard JSON is available in the source tree at src/team/grafana/dashboard.json. Copy it and import into your Grafana instance to monitor live team runs.

Pre-configured alerts:

Alert                    Detects
Agent Stall              Agent silent past threshold
Delivery Failure Spike   Message delivery failures climbing
Pipeline Starvation      Not enough work in the pipeline
High Failure Rate        Tasks failing above threshold
Context Exhaustion       Agent context window nearly full
Session Idle             Entire team idle too long

Docs and Links

License

MIT
