SWE-Squad
Health: Passed
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 11 GitHub stars
Code: Failed
- rm -rf — Recursive force deletion command in .claude/settings.json
- process.env — Environment variable access in .pi/extensions/swe-cost.ts
Permissions: Passed
- Permissions — No dangerous permissions requested
This project provides an autonomous, always-on AI engineering manager that reads GitHub issues, triages them, writes and reviews code fixes, and merges pull requests. It acts as a persistent daemon connecting a TypeScript control plane with a Python agent library.
Security Assessment
Overall Risk: Medium. Because the tool is designed to autonomously read issues, write code, and merge changes, it naturally handles sensitive data (requiring extensive environment variable access for APIs and databases). The rule-based scan flagged a recursive force deletion command (`rm -rf`) inside the Claude settings. While potentially dangerous if misconfigured, this is likely used for automated workspace cleanup. The tool does not request inherently dangerous operating system permissions, and no hardcoded secrets were detected. However, its core function of executing autonomous code changes requires extensive network access and high-level repository privileges.
Quality Assessment
The project is active, with its last code push occurring today. It operates under the standard, permissive MIT license. Community trust is currently very low; the repository has only 11 GitHub stars, indicating minimal public scrutiny or widespread adoption. Additionally, while the README claims an impressive test suite of over 6,800 tests, the project remains too new and untested by the broader community to be considered highly reliable.
Verdict
Use with caution — the autonomous execution capabilities, recent creation, and low community adoption require strict sandboxing and careful human oversight before deploying in production environments.
Autonomous Software Engineering Agents — self-healing, self-diagnosing development team powered by Claude Code and A2A protocol
SWE Squad
Autonomous Software Engineering Agents That Fix Bugs While You Sleep
An always-on AI engineering manager backed by a persistent LLM session with 16 custom tools.
Scans GitHub issues, investigates root causes, delegates fixes, reviews PRs, and enforces safety gates — autonomously.
Built on pi-agent SDK • Claude Code • Supabase • A2A Protocol
Overview
SWE Squad is an always-on AI engineering manager that runs as a persistent daemon. It:
- Imports GitHub issues as structured tickets into a Supabase store
- Triages by severity — the LLM decides priority, not hardcoded rules
- Investigates root causes by delegating to any configured coding engine
- Develops fixes on feature branches with automated test verification
- Reviews PRs with structured feedback (security, correctness, style)
- Merges approved changes and monitors for regressions
- Notifies via Telegram on critical events, PR creation, and failures
The system is built on two codebases:
| Layer | Language | Purpose |
|---|---|---|
| Control Plane | TypeScript | Persistent pi-agent daemon with 16 custom tools — the decision-making brain |
| Agent Library | Python | Specialized agents (monitor, triage, investigate, develop), ticket store, embeddings |
Key Capabilities
- 16 Custom Tools — ticket CRUD, GitHub import, investigation/development/review delegation, PR management, workspace provisioning, safety gates, health monitoring, notifications
- Engine-Agnostic Delegation — swap coding engines (Claude CLI, Gemini CLI, Copilot, OpenCode) via config
- Provider-Agnostic Architecture — every external service is a swappable plugin behind an interface
- Persistent Sessions — JSONL-backed session state survives daemon restarts
- Safety Gates — circuit breaker, stability gate, outcome tracker, budget enforcement
- Semantic Memory — pgvector embeddings surface similar past fixes at investigation time
- Multi-Team Support — multiple squads share Supabase without overlap
- React WebUI — management dashboard with Kanban boards, pipeline editor, team controls
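The persistent-session capability above relies on JSONL, one JSON object per line, so state can be appended cheaply and replayed after a restart. The following is a minimal sketch of that idea in Python. The file layout and the `append_event` / `load_events` names are assumptions for illustration, not the pi-agent SDK's actual storage format.

```python
import json
from pathlib import Path


def append_event(path: Path, event: dict) -> None:
    """Append one session event as a single JSON line."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def load_events(path: Path) -> list:
    """Replay all events; skip any line that fails to parse (e.g. a torn write)."""
    if not path.exists():
        return []
    events = []
    for line in path.read_text(encoding="utf-8").splitlines():
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # ignore the damaged line rather than fail the restart
    return events
```

Appending whole lines means a crash can damage at most the final record, which is why the loader tolerates an unparseable line instead of aborting.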
Architecture
The V2 architecture centers on a single persistent LLM session (via @mariozechner/pi-coding-agent) that decides what to do based on its persona and tool results. No hardcoded phases.
flowchart TD
subgraph daemon [" SWE-Manager Daemon (TypeScript) "]
Session["pi-agent Session\nPersistent LLM + 16 tools"]
HB["Heartbeat Loop\n5-min interval"]
HB -->|"prompt"| Session
end
subgraph tools [" Custom Tools "]
direction LR
TL["ticket_list\nticket_create\nticket_update"]
GH["github_issues\ngithub_import"]
DEL["delegate_investigation\ndelegate_development\ndelegate_review"]
PR["run_tests\napprove_pr\nmerge_pr"]
OPS["check_stability\ncheck_health\ncheck_metrics"]
WS["manage_workspace\nsend_notification"]
end
subgraph engines [" Coding Engines (config-resolved) "]
Claude["Claude Code CLI"]
Gemini["Gemini CLI"]
Copilot["GitHub Copilot"]
end
subgraph infra [" Infrastructure "]
Supa[("Supabase\nTickets + pgvector")]
GitHub["GitHub API\nIssues + PRs"]
Telegram["Telegram\nNotifications"]
end
Session --> tools
DEL -->|"spawn"| engines
TL & GH --> Supa
GH --> GitHub
WS --> Telegram
classDef daemonNode fill:#6366f1,stroke:#4338ca,color:#fff,stroke-width:2px,rx:12
classDef toolNode fill:#3b82f6,stroke:#2563eb,color:#fff,stroke-width:1.5px
classDef engineNode fill:#ef4444,stroke:#dc2626,color:#fff,stroke-width:1.5px
classDef infraNode fill:#10b981,stroke:#059669,color:#fff,stroke-width:2px
classDef subgraphBox fill:transparent,stroke:#e5e7eb,stroke-width:1px,color:#6b7280
class Session,HB daemonNode
class TL,GH,DEL,PR,OPS,WS toolNode
class Claude,Gemini,Copilot engineNode
class Supa,GitHub,Telegram infraNode
class daemon,tools,engines,infra subgraphBox
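To make the heartbeat-driven design concrete, here is a minimal sketch of a loop that prompts a session on a fixed interval. It is written in Python purely for illustration; the actual daemon is the TypeScript control plane. The `run_heartbeats` function, its parameters, and the prompt text are hypothetical.

```python
import time
from typing import Callable, Optional

HEARTBEAT_SECONDS = 5 * 60  # the 5-minute interval shown in the diagram


def run_heartbeats(
    prompt_session: Callable[[str], object],
    max_beats: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Prompt the session once per interval; return the number of beats run."""
    beats = 0
    while max_beats is None or beats < max_beats:
        try:
            # Hypothetical prompt text; the session decides what to do with it.
            prompt_session("heartbeat: review the pipeline and advance one ticket")
        except Exception as exc:  # one failed beat should not kill the loop
            print(f"heartbeat {beats} failed: {exc}")
        beats += 1
        if max_beats is None or beats < max_beats:
            sleep(HEARTBEAT_SECONDS)
    return beats
```

Passing `sleep` and `max_beats` as parameters keeps the loop testable without waiting five minutes per iteration.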
Ticket Pipeline
The daemon flushes right-to-left, completing nearest-done work first:
open → investigating → investigation_complete → in_development → in_review → testing → resolved
Each heartbeat, the LLM picks the highest-priority ticket closest to completion and advances it one step.
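In SWE Squad the LLM makes this choice, but the selection rule can be approximated deterministically. The sketch below assumes that pipeline position takes precedence over severity, and that severity labels are `critical`, `high`, `medium`, and `low`. Both are assumptions, as are the function names.

```python
from typing import Optional

PIPELINE = [
    "open", "investigating", "investigation_complete",
    "in_development", "in_review", "testing", "resolved",
]
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}  # assumed labels


def next_ticket(tickets: list) -> Optional[dict]:
    """Pick the unresolved ticket furthest along the pipeline, then by severity."""
    active = [t for t in tickets if t["status"] != "resolved"]
    if not active:
        return None
    return min(
        active,
        key=lambda t: (-PIPELINE.index(t["status"]), SEVERITY_RANK[t["severity"]]),
    )


def advance(ticket: dict) -> dict:
    """Return a copy of the ticket moved exactly one stage forward."""
    i = PIPELINE.index(ticket["status"])
    return {**ticket, "status": PIPELINE[min(i + 1, len(PIPELINE) - 1)]}
```

For example, a `low` ticket in `testing` is chosen ahead of a `critical` ticket that is still `open`, which matches the right-to-left flush described above.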
How the Fix Loop Works
flowchart TD
Start(["New Ticket"]):::startNode --> Cache{"Trajectory\ncache hit?"}:::decisionNode
Cache -->|"hit — free"| Replay["Replay cached fix\nzero cost"]:::cacheNode
Replay --> Tests0{"Tests\npass?"}:::testNode
Tests0 -->|"pass"| Keep0(["KEEP — commit"]):::successNode
Cache -->|"miss"| A1
subgraph attempts [" Escalating Fix Attempts "]
A1["Attempt 1 — Sonnet\nRoutine fix"]:::sonnetNode
A1 --> Tests1{"Tests\npass?"}:::testNode
Tests1 -->|"pass"| Keep1(["KEEP"]):::successNode
Tests1 -->|"fail"| A2["Attempt 2 — Sonnet\n+ error context"]:::sonnetNode
A2 --> Tests2{"Tests\npass?"}:::testNode
Tests2 -->|"pass"| Keep2(["KEEP"]):::successNode
Tests2 -->|"fail"| A3["Attempt 3 — Opus\nOrchestrates sub-agents"]:::opusNode
A3 --> Tests3{"Tests\npass?"}:::testNode
Tests3 -->|"pass"| Keep3(["KEEP"]):::successNode
Tests3 -->|"fail"| HITL
end
HITL(["HITL Escalation\nTelegram notification"]):::failNode
Tests0 -->|"fail"| A1
classDef startNode fill:#6366f1,stroke:#4338ca,color:#fff,stroke-width:2px
classDef decisionNode fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:2px
classDef cacheNode fill:#8b5cf6,stroke:#7c3aed,color:#fff,stroke-width:1.5px
classDef testNode fill:#64748b,stroke:#475569,color:#fff,stroke-width:1.5px
classDef sonnetNode fill:#3b82f6,stroke:#2563eb,color:#fff,stroke-width:1.5px
classDef opusNode fill:#ef4444,stroke:#dc2626,color:#fff,stroke-width:2px
classDef successNode fill:#10b981,stroke:#059669,color:#fff,stroke-width:2px
classDef failNode fill:#ef4444,stroke:#dc2626,color:#fff,stroke-width:2px
classDef subgraphBox fill:transparent,stroke:#e5e7eb,stroke-width:1px,color:#6b7280
class attempts subgraphBox
Each attempt runs on its own git branch. If the tests pass, the change is committed and a PR is opened. If they fail, the branch is reverted with git reset --hard, so code that fails the test suite does not reach main.
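The keep/discard loop in the diagram can be sketched as follows. This is an illustrative outline, not the code in `developer.py`: the `fix_loop` function and its three callables are hypothetical stand-ins for applying a fix, running the test suite, and reverting the branch.

```python
from typing import Callable, Optional, Tuple

# (model, label) per attempt, following the diagram:
# Sonnet, Sonnet with error context, then Opus.
ATTEMPTS = [
    ("sonnet", "routine fix"),
    ("sonnet", "with error context"),
    ("opus", "orchestrated"),
]


def fix_loop(
    apply_fix: Callable[[str, Optional[str]], None],
    run_tests: Callable[[], Tuple[bool, str]],
    revert: Callable[[], None],
) -> dict:
    """Try each attempt in order; keep the first that passes, revert the rest."""
    last_error = None
    for number, (model, _label) in enumerate(ATTEMPTS, start=1):
        apply_fix(model, last_error)  # edits happen on a feature branch
        passed, output = run_tests()
        if passed:
            return {"outcome": "keep", "attempt": number, "model": model}
        revert()                      # e.g. git reset --hard on that branch
        last_error = output           # feed the failure into the next attempt
    # All attempts failed: hand off to a human (HITL escalation).
    return {"outcome": "hitl", "attempt": len(ATTEMPTS), "model": None}
```

The first attempt receives no error context, and each later attempt receives the test output of the one before it.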
Quick Start
Prerequisites
- Node.js 20+ and pnpm (for the TypeScript control plane)
- Python 3.10+ (for the agent library and tests)
- Claude Code CLI (coding engine)
- GitHub CLI (gh), authenticated
1. Install
git clone https://github.com/ArtemisAI/SWE-Squad.git
cd SWE-Squad
# TypeScript control plane
cd control-plane && pnpm install && cd ..
# Python agent library
pip install python-dotenv pyyaml
2. Configure
cp .env.example .env
# Edit .env with your credentials (see Configuration section)
3. Run the Daemon
# Single heartbeat (test your setup)
npx tsx control-plane/src/main.ts --verbose
# Daemon mode (continuous 5-minute heartbeats)
npx tsx control-plane/src/main.ts --daemon --verbose
# Fresh session (discards prior session state)
npx tsx control-plane/src/main.ts --daemon --fresh --verbose
# Dry run (validates config and tool registration, no LLM calls)
npx tsx control-plane/src/main.ts --dry-run
4. Run Tests
# Python tests (5900+ tests)
python3 -m pytest tests/ -v --tb=short
# TypeScript tests (900+ tests)
cd control-plane && pnpm test
# TypeScript type checking
cd control-plane && pnpm typecheck
The 16 Custom Tools
The daemon's LLM session has access to these tools, registered via defineTool() from pi-agent:
| Tool | Purpose |
|---|---|
| `ticket_list` | Query tickets by status, severity, repo, or pipeline view |
| `ticket_create` | Create a new ticket with fingerprint-based deduplication |
| `ticket_update` | Update ticket status, notes, assignee; enforces resolution audit |
| `github_issues` | List open GitHub issues from configured repositories |
| `github_import` | Import GitHub issues as tickets with dedup (fingerprint: `gh-issue-{repo}-{number}`) |
| `delegate_investigation` | Claim ticket, resolve engine from config, spawn investigation, store report |
| `delegate_development` | Claim ticket, provision workspace, spawn development, create PR |
| `delegate_review` | Spawn code review on a PR with structured feedback |
| `run_tests` | Execute test suite in a workspace and report results |
| `approve_pr` | Approve a pull request via GitHub API |
| `merge_pr` | Merge an approved PR (squash merge) |
| `manage_workspace` | Create/cleanup/list git worktrees for isolated development |
| `check_stability` | Evaluate safety gates: circuit breaker + open criticals + test failures |
| `check_health` | Aggregate health snapshot: Supabase, engines, circuit breaker, uptime |
| `check_metrics` | Pipeline metrics: throughput, cycle time, failure rates |
| `send_notification` | Send alerts via configured provider (Telegram, Slack, webhook) |
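The fingerprint-based deduplication used by the ticket and import tools can be illustrated with a short sketch. Only the `gh-issue-{repo}-{number}` shape comes from the table above. The in-memory `store` dict, the function names, and the form of the `repo` value are assumptions.

```python
def issue_fingerprint(repo: str, number: int) -> str:
    """Build a fingerprint in the documented gh-issue-{repo}-{number} shape."""
    return f"gh-issue-{repo}-{number}"


def import_issues(issues: list, store: dict) -> list:
    """Insert unseen issues into the store; return the fingerprints created."""
    created = []
    for issue in issues:
        fp = issue_fingerprint(issue["repo"], issue["number"])
        if fp in store:
            continue  # already imported: skip instead of creating a duplicate
        store[fp] = {"fingerprint": fp, "title": issue["title"], "status": "open"}
        created.append(fp)
    return created
```

Because the fingerprint is derived only from the repository and issue number, re-running an import is idempotent: a second pass over the same issues creates nothing.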
Configuration
Environment Variables
Copy .env.example to .env and configure:
| Variable | Required | Description |
|---|---|---|
| `SWE_TEAM_ENABLED` | Yes | Kill switch (true/false) |
| `SWE_TEAM_ID` | Yes | Unique team identifier for ticket scoping |
| `SWE_GITHUB_ACCOUNT` | Yes | Dedicated GitHub bot account |
| `GH_TOKEN` | Yes | GitHub PAT with repo scope |
| `SUPABASE_URL` | Yes | Supabase PostgREST URL |
| `SUPABASE_ANON_KEY` | Yes | Supabase authentication key |
| `TELEGRAM_BOT_TOKEN` | No | Telegram bot token for notifications |
| `TELEGRAM_CHAT_ID` | No | Telegram chat ID for alerts |
| `BASE_LLM_API_URL` | No | OpenAI-compatible proxy for embeddings |
| `ANTHROPIC_BASE_URL` | No | Proxy URL for Claude CLI (engine delegation) |
| `SWE_DAEMON_MODEL` | No | Override daemon LLM model (default: claude-sonnet) |
| `SWE_MODEL_T2` | No | Override delegation model tier (default: sonnet) |
See .env.example for the full list.
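Before starting the daemon it can help to confirm that the required variables are present. The check below is a small sketch, not part of the project: the variable names come from the table above, while the helper names and the rule that only the literal `true` enables the team are assumptions.

```python
import os
from collections.abc import Mapping
from typing import Optional

REQUIRED = [
    "SWE_TEAM_ENABLED", "SWE_TEAM_ID", "SWE_GITHUB_ACCOUNT",
    "GH_TOKEN", "SUPABASE_URL", "SUPABASE_ANON_KEY",
]


def missing_required(env: Optional[Mapping] = None) -> list:
    """Return required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def team_enabled(env: Optional[Mapping] = None) -> bool:
    """Interpret the kill switch; assumes anything other than 'true' means off."""
    env = os.environ if env is None else env
    return env.get("SWE_TEAM_ENABLED", "").strip().lower() == "true"
```

Calling `missing_required()` with no argument inspects the real process environment; passing a dict makes the check easy to test.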
YAML Config (config/swe_team.yaml)
The YAML config controls:
- `delegation`: per-role engine binding (investigator, developer, reviewer)
- `workspace`: worktree provisioning settings
- `daemon`: heartbeat interval, initial prompt, session lifecycle
- `cycle`: max concurrent investigations/developments, severity filters
- `memory`: embedding model, similarity thresholds, TTL
- `notification`: provider selection (telegram/slack/webhook)
- `governance`: stability gate thresholds
- `githubRepos`: list of repos to scan for issues
Engine Delegation
The daemon never implements fixes itself. It delegates to coding engines that are resolved from config:
# config/swe_team.yaml
delegation:
investigator:
engine: claude-cli
model: sonnet
readOnly: true
timeout: 1800
developer:
engine: claude-cli
model: sonnet
timeout: 3600
reviewer:
engine: claude-cli
model: haiku
readOnly: true
timeout: 900
Supported engines: Claude Code CLI, Gemini CLI, OpenCode, GitHub Copilot. Adding a new engine = new file in providers/engine/ + config entry.
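Resolving a role to its engine binding amounts to a config lookup. The sketch below mirrors the YAML above as the dict a YAML loader would produce. The `Binding` dataclass, `resolve_binding`, the `readOnly` default of false, and the reading of `timeout` as seconds are all assumptions for illustration.

```python
from dataclasses import dataclass

CONFIG = {
    "delegation": {
        "investigator": {"engine": "claude-cli", "model": "sonnet", "readOnly": True, "timeout": 1800},
        "developer": {"engine": "claude-cli", "model": "sonnet", "timeout": 3600},
        "reviewer": {"engine": "claude-cli", "model": "haiku", "readOnly": True, "timeout": 900},
    }
}


@dataclass(frozen=True)
class Binding:
    engine: str
    model: str
    read_only: bool
    timeout: int  # assumed to be seconds


def resolve_binding(config: dict, role: str) -> Binding:
    """Look up the engine binding for a role; readOnly defaults to False here."""
    try:
        raw = config["delegation"][role]
    except KeyError:
        raise ValueError(f"no delegation binding for role {role!r}") from None
    return Binding(
        engine=raw["engine"],
        model=raw["model"],
        read_only=bool(raw.get("readOnly", False)),
        timeout=int(raw["timeout"]),
    )
```

Failing loudly on an unknown role is a deliberate choice in this sketch: a silent fallback could send a task to an engine nobody configured.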
Model Routing
| Scenario | Model | Cost |
|---|---|---|
| Daemon management cycle | Sonnet | $$ |
| Investigation (default) | Sonnet | $$ |
| Development + PR creation | Sonnet | $$ |
| PR review | Haiku | $ |
| Embeddings, fact extraction | bge-m3 / gemini-3-flash | $ |
| CRITICAL bugs | Opus | $$$ |
| Deterministic replay (cached) | None | Free |
flowchart LR
Ticket(["Incoming Ticket"]):::startNode --> Cached{"Cached\nfix?"}:::decisionNode
Cached -->|"hit — free"| Replay(["Replay\nzero cost"]):::cacheNode
Cached -->|"miss"| Severity{"Severity?"}:::decisionNode
subgraph tiers [" Model Tiers "]
direction TB
T1["T1 Haiku\nEmbeddings, triage\n$"]:::t1Node
T2["T2 Sonnet\nInvestigation + fix\n$$"]:::t2Node
T3["T3 Opus\nOrchestrator only\n$$$"]:::t3Node
end
Severity -->|"LOW / MEDIUM"| T1
Severity -->|"HIGH"| T2
Severity -->|"CRITICAL"| T3
T2 -->|"2 failures"| T3
subgraph fallback [" Fallback Chain "]
direction LR
Claude["Claude Code\nprimary"]:::claudeNode
Gemini["Gemini CLI\nfallback"]:::geminiNode
OpenCode["OpenCode\nlast resort"]:::opencodeNode
Claude -->|"rate limited"| Gemini -->|"unavailable"| OpenCode
end
T2 -.->|"dispatch"| Claude
T3 -.->|"dispatch"| Claude
classDef startNode fill:#6366f1,stroke:#4338ca,color:#fff,stroke-width:2px
classDef decisionNode fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:2px
classDef cacheNode fill:#10b981,stroke:#059669,color:#fff,stroke-width:2px
classDef t1Node fill:#94a3b8,stroke:#64748b,color:#fff,stroke-width:1.5px
classDef t2Node fill:#3b82f6,stroke:#2563eb,color:#fff,stroke-width:1.5px
classDef t3Node fill:#ef4444,stroke:#dc2626,color:#fff,stroke-width:2px
classDef claudeNode fill:#8b5cf6,stroke:#7c3aed,color:#fff,stroke-width:1.5px
classDef geminiNode fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:1.5px
classDef opencodeNode fill:#14b8a6,stroke:#0d9488,color:#fff,stroke-width:1.5px
classDef subgraphBox fill:transparent,stroke:#e5e7eb,stroke-width:1px,color:#6b7280
class tiers,fallback subgraphBox
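The routing in the flowchart reduces to two small decisions: which tier handles a ticket, and which engine is available to run it. The sketch below follows the flowchart edges (LOW and MEDIUM to T1, HIGH to T2, CRITICAL to T3, and T2 to T3 after two failures). The engine identifiers and function names are assumptions.

```python
from typing import Optional

TIER_BY_SEVERITY = {"LOW": "T1", "MEDIUM": "T1", "HIGH": "T2", "CRITICAL": "T3"}
FALLBACK_CHAIN = ["claude-cli", "gemini-cli", "opencode"]  # identifiers are assumed


def pick_tier(severity: str, failures: int = 0, cached: bool = False) -> Optional[str]:
    """Return the model tier for a ticket, or None when a cached fix is replayed."""
    if cached:
        return None  # deterministic replay: no model call
    tier = TIER_BY_SEVERITY[severity.upper()]
    if tier == "T2" and failures >= 2:
        return "T3"  # the "2 failures" escalation edge
    return tier


def pick_engine(available: set) -> str:
    """Walk the fallback chain and return the first usable engine."""
    for engine in FALLBACK_CHAIN:
        if engine in available:
            return engine
    raise RuntimeError("no coding engine available")
```

For instance, a HIGH ticket that has already failed twice is routed to T3, and if Claude Code is rate limited the chain falls through to Gemini CLI.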
Semantic Memory
When a ticket is resolved, SWE Squad extracts structured facts and stores embeddings in pgvector. On future investigations, the top-5 most similar memories are injected as context.
flowchart TD
subgraph store [" Storage — on ticket resolved "]
Resolved(["Ticket Resolved"]):::successNode
Extract["extract_memory_facts\nroot cause, fix, module, tags"]:::extractNode
Embed["embed_ticket\nbge-m3 — 1024 dim"]:::embedNode
Dedup{"Cosine\n> 0.92?"}:::decisionNode
StoreDB[("Supabase\npgvector")]:::dbNode
Resolved --> Extract --> Embed --> Dedup
Dedup -->|"new"| StoreDB
Dedup -->|"duplicate"| StoreDB
end
subgraph retrieve [" Retrieval — on investigation "]
NewTicket(["New Ticket"]):::startNode
Search["find_similar\nTop-5, cosine >= 0.75\n180-day TTL"]:::searchNode
Inject["Inject as\nSemantic Memory context"]:::injectNode
NewTicket --> Search -->|"query"| StoreDB
StoreDB -->|"matches"| Inject
end
classDef successNode fill:#10b981,stroke:#059669,color:#fff,stroke-width:2px
classDef startNode fill:#6366f1,stroke:#4338ca,color:#fff,stroke-width:2px
classDef extractNode fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:1.5px
classDef embedNode fill:#8b5cf6,stroke:#7c3aed,color:#fff,stroke-width:1.5px
classDef decisionNode fill:#f59e0b,stroke:#d97706,color:#fff,stroke-width:2px
classDef dbNode fill:#3ecf8e,stroke:#2da66e,color:#fff,stroke-width:2px
classDef searchNode fill:#3b82f6,stroke:#2563eb,color:#fff,stroke-width:1.5px
classDef injectNode fill:#8b5cf6,stroke:#7c3aed,color:#fff,stroke-width:1.5px
classDef subgraphBox fill:transparent,stroke:#e5e7eb,stroke-width:1px,color:#6b7280
class store,retrieve subgraphBox
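The retrieval and deduplication thresholds in the diagram can be shown with plain cosine similarity. In SWE Squad this work is done by pgvector inside Supabase; the pure-Python version below is only a sketch of the arithmetic. It omits the 180-day TTL, and the function names and memory record shape are assumptions.

```python
import math

DEDUP_THRESHOLD = 0.92  # above this, a new memory counts as a duplicate
MATCH_THRESHOLD = 0.75  # minimum similarity for retrieval
TOP_K = 5


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_similar(query: list, memories: list) -> list:
    """Return up to TOP_K memories with cosine >= MATCH_THRESHOLD, best first."""
    scored = [(cosine(query, m["embedding"]), m) for m in memories]
    hits = [(s, m) for s, m in scored if s >= MATCH_THRESHOLD]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in hits[:TOP_K]]


def is_duplicate(candidate: list, memories: list) -> bool:
    """True when any stored memory exceeds the dedup threshold."""
    return any(cosine(candidate, m["embedding"]) > DEDUP_THRESHOLD for m in memories)
```

The two thresholds serve different purposes: 0.75 is loose enough to surface related fixes, while 0.92 is strict enough that only near-identical memories are treated as the same record.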
Plugin Architecture
Every external service is a swappable plugin behind an interface:
| Component | Interface | Default | Alternatives |
|---|---|---|---|
| Coding agent | `CodingEngine` | Claude Code CLI | Gemini CLI, OpenCode, Copilot |
| Notifications | `NotificationProvider` | Telegram | Slack, webhook, email |
| Issue tracker | `IssueTracker` | GitHub Issues | Jira, Linear, GitLab |
| Embeddings | `EmbeddingProvider` | bge-m3 | OpenAI, sentence-transformers |
| Vector store | `VectorStore` | Supabase pgvector | Qdrant, Weaviate, Chroma |
| Task queue | `TaskQueueProvider` | In-memory (heapq) | Redis, RabbitMQ, SQS |
| Workspace | `WorkspaceProvider` | git-worktree | Docker volume, cloud VM |
| Sandbox | `SandboxProvider` | Local subprocess | Docker, Codespaces |
New provider = new file in providers/<domain>/ + config entry. Nothing else changes.
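One common way to get this "new file plus config entry" behavior is an interface with a name-keyed registry. The sketch below shows that pattern in Python. Only the `NotificationProvider` name comes from the table above: the `send` method shape, the `register` decorator, the stub classes, and the `provider` config key are assumptions, not the project's actual API.

```python
from typing import Callable, Protocol


class NotificationProvider(Protocol):
    """Interface name taken from the table; the method shape is assumed."""

    def send(self, message: str) -> bool: ...


_REGISTRY: dict = {}


def register(name: str) -> Callable:
    """Class decorator that records a provider class under a config name."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap


@register("telegram")
class TelegramStub:
    def send(self, message: str) -> bool:
        print(f"[telegram] {message}")  # stub: a real provider would call an API
        return True


@register("webhook")
class WebhookStub:
    def send(self, message: str) -> bool:
        print(f"[webhook] {message}")  # stub: a real provider would POST a payload
        return True


def load_provider(config: dict) -> NotificationProvider:
    """Instantiate whichever provider the config names."""
    name = config["notification"]["provider"]
    if name not in _REGISTRY:
        raise ValueError(f"unknown notification provider {name!r}")
    return _REGISTRY[name]()
```

With this shape, callers depend only on `send`, so switching from Telegram to a webhook is a one-line config change.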
Project Structure
control-plane/ # TypeScript V2 control plane
src/
main.ts # Daemon entry point — pi-agent session + heartbeat
config/
schemas.ts # Zod schemas for all config sections
loader.ts # YAML + env var config loader
tools/ # 16 custom pi-agent tools
ticket-list.ts # Query tickets by status/severity/repo
ticket-create.ts # Create tickets with fingerprint dedup
ticket-update.ts # Update status/notes/assignee
github-issues.ts # List GitHub issues
github-import.ts # Import issues as tickets
delegate-investigation.ts # Spawn investigation via engine
delegate-development.ts # Spawn development + PR creation
delegate-review.ts # Spawn PR review
run-tests.ts # Execute test suite
approve-pr.ts # Approve PR via GitHub API
merge-pr.ts # Merge approved PRs
manage-workspace.ts # Git worktree provisioning
check-stability.ts # Safety gate evaluation
check-health.ts # System health snapshot
check-metrics.ts # Pipeline metrics
send-notification.ts # Notification dispatch
providers/ # Provider implementations
supabase/ # Supabase client + ticket store
notification/ # Telegram, Slack, webhook
engine/ # Coding engine registry
memory/ # Memory service providers
safety/ # Circuit breaker, outcome tracker
services/ # Memory service, workspace manager
shared/ # Engine resolver, prompt builder, context
extensions/ # Tool guard, RBAC, cost tracking
tests/ # 900+ vitest tests (unit + integration)
src/swe_team/ # Python agent library
monitor_agent.py # Log scanning, error detection
triage_agent.py # Severity routing
investigator.py # Root-cause analysis via Claude CLI
developer.py # Keep/discard fix loop
ralph_wiggum.py # Stability gate
supabase_store.py # Supabase ticket store
embeddings.py # bge-m3 embeddings + fact extraction
guardrails.py # Safety gate coordinator
cost_tracker.py # Budget enforcement
atomic_checkout.py # Cross-VM task dedup
... # 30+ modules total
src/a2a/ # A2A inter-agent protocol
server.py, client.py, dispatch.py
ui/ # React + Vite management dashboard
scripts/ops/ # Operational scripts
swe_team_runner.py # Legacy Python runner (cron/daemon)
swe_cli.py # CLI tool (status, tickets, reports)
propagate.sh # Code propagation to worker nodes
config/
swe_team.yaml # Runtime configuration
swe_team/programs/ # Prompt templates (investigate.md, fix.md)
.pi/
skills/swe-manager/SKILL.md # LLM persona definition
extensions/ # pi-agent extension stubs
tests/ # 5900+ pytest tests
Multi-Team Deployment
SWE Squad supports multiple teams sharing infrastructure:
| Team | VM | Role | Engine |
|---|---|---|---|
| alpha | `primary` | Senior: QA, merge authority, critical fixes | Claude CLI (direct) |
| beta | `worker-1` | Development: bulk features, bug fixes | Claude CLI (proxy) |
| gamma | `worker-2` | Economy: investigation, triage | Claude CLI (proxy) |
Each team has its own team_id that scopes all of its tickets, a dedicated GitHub bot account, and an isolated VM.
Safety
- Circuit Breaker — trips at 80% failure rate, pauses daemon for 30 minutes
- Stability Gate — blocks new work when critical tickets are open or tests are failing
- Outcome Tracker — max 3 investigation/development attempts per ticket before HITL escalation
- Budget Enforcement — per-agent cost tracking with configurable hard-stops
- RBAC — role-based access control on tool invocations (bypass mode by default)
- Bot Containment — each bot account is confined to its designated VM
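The circuit breaker rule above (trip at an 80% failure rate, then pause for 30 minutes) can be sketched as a small class. The rolling window of 10 results, the class shape, and the injectable clock are assumptions; the source states only the threshold and the pause length.

```python
import time
from collections import deque


class CircuitBreaker:
    """Trip when the recent failure rate reaches a threshold, then pause."""

    def __init__(self, threshold=0.8, pause_seconds=30 * 60, window=10, clock=time.monotonic):
        self.threshold = threshold
        self.pause_seconds = pause_seconds
        self.results = deque(maxlen=window)  # window size is an assumption
        self.clock = clock
        self.open_until = 0.0

    def record(self, success: bool) -> None:
        """Record one outcome and trip the breaker if the window is too unhealthy."""
        self.results.append(success)
        full = len(self.results) == self.results.maxlen
        failure_rate = self.results.count(False) / len(self.results)
        if full and failure_rate >= self.threshold:
            self.open_until = self.clock() + self.pause_seconds
            self.results.clear()  # start with a fresh window after the pause

    def allows_work(self) -> bool:
        """True unless the breaker is currently in its pause period."""
        return self.clock() >= self.open_until
```

Requiring a full window before tripping avoids pausing the daemon on the strength of one or two early failures.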
WebUI
The React management dashboard provides:
- Dashboard — real-time ticket metrics, PR pipeline, severity donut, cost trends
- Tickets — Kanban board with drag-and-drop, search/filter, detail views
- Teams — live status indicators, VM connectivity checks, start/stop controls
- Engines — coding engine management with health checks and BYOK support
- Pipeline Editor — visual workflow editor built on React Flow
- Settings — governance thresholds, cycle config, memory settings
cd ui && npm install && npm run dev
# Opens at http://localhost:5173, proxies API to :8888
Requirements
- Node.js 20+ + pnpm — TypeScript control plane
- Python 3.10+ — agent library and tests
- Claude Code CLI — coding engine
- GitHub CLI (gh) — authenticated for issue + PR management
- Supabase — ticket store + semantic memory (pgvector)
- Telegram bot (optional) — notifications
- SSH access to worker VMs (optional) — remote log collection
Roadmap
- Persistent pi-agent daemon with 16 custom tools
- Engine-agnostic delegation (Claude CLI, Gemini CLI, Copilot, OpenCode)
- Semantic memory with pgvector embeddings + confidence tracking
- Full ticket pipeline: import, investigate, develop, review, merge
- Safety gates: circuit breaker, stability gate, outcome tracker
- React WebUI with Kanban, pipeline editor, team management
- Multi-team deployment (alpha/beta/gamma squads)
- Provider-agnostic plugin architecture
- Interactive Telegram bot — bidirectional chatbot for remote control (#1034)
- Multi-VM deployment automation
- npm package: @swe-squad/control-plane
- Public repo sync and launch
- Slack/Discord notification plugins
- Metrics and observability (Prometheus/Grafana)
- Automated benchmarking suite
Contributing
We welcome contributions! Areas where help is most valuable:
- Additional coding engine adapters
- Notification channel plugins (Slack, Discord)
- Interactive Telegram bot (#1034)
- New ticket store backends (Redis, SQLite)
- Agent prompt optimization and benchmarking
- Documentation and tutorials
License
MIT — use it, fork it, build on it.