orcasynth

mcp
Security Audit
Warn
Health Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Pass
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested


SUMMARY

Self-hosted daemon that orchestrates autonomous AI coding agents (Claude Code, OpenCode, Codex) — autopilot missions, guardrails, REST API, CLI, and a real-time web UI.

README.md

Orcasynth

Control autonomous coding agents — without losing control.

Plan the work, launch isolated coding agents, watch every session live, and step in
before a risky change ever reaches your codebase.

Plan · Dispatch · Observe · Intervene

Orcasynth is a self-hosted daemon that orchestrates autonomous coding agents
(Claude Code, OpenCode, Codex, Kilo Code, Pi, oh-my-pi) in isolated tmux sessions — with
a REST API, a CLI, and a real-time Next.js web UI. No SaaS, no lock-in: your machine, your
agents, your code.


Why Orcasynth

Coding agents are powerful but messy to run at scale: one terminal per agent, no shared
view of what's happening, and no safety net when an agent decides to rm -rf something.

Orcasynth puts a control plane in front of them. Hand it a goal and it plans the work,
spawns the right agent for each step in its own tmux session, streams every keystroke to
your browser, and gates dangerous actions behind a human when you want it to. When you
trust it more, you turn the autonomy up; when you trust it less, you turn it down.

What it does

  • Autopilot planning. Give the Pilot a goal and an LLM decomposes it into a dependency
    graph (DAG) of phases and can name an agent for each one. A phase starts only once
    every phase it depends on is done; independent phases run in parallel, up to your
    session limit and each in its own isolated worktree, instead of a forced linear chain.
  • Per-model descriptions & per-phase model selection. Write a capability description
    for each model in Settings, flip on "Autopilot picks the model," and the planner chooses
    the best-suited model for each phase from those descriptions — validated against your
    allow-list, falling back to the default on anything invalid.
  • PR-native autopilot. Instead of editing your checkout mid-flight, a mission can run
    like a disciplined engineer on a branch: it works in an isolated git worktree (under
    <repo-parent>/.orca-worktrees/), commits each approved phase as it lands, then runs your
    verify command, pushes the branch, and opens a GitHub pull request. Orca ingests the PR's
    review feedback (CHANGES_REQUESTED and COMMENTED reviews, so bot reviewers and plain
    human comments both count, plus inline diff comments), folds it back through the Pilot as
    one or more fix phases, and pushes the fixes to the same PR. A fix-round budget (2 automatic
    rounds) stops review ping-pong loops: once it's spent, the mission stalls and escalates to
    a human instead of looping forever. The UI exposes a PR badge, the fix-round count, and
    merge-to-main / continue-mission actions; merging is gated on the PR being open, mergeable,
    and CI-green. Auth uses a configured GitHub token or falls back to the machine's gh CLI
    login. The workflow is configurable globally in Settings → GitHub and overridable per
    project (Inherit / On / Off), so each project runs its own workflow.
  • Agent-agnostic spawning. Runs Claude Code, OpenCode, Codex, Kilo Code, Pi, or
    oh-my-pi in isolated tmux sessions, configurable per task — as workers and as the
    autopilot's Pilot/Overseer. Each provider is a first-class executor with its own brand
    icon and launch flags in Settings → Providers, and each agent receives the task
    context and closes its own task when it's done.
  • Autonomy levels (L0–L3). Choose how much rope each mission gets — from
    L0 · Recommend (plan only, nothing runs until you approve) through L1 · Assist
    and L2 · Pilot to L3 · Auto (full autonomy). The overseer's decision engine
    auto-clears agent permission prompts when confidence is high and the action is safe, and
    escalates anything destructive or uncertain to a human. Operations like rm -rf, dropping
    tables, force-pushes, or touching .env always escalate, whatever the level.
  • Live web UI with one-click intervention. Tasks, a kanban board with a calendar (and
    date-range filtering), missions with phase progress, a timeline with an activity feed and a
    "changes over time" view that turns recent git history into an interactive commit stream
    plus a most-active-files roll-up, an escalations queue for review-gate decisions, and
    real-time tmux session previews you can jump into and take over. The Pilot's planning run
    streams live in the task modal too, expandable into the full session terminal. Each preview
    is a real PTY streamed over a WebSocket (xterm), so you type straight into the agent
    (native cursor, smooth scrolling, full key support) rather than watching a read-only
    mirror. A dedicated Stats page shows a per-model token/cost breakdown. Creating a
    project is point-and-click too: a Browse button opens a server-side folder picker to
    choose the path instead of typing it, and you pick a project icon (from an image already
    in the repo) right after creating it. Full EN/CS internationalization built in, and the
    whole dashboard is responsive down to a phone.
  • Phone push notifications. Launch a swarm and walk away from the keyboard — Orca pings
    your phone only when a mission actually needs you. A review escalation, an agent waiting on
    input, or a stalled run arrives as a Web Push notification with inline action buttons
    (Approve / Re-run / Allow / Reject / Open) that act through the service worker without
    opening the app; mission-done / PR-opened arrives as a tap-to-open FYI. Opt in per device
    from your Account, and notifications route to the mission's owner plus admins. A VAPID
    keypair is generated on first boot and the private key never leaves the daemon.
  • Self-healing. A stuck-session detector revives agents that die without closing out
    (and blocks the task after repeated failures instead of crash-looping). A janitor sweeps
    up finished sessions. Live token and cost usage is shown per run.
  • Multi-user RBAC with self-service. Admin and member roles, per-project assignments,
    per-user model allow-lists, profiles and avatars, and a first-run onboarding that needs
    no login until the first admin is created. Users can change their password, upload an
    avatar, and manage push-notification devices from their own account page.
  • Per-user Assistant. Each user gets a persistent assistant agent (orca-advisor-<userId>)
    that drives Orca on their behalf through a built-in MCP server. The server exposes seven
    tools — orca_tasks, orca_create_task, orca_plan, orca_sessions, orca_note_add,
    orca_notes, and the generic orca_request passthrough that reaches any REST endpoint —
    so a brand-new endpoint is callable with zero new tooling. Auto-starts on login, remembers
    its model, and runs in a docked IDE-style side panel with a real-PTY terminal. Pop any
    session terminal out into its own chromeless window for focus.
  • Self-hosted & lightweight. A single SQLite-backed daemon (Hono + SSE) plus a Next.js
    front end. No external services required beyond your own LLM provider.
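
Two of the rules in the list above are mechanical enough to sketch: a phase starts only when its dependencies are done (in parallel, up to the session limit), and a planner-chosen model is kept only if it is on the allow-list. The snippet below is a minimal illustration under assumptions, not Orcasynth's code: the `Phase` shape, the function names, and the model names are all hypothetical.

```typescript
// Illustrative types only; Orcasynth's real plan schema may differ.
interface Phase {
  id: string;
  dependsOn: string[]; // ids of phases that must finish first
  model?: string;      // model the planner suggested, if any
}

// Keep the planner's model only if it is on the allow-list; otherwise use the default.
function resolveModel(phase: Phase, allowList: string[], defaultModel: string): string {
  return phase.model !== undefined && allowList.includes(phase.model)
    ? phase.model
    : defaultModel;
}

// Phases that may start now: not done, not running, all dependencies done,
// capped by the number of free session slots.
function readyPhases(
  phases: Phase[],
  done: Set<string>,
  running: Set<string>,
  sessionLimit: number,
): Phase[] {
  const freeSlots = Math.max(0, sessionLimit - running.size);
  return phases
    .filter((p) => !done.has(p.id) && !running.has(p.id))
    .filter((p) => p.dependsOn.every((d) => done.has(d)))
    .slice(0, freeSlots);
}

// A made-up plan with two independent roots.
const plan: Phase[] = [
  { id: "schema", dependsOn: [] },
  { id: "api", dependsOn: ["schema"], model: "fast-model" },
  { id: "docs", dependsOn: [] },
  { id: "ui", dependsOn: ["api"], model: "unknown-model" },
];

// Nothing has finished yet, so only the two independent phases are ready.
console.log(readyPhases(plan, new Set(), new Set(), 3).map((p) => p.id)); // schema, docs
```

Calling `readyPhases` again each time a phase finishes yields the next wave, which is one simple way to get parallelism without a forced linear chain.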

Screenshots

  • Dashboard: live agents, active missions, the autopilot spotlight, and recent outcomes at a glance.
  • Assistant panel: a resizable IDE-style side column that docks left or right. Watch your always-on AI assistant and a running agent next to the main view, with a model picker that shows per-provider brand icons.
  • Tasks: list + detail with live agent output and token usage.
  • Kanban: open / in-progress / blocked / closed, with mission progress and a calendar.
  • Missions: phase graph and task flow for an autopilot run (folded into Tasks).
  • Timeline: a live activity feed across tasks, missions, and signals.
  • Sessions: real-time tmux agent previews with one-click intervention.
  • Terminal: an interactive real-PTY agent terminal you type straight into, including human-in-the-loop approvals.
  • Pop-out terminal: pull any session into its own standalone, chromeless window for focus.
  • Settings: per-model descriptions with brand icons, providers, autopilot, a dedicated GitHub section for the PR-native workflow, and defaults.
  • Projects: a built-in Monaco editor with the project file tree.
  • Onboarding: a first-run setup flow that needs no login until the first admin is created.

Install

Install globally from npm — one command brings up the daemon and the web UI:

npm install -g orcasynth
orca            # interactive menu: start/stop · first-run setup · update · open web

Prefer it non-interactive? The same actions are plain subcommands:

orca up         # start the daemon (:4400) + web UI (:4500) in the background
orca status     # show what's running
orca down       # stop everything
orca update     # update to the latest release from npm
orca install    # guided provisioning wizard (domain/TLS, ports, first admin)

Requires Node ≥ 22 and tmux. On first run, orca walks you through a quick
setup — admin account, LLM provider + API key, and a default model. Your data (config,
the SQLite database, and logs) lives in ~/.config/orca/ and survives every update.

Interactive terminals use node-pty, an optional native dependency. If its native addon
can't build on your host, everything else runs unchanged: the live session previews just
fall back to a read-only mirror instead of a type-into terminal.
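
The optional-dependency fallback described above is a common Node pattern. A rough sketch follows, assuming a dynamic import wrapped in a try/catch; `loadOptional` and `pickTerminalMode` are hypothetical names, and this is not necessarily how Orcasynth detects node-pty.

```typescript
type TerminalMode = "interactive" | "read-only";

// Hypothetical helper: try to load an optional module and return null
// instead of throwing when it is missing or its native addon fails to load.
async function loadOptional(name: string): Promise<object | null> {
  try {
    return await import(name);
  } catch {
    return null;
  }
}

// Choose the terminal mode from whether the PTY module could be loaded.
async function pickTerminalMode(moduleName: string = "node-pty"): Promise<TerminalMode> {
  const pty = await loadOptional(moduleName);
  return pty === null ? "read-only" : "interactive";
}

pickTerminalMode().then((mode) => console.log(`terminal mode: ${mode}`));
```

The result depends on the host: it is `interactive` only where node-pty is installed and its addon loads.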

Then open http://localhost:4500 and sign in.

Run from source

For development, or to run without a global install. Requires Node ≥ 22 and tmux.

# 1. Daemon (REST API on :4400)
npm install
npm run build
ORCA_BOOTSTRAP_USER=admin ORCA_BOOTSTRAP_PASS=changeme node dist/daemon/index.js

# 2. Web UI (on :4500)
cd web
npm install
npm run build
npm start -- -p 4500

Open http://localhost:4500 and sign in. Configure your LLM provider and models in
Settings → Autopilot / Models, then create a task or engage an autopilot mission.

The CLI talks to the daemon over the REST API and auto-starts it if it isn't running:

node dist/cli/index.js ls          # list tasks
node dist/cli/index.js close <id>  # close a task

How it works

        goal
         │
         ▼
   ┌───────────┐   phases + deps    ┌─────────────┐   spawn    ┌──────────────┐
   │   Pilot   │ ─────────────────► │   Overseer  │ ─────────► │  Agent (tmux) │
   │ (planner) │                    │ (scheduler, │            │ Claude Code / │
   └───────────┘                    │  decisions) │ ◄───────── │ OpenCode /    │
                                    └─────────────┘   signals  │ Codex / Kilo /│
                                          │                    │ Pi / oh-my-pi │
                                          │                    └──────────────┘
                                          │ escalate
                                          ▼
                                    human-in-the-loop

The Pilot decomposes a goal into a dependency-ordered set of phases. The Overseer
schedules ready phases, spawns the right Agent for each one in its own tmux session,
and watches the output. A deriver reads each session and emits signals — working,
needs_input, complete. When an agent hits a permission prompt, the decision engine
either clears it automatically (high confidence, non-destructive, within the mission's
autonomy level) or escalates it to a human.
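
The decision step in the paragraph above can be pictured as a small pure function. The sketch below is illustrative only: the regexes, the 0.8 confidence threshold, and the `decide` signature are assumptions. The README does not say how L1 and L2 differ from L3, so the sketch only separates L0 (nothing runs without approval) from the higher levels.

```typescript
type Autonomy = 0 | 1 | 2 | 3; // L0 Recommend .. L3 Auto
type Decision = "auto-clear" | "escalate";

// Patterns for the operations the README says always escalate.
// The regexes themselves are illustrative guesses, not Orcasynth's rules.
const ALWAYS_ESCALATE: RegExp[] = [
  /\brm\s+-rf\b/,                   // recursive force delete
  /\bdrop\s+table\b/i,              // dropping tables
  /\bgit\s+push\b.*(--force|-f)\b/, // force-pushes
  /\.env\b/,                        // touching .env
];

function decide(
  command: string,
  confidence: number, // 0..1, how sure the engine is that the action is safe
  level: Autonomy,
  threshold: number = 0.8, // assumed cut-off for "high confidence"
): Decision {
  // Destructive operations escalate whatever the autonomy level.
  if (ALWAYS_ESCALATE.some((re) => re.test(command))) return "escalate";
  // L0: plan only, nothing runs until a human approves.
  if (level === 0) return "escalate";
  return confidence >= threshold ? "auto-clear" : "escalate";
}

console.log(decide("npm test", 0.95, 2));     // auto-clear
console.log(decide("rm -rf build", 0.99, 3)); // escalate
```

Checking the destructive list first is what makes "always escalate, whatever the level" hold even at full autonomy.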

With the PR-native workflow enabled, that loop runs against an isolated git worktree
rather than your live checkout: the Overseer commits each approved phase, and on completion
verifies, pushes, and opens a pull request. PR review feedback flows back to the Pilot as
fix phases that land on the same branch — bounded by a fix-round budget so the mission
escalates to a human instead of trading comments with a reviewer indefinitely.
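
The fix-round budget amounts to a bounded loop. A minimal sketch, assuming review feedback has already been fetched: the `Review` shape, `nextStep`, and the step names are hypothetical, while the two-round limit and the review states come from the feature list.

```typescript
type ReviewState = "APPROVED" | "CHANGES_REQUESTED" | "COMMENTED";

interface Review {
  state: ReviewState;
  body: string;
}

type NextStep =
  | { kind: "no-action" }
  | { kind: "fix"; round: number; feedback: string[] }
  | { kind: "escalate"; reason: string };

const MAX_FIX_ROUNDS = 2; // the README's "2 automatic rounds"

function nextStep(reviews: Review[], inlineComments: string[], roundsUsed: number): NextStep {
  // CHANGES_REQUESTED and COMMENTED reviews plus inline diff comments count as feedback.
  const feedback = [
    ...reviews
      .filter((r) => r.state === "CHANGES_REQUESTED" || r.state === "COMMENTED")
      .map((r) => r.body),
    ...inlineComments,
  ].filter((text) => text.trim().length > 0);

  if (feedback.length === 0) return { kind: "no-action" };
  if (roundsUsed >= MAX_FIX_ROUNDS) {
    // Budget spent: stop trading comments and hand over to a human.
    return { kind: "escalate", reason: "fix-round budget spent" };
  }
  return { kind: "fix", round: roundsUsed + 1, feedback };
}

console.log(nextStep([{ state: "COMMENTED", body: "rename x" }], [], 2).kind); // escalate
```

Because the round counter only ever increases, the mission is guaranteed to reach either a clean review or an escalation.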

Architecture

A daemon (src/) owns the database and the orchestration loop; the web app (web/)
is a thin client over the REST API + SSE event stream.

Layer                 What lives there
src/store             SQLite stores (tasks, missions, agents, config, users, projects, events) via better-sqlite3
src/overseer          mission engine, planner, scheduler, decision engine, stuck-detector, janitor
src/spawn · src/tmux  agent command building + tmux driver
src/advisor           per-user assistant lifecycle (start/stop/autostart) + MCP config injection
src/mcp               built-in MCP server exposing Orca's toolset to the assistant agent
src/terminal          real-PTY WebSocket streaming (node-pty + tmux attach)
src/deriver           derives signals from agent output (working / needs_input / complete)
src/integrations      per-executor token/cost usage extraction, Hermes MCP registration, CLI detection
src/api               Hono REST server + SSE event bus
src/cli · src/daemon  the orca CLI (incl. orca api passthrough) and the daemon entrypoint
web/modules           feature modules (tasks, kanban, sessions, timeline, projects, advisor, settings, …)
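
Since the web app consumes an SSE event stream, it may help to see what a client does with the raw text. The parser below follows the standard `text/event-stream` framing (blank-line separated events, `event:` and `data:` fields, `:` comment lines). It is a sketch, not Orcasynth's client: the stream URL is not given in this README, and the `task.updated` event name and payload are made up for the example.

```typescript
interface SseEvent {
  event: string; // defaults to "message" per the SSE format
  data: string;
}

// Split a buffer of SSE text into complete events plus the unfinished remainder.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? ""; // last block has no terminating blank line yet
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Events without any data line (e.g. keep-alive comments) are skipped.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}

const chunk = 'event: task.updated\ndata: {"id":1}\n\n: keep-alive\n\ndata: part';
const parsed = parseSse(chunk);
console.log(parsed.events.length, JSON.stringify(parsed.rest));
```

In a browser the built-in `EventSource` does this framing for you; a hand-rolled parser like this is mainly useful in scripts or tests, where `rest` is prepended to the next chunk read from the stream.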

See docs/ for the documentation hub, API,
architecture, concepts, CLI,
development, deployment, and web UI guides.

Development

npm test            # daemon tests (vitest)
npm run build       # typecheck + build (also copies schema.sql + prompts/)
npm run build:web   # standalone web UI bundle
npm run serve       # daemon dev mode (direct TS via --experimental-strip-types)
npm run lint        # ESLint (unused imports, hook deps)
npm run depcruise   # dependency-cruiser architecture checks (no cycles, layer boundaries)
cd web && npm test  # web tests
cd web && npm run dev  # web dev server (turbopack)

See docs/DEVELOPMENT.md, docs/TESTING.md,
and docs/WEB.md.

Contributing

Contributors are welcome — whether it's a bug fix, a new feature, or just an idea.

Star the repo if you find it useful — it helps others discover the project.

License

MIT
