claude-octopus

Multi-LLM orchestration plugin for Claude Code — 8 providers (Codex, Gemini, Claude, Perplexity, OpenRouter, Copilot, Qwen, Ollama), 47 commands, 50 skills, Double Diamond workflows

🐙 Claude Octopus

Every AI model has blind spots. Claude Octopus puts up to eight of them on every task, so blind spots surface before you ship — not after. It orchestrates Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, and OpenRouter alongside Claude Code, with consensus gates that flag any disagreements.

Built with Claude | Tests: 146 passing | Version 9.16.0 | Requires Claude Code v2.1.83+ | MIT License

🐙 Research, build, review, and ship — with eight AI providers checking each other's work. Say what you need, and the right workflow runs. A 75% consensus gate catches disagreements before they reach production. No single model's blind spots slip through.

🧠 Remembers across sessions. Integrates with claude-mem for persistent memory — past decisions, research, and context survive session boundaries.

Spec in, software out. Dark Factory mode takes a spec and autonomously runs the full pipeline — research, define, develop, deliver. You review the output, not every step.

🔄 Four-phase methodology, not just tools. Every task moves through Discover → Define → Develop → Deliver, with quality gates between phases. Other orchestrators give you infrastructure. Octopus gives you the workflows.

🐙 32 specialized personas (role-specific AI agents like security-auditor, backend-architect), 47 commands (slash commands you type), 50 skills (reusable workflow modules). Say "audit my API" and the right expert activates. Don't know the command? The smart router figures it out.

🐙 Works with just Claude. Scales to eight. Zero providers needed to start. Add them one at a time — each activates automatically when detected.

💰 Five providers cost nothing extra. Codex and Gemini use OAuth (included with subscriptions). Qwen has 1,000-2,000 free requests/day. Copilot uses your GitHub subscription. Ollama runs locally for free.


What's New

| Version | Best features |
| --- | --- |
| v9 (current) | Up to 8 providers (Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, OpenRouter, OpenCode). Four-way AI debates. Smart router: just say what you need. Circuit breakers with automatic provider recovery. Loop self-regulation stops runaway agents. HUD statusline with tool tracking. Codex CLI cross-compatibility. Cache-aligned prompts for 90% cost savings on repeated calls. |
| v8 | Multi-LLM code review with inline PR comments. Parallel workstreams in isolated git worktrees. Reaction engine that auto-responds to CI failures. 32 specialized personas. Dark Factory autonomous pipeline. |
| v7 | Double Diamond workflow. Multi-provider dispatch. Quality gates and consensus scoring. Configurable sandbox modes. |

Full changelog →

Quickstart

# Terminal (not inside a Claude Code session):
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins

# Then inside Claude Code:
/octo:setup

That's it. Setup detects installed providers, shows what's missing, and walks you through configuration. You need zero external providers to start — Claude is built in.
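
To illustrate the idea of auto-detection, here is a minimal sketch that assumes a provider counts as "installed" when its command-line binary is on PATH. The binary names in the mapping and the `detect_providers` helper are illustrative assumptions, not the plugin's actual detection logic, which may also check OAuth state or API keys.

```python
import shutil
from typing import Callable, Optional

# Hypothetical mapping of provider name -> CLI binary to look for.
# The real /octo:setup command may use different names or extra checks.
PROVIDER_BINARIES = {
    "codex": "codex",
    "gemini": "gemini",
    "ollama": "ollama",
}

def detect_providers(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict[str, bool]:
    """Return {provider: found} based on whether each binary resolves on PATH."""
    return {
        name: which(binary) is not None
        for name, binary in PROVIDER_BINARIES.items()
    }
```

Calling `detect_providers()` with no arguments checks the real PATH; passing a custom `which` function makes the check easy to test.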

Alternative install methods

From the Claude Code UI: Type /plugin in a session → Marketplace tab → install octo.

Factory AI (Droid):

droid plugin marketplace add https://github.com/nyldn/claude-octopus
droid plugin install octo@claude-octopus

Update / Troubleshooting

# Update
claude plugin update octo

# Clean reinstall (if update fails)
claude plugin uninstall claude-octopus 2>/dev/null
claude plugin uninstall octo 2>/dev/null
rm -rf ~/.claude/plugins/cache/nyldn-plugins/claude-octopus
claude plugin marketplace add https://github.com/nyldn/claude-octopus.git
claude plugin install octo@nyldn-plugins

8 Commands That Matter Most

🐙 Eight commands — one per arm. A real octopus has eight arms, each with its own neurons that can act independently. These eight tentacles work the same way: each orchestrates up to three AI providers, applies quality gates, and produces a deliverable.

/octo:embrace build stripe integration     # Full lifecycle: research → define → develop → deliver
/octo:factory "build a CLI that converts CSV to JSON"  # Autonomous pipeline — spec in, software out
/octo:debate monorepo vs microservices     # Structured four-way AI debate with consensus
/octo:research htmx vs react in 2026       # Multi-source synthesis from three AI providers
/octo:design mobile checkout redesign       # UI/UX design with BM25 style intelligence
/octo:tdd create user auth                 # Red-green-refactor with test discipline
/octo:security                              # OWASP vulnerability scan + remediation
/octo:prd mobile checkout redesign          # AI-optimized PRD with 100-point scoring

Plus 39 more (47 commands in total): review, debug, extract, deck, docs, schedule, parallel, sentinel, brainstorm, claw, doctor, and the full set.

Don't remember the command name? Just describe what you need:

/octo:auto research microservices patterns    -> routes to discover phase
/octo:auto build user authentication          -> routes to develop phase
/octo:auto compare Redis vs DynamoDB          -> routes to debate

The smart router parses your intent and selects the right workflow.
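
As a rough illustration of intent routing, the sketch below maps the leading keyword of a request to a workflow. The keyword table is derived only from the three examples above; the real router is presumably more sophisticated, and `ROUTES` and `route` are hypothetical names.

```python
from typing import Optional

# Hypothetical keyword table, taken only from the three examples above.
ROUTES = {
    "research": "discover",
    "build": "develop",
    "compare": "debate",
}

def route(request: str) -> Optional[str]:
    """Return the workflow for the first known keyword in the request, else None."""
    for word in request.lower().split():
        if word in ROUTES:
            return ROUTES[word]
    return None
```

A request with no known keyword returns `None`, which a caller could treat as "ask the user to clarify".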


Pick a Command by Goal

Not sure which command to use? Pick by goal:

| I want to... | Use |
| --- | --- |
| Research a topic thoroughly | /octo:research or /octo:discover |
| Debate two approaches | /octo:debate |
| Build a feature end-to-end | /octo:embrace |
| Design a UI or style system | /octo:design |
| Review existing code | /octo:review |
| Write tests first, then code | /octo:tdd |
| Scan for vulnerabilities | /octo:security |
| Write a product spec | /octo:prd |
| Go from spec to shipping code | /octo:factory |
| Debug a tricky issue | /octo:debug |
| Just run something quick | /octo:quick |

Or skip the table — type /octo:auto <what you want> or just say octo <what you want>, and the smart router picks for you. 🔍

How does this compare to Superpowers or plain Claude Code?

| | Claude Code alone | Superpowers | Claude Octopus |
| --- | --- | --- | --- |
| Core idea | One model, your prompts | Structured methodology for one agent | Up to 8 providers cross-checking each other |
| Providers | Claude only | Claude only | Codex, Gemini, Copilot, Qwen, Ollama, Perplexity, OpenRouter, OpenCode |
| Workflow | Ad-hoc | Spec → plan → subagent-driven dev | Discover → Define → Develop → Deliver (Double Diamond) |
| Strength | Simple, no setup | Long autonomous runs with discipline | Multiple perspectives catching blind spots |
| Consensus gates | No | No | Yes (75% agreement threshold) |
| Best for | Quick tasks, simple features | Large builds with clear specs | Research, review, debates, multi-provider validation |
| Setup | Nothing | Install plugin | Install plugin, optionally add providers |

tl;dr: Superpowers makes one agent work really well for hours. Octopus makes multiple agents check each other's work. They solve different problems.


How It Works

How 8 Providers Work Together

Claude Octopus coordinates up to eight AI providers — one per tentacle:

| Provider | Role |
| --- | --- |
| 🔴 Codex (OpenAI) | Implementation depth: code patterns, technical analysis, architecture |
| 🟡 Gemini (Google) | Ecosystem breadth: alternatives, security review, research synthesis |
| 🟣 Perplexity | Live web search: CVE lookups, dependency research, current docs |
| 🌐 OpenRouter | Alternative model routing: access 100+ models via a single API |
| 🟢 Copilot (GitHub) | Zero-cost research: uses existing GitHub Copilot subscription |
| 🟤 Qwen (Alibaba) | Free-tier research: 1,000-2,000 requests/day via Qwen OAuth |
| ⚫ Ollama (Local) | Zero-cost local LLM: offline, privacy-sensitive, fallback |
| 🔵 Claude (Anthropic) | Orchestration: quality gates, consensus building, final synthesis |

Providers run in parallel for research, sequentially for problem scoping, and adversarially for review. A 75% consensus quality gate prevents questionable work from shipping. Only Claude is required — all others are optional and auto-detected.
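
The 75% gate can be pictured as a simple share-of-approvals check. The sketch below assumes each participating provider casts a single approve/reject vote and that the threshold is inclusive; how the plugin actually scores agreement is not specified here, and `passes_consensus` is a hypothetical name.

```python
def passes_consensus(votes: list[bool], threshold: float = 0.75) -> bool:
    """True when the share of approving providers meets the threshold.

    Assumption: one boolean vote per provider, threshold is inclusive.
    An empty vote list never passes.
    """
    if not votes:
        return False
    return sum(votes) / len(votes) >= threshold
```

Under these assumptions, three approvals out of four pass the gate, while two out of three do not.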

Four Phases: Discover, Define, Develop, Deliver

Four structured phases adapted from the UK Design Council's methodology:

| Phase | Command | What happens |
| --- | --- | --- |
| Discover | /octo:discover | Multi-AI research and broad exploration |
| Define | /octo:define | Requirements clarification with consensus |
| Develop | /octo:develop | Implementation with quality gates |
| Deliver | /octo:deliver | Adversarial review and go/no-go scoring |

Run phases individually or all four with /octo:embrace. Configure autonomy: supervised (approve each phase), semi-autonomous (intervene on failures), or autonomous (run all four).
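
One way to picture the three autonomy modes is as a loop over the four phases with different points of human intervention. This is a sketch under assumptions: `run_phase` and `ask_human` are hypothetical callbacks, and the behavior on failure (a single retry in semi-autonomous mode, stopping otherwise) is a guess rather than the plugin's documented behavior.

```python
from typing import Callable, List

PHASES = ["discover", "define", "develop", "deliver"]
MODES = {"supervised", "semi-autonomous", "autonomous"}

def run_pipeline(
    mode: str,
    run_phase: Callable[[str], bool],   # hypothetical: returns True on success
    ask_human: Callable[[str], bool],   # hypothetical: returns True to proceed
) -> List[str]:
    """Run the phases in order and return the ones that completed."""
    if mode not in MODES:
        raise ValueError(f"unknown autonomy mode: {mode}")
    completed = []
    for phase in PHASES:
        # supervised: a human approves every phase before it starts
        if mode == "supervised" and not ask_human(f"Run {phase}?"):
            break
        ok = run_phase(phase)
        # semi-autonomous: a human is consulted only when a phase fails
        if not ok and mode == "semi-autonomous" and ask_human(f"{phase} failed. Retry?"):
            ok = run_phase(phase)
        if not ok:
            break
        completed.append(phase)
    return completed
```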

32 Specialist Personas

Specialized agents that activate automatically based on your request. When you say "audit my API for vulnerabilities," security-auditor activates. When you say "design a dashboard," ui-ux-designer takes over.

Categories: Software Engineering (11), Specialized Development (6), Documentation & Communication (5), Research & Strategy (3), Business & Compliance (3), Creative & Design (4).

Full persona reference | All 50 skills

Built-in Reaction Engine

When agents create PRs, the reaction engine monitors what happens next — CI failures, review comments, stale agents — and responds automatically. No new commands to learn. It fires transparently inside workflows you already use:

| Integration point | When it fires |
| --- | --- |
| /octo:parallel | Between poll cycles while monitoring work packages |
| /octo:sentinel | After triage scan completes |
| agent-registry.sh health --react | On-demand health check |

What it auto-handles:

| Event | Reaction | Limits |
| --- | --- | --- |
| CI failure | Collects failure logs into agent inbox | 3 retries, escalates after 30m |
| Changes requested | Collects review comments into agent inbox | 2 retries, escalates after 60m |
| Agent stuck | Escalates to human | After 15m with no progress |
| PR approved + CI green | Notifies you it's ready to merge | |
| PR merged | Marks agent complete | |

Override defaults per project by creating .octo/reactions.conf:

# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
changes_requested|forward_comments|3|90|true
stuck|escalate|0|10|true

Reactions track 13 agent lifecycle states, including: running → pr_open → ci_pending → ci_failed / review_pending → changes_requested / approved → mergeable → merged → done.
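
The pipe-delimited override format shown above is simple to read programmatically. The sketch below parses it into typed records; the `Reaction` dataclass and `parse_reactions` function are hypothetical names for illustration, and the plugin itself may parse the file differently.

```python
from dataclasses import dataclass

@dataclass
class Reaction:
    event: str
    action: str
    max_retries: int
    escalate_after_min: int
    enabled: bool

# Sample content copied from the .octo/reactions.conf example above.
SAMPLE = """\
# EVENT|ACTION|MAX_RETRIES|ESCALATE_AFTER_MIN|ENABLED
ci_failed|forward_logs|5|45|true
changes_requested|forward_comments|3|90|true
stuck|escalate|0|10|true
"""

def parse_reactions(text: str) -> dict[str, Reaction]:
    """Parse pipe-delimited reaction rules, skipping blanks and # comments."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        event, action, retries, minutes, enabled = line.split("|")
        rules[event] = Reaction(
            event, action, int(retries), int(minutes), enabled.lower() == "true"
        )
    return rules
```

A malformed line (wrong number of fields or a non-numeric limit) raises `ValueError` here, which seems preferable to silently ignoring a bad override.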


Providers and What They Cost

Authentication

| Method | Codex | Gemini | Claude |
| --- | --- | --- | --- |
| OAuth (recommended) | codex login (included in ChatGPT subscription) | Google account (included in AI subscription) | Built into Claude Code |
| API key | OPENAI_API_KEY (per-token billing) | GEMINI_API_KEY (per-token billing) | Built into Claude Code |

OAuth users pay nothing beyond their existing subscriptions.

What You Get With Just Claude

Everything except multi-AI features. You get all 32 personas, structured workflows, smart routing, context detection, and every skill. Multi-AI orchestration (parallel analysis, debate, consensus) activates when external providers are configured.


Trust, Safety, and Limits

Namespace isolation — Only /octo:* commands and octo natural language prefix activate the plugin. Your existing Claude Code setup is untouched.

Data locations — Results in ~/.claude-octopus/results/, logs in ~/.claude-octopus/logs/, project state in .octo/. Nothing hidden.

Provider transparency — Every command shows a 🐙 activation indicator on launch. Colored dots (🔴 🟡 🟣 🔵) show exactly which providers are running and when external APIs are called. You always know what's happening.

Clean uninstall — Run claude plugin uninstall octo from your terminal. If you see a scope error, add --scope project. No residual config changes.


Works With OpenClaw

Claude Octopus ships with a compatibility layer for OpenClaw, the open-source AI assistant framework. This lets you expose Octopus workflows to messaging platforms (Telegram, Discord, Signal, WhatsApp) without modifying the Claude Code plugin.

Architecture

Claude Code Plugin (unchanged)
  └── .mcp.json ─── MCP Server ─── orchestrate.sh
                                        ↑
OpenClaw Extension ─────────────────────┘

Three components, zero changes to the core plugin:

| Component | Location | Purpose |
| --- | --- | --- |
| MCP Server | mcp-server/ | Exposes 10 Octopus tools via Model Context Protocol |
| OpenClaw Extension | openclaw/ | Wraps workflows for OpenClaw's extension API |
| Skill Schema | mcp-server/src/schema/skill-schema.json | Universal skill metadata format |

MCP Server

The MCP server auto-starts when the plugin is enabled (via .mcp.json). It exposes:

  • octopus_discover, octopus_define, octopus_develop, octopus_deliver — Individual phases
  • octopus_embrace — Full Double Diamond workflow
  • octopus_debate, octopus_review, octopus_security — Specialized workflows
  • octopus_list_skills, octopus_status — Introspection

Any MCP-compatible client can connect to the server.
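
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for one of the tools listed above. The `prompt` argument name is a hypothetical placeholder: the real argument names come from the server's tool schema, which is not shown here.

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# "prompt" is an assumed argument name; check the server's tool schema.
message = build_tool_call(1, "octopus_discover", {"prompt": "htmx vs react"})
```

A real client would first complete the MCP initialization handshake and would normally use an MCP client library rather than hand-built messages.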

OpenClaw Extension

Install in an OpenClaw instance from git:

npm install github:nyldn/claude-octopus#main --prefix openclaw

Or clone and link locally:

cd openclaw && npm install && npm run build

The extension registers as an OpenClaw plugin with configurable workflows, autonomy modes, and Claude Code path resolution.

Build & Validate

./scripts/build-openclaw.sh          # Regenerate skill registry from frontmatter
./scripts/build-openclaw.sh --check  # CI mode — exits non-zero if out of sync
./tests/validate-openclaw.sh         # 13-check validation suite

FAQ

Do I need all eight AI providers?
No. One external provider plus Claude gives you multi-AI features. No external providers still gives you personas, workflows, and skills.

Will this break my existing Claude Code setup?
No. Activates only with the octo prefix. Results stored separately. Uninstalls cleanly.

What happens if a provider times out?
The workflow continues with available providers. You'll see the status in the visual indicators.
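
The "continue with whoever answered" behavior can be sketched with a thread pool and a shared deadline. This is an illustration under assumptions, not the plugin's implementation: the provider callables below are stand-ins, and `query_all` is a hypothetical helper.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable

def query_all(providers: dict[str, Callable[[str], str]],
              prompt: str, timeout_s: float):
    """Ask every provider in parallel.

    Returns (answers, skipped): answers from providers that finished in
    time, and a sorted list of providers that timed out or raised.
    """
    executor = ThreadPoolExecutor(max_workers=max(1, len(providers)))
    futures = {executor.submit(fn, prompt): name for name, fn in providers.items()}
    done, not_done = wait(futures, timeout=timeout_s)
    # Do not block on stragglers; queued work is cancelled, running work is abandoned.
    executor.shutdown(wait=False, cancel_futures=True)
    answers = {}
    skipped = [futures[f] for f in not_done]
    for f in done:
        if f.exception() is None:
            answers[futures[f]] = f.result()
        else:
            skipped.append(futures[f])
    return answers, sorted(skipped)

# Stand-in providers for demonstration only.
def fast(prompt: str) -> str:
    return "ok:" + prompt

def slow(prompt: str) -> str:
    time.sleep(0.5)
    return "late"

def broken(prompt: str) -> str:
    raise RuntimeError("provider down")
```

With a 0.1 second deadline, only `fast` contributes an answer; `slow` and `broken` end up in the skipped list and the workflow can proceed without them.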

Why "octopus"?
🐙 Fun fact: a real octopus has three hearts, blue blood, and 500 million neurons — two-thirds of which live in its eight arms. Each arm can taste, touch, and act independently. Claude Octopus works the same way: each tentacle (command) operates autonomously with its own squeeze of logic, then ink flows back as the final deliverable. The crossfire review? That's the squeeze — adversarial pressure that untangles everything before it ships.


Community

Join r/ClaudeOctopus for help, workflow tips, showcases, and updates.

Contributing

  1. Report issues
  2. Submit PRs following existing code style
  3. Clone the repo and run the tests: git clone https://github.com/nyldn/claude-octopus.git && cd claude-octopus && make test

See CONTRIBUTING.md for details.


License

MIT — see LICENSE

nyldn | MIT License | r/ClaudeOctopus | Report Issues
