Hail_Hydra

agent
Security Audit
Fail
Health Pass
  • License Ò€” License: MIT
  • Description Ò€” Repository has a description
  • Active repo Ò€” Last push 0 days ago
  • Community trust Ò€” 41 GitHub stars
Code Fail
  • fs module Ò€” File system access in .github/workflows/publish-gpr.yml
  • fs module Ò€” File system access in hooks/hydra-auto-guard.js
  • child_process Ò€” Shell command execution capability in hooks/hydra-check-update.js
  • execSync Ò€” Synchronous shell command execution in hooks/hydra-check-update.js
  • os.homedir Ò€” User home directory access in hooks/hydra-check-update.js
  • fs module Ò€” File system access in hooks/hydra-check-update.js
  • child_process Ò€” Shell command execution capability in hooks/hydra-notify.js
  • os.homedir Ò€” User home directory access in hooks/hydra-notify.js
  • fs module Ò€” File system access in hooks/hydra-notify.js
  • os.homedir Ò€” User home directory access in hooks/hydra-statusline.js
  • fs module Ò€” File system access in hooks/hydra-statusline.js
  • process.env Ò€” Environment variable access in npm-package/bin/cli.js
  • fs module Ò€” File system access in npm-package/files/hooks/hydra-auto-guard.js
  • child_process Ò€” Shell command execution capability in npm-package/files/hooks/hydra-check-update.js
  • execSync Ò€” Synchronous shell command execution in npm-package/files/hooks/hydra-check-update.js
  • os.homedir Ò€” User home directory access in npm-package/files/hooks/hydra-check-update.js
  • fs module Ò€” File system access in npm-package/files/hooks/hydra-check-update.js
  • child_process Ò€” Shell command execution capability in npm-package/files/hooks/hydra-notify.js
  • os.homedir Ò€” User home directory access in npm-package/files/hooks/hydra-notify.js
  • fs module Ò€” File system access in npm-package/files/hooks/hydra-notify.js
  • os.homedir Ò€” User home directory access in npm-package/files/hooks/hydra-statusline.js
  • fs module Ò€” File system access in npm-package/files/hooks/hydra-statusline.js
  • os.homedir Ò€” User home directory access in npm-package/src/display.js
  • fs module Ò€” File system access in npm-package/src/display.js
Permissions Pass
  • Permissions Ò€” No dangerous permissions requested
Purpose
This tool is a speculative execution framework for Claude Code that dispatches sub-tasks to multiple AI agents (using cheaper/faster models) to accelerate workflows and reduce costs.

Security Assessment
Overall Risk: Medium. The framework relies heavily on local system access to function, but the breadth of access requires careful scrutiny. There are no hardcoded secrets detected, and no dangerous permissions are requested. However, the tool routinely accesses the user's home directory and reads/writes to the file system across multiple scripts. Most notably, it actively executes shell commandsβ€”both synchronously and asynchronouslyβ€”via child processes in its update and notification hooks. These hooks run automatically in the background ("Always On" mode). While this is likely used for checking versions and modifying status lines, auto-executing shell commands combined with environment variable access represent a significant local attack surface.

Quality Assessment
Quality is strong. The project is actively maintained (last push was today) and is cleanly licensed under the permissive MIT license. It has decent community engagement for a niche developer tool, currently sitting at 41 GitHub stars.

Verdict
Use with caution. Ensure you thoroughly inspect the shell execution logic in the hooks before installing, as the auto-running background agents have broad system access.
SUMMARY

πŸ‰ Hail Hydra β€” Multi-headed speculative execution framework for Claude Code. 7 AI agents, 3x faster, ~50% cheaper. Inspired by speculative decoding.

README.md

Hail Hydra

πŸ‰ H Y D R A

Multi-Headed Speculative Execution for Claude Code

"Cut off one head, two more shall take its place."
Except here β€” every head is doing your work faster and cheaper.

Opus Sonnet Haiku

npm version npm downloads

Speed Cost Quality Always On

9 agents Β Β·Β  9 slash commands Β Β·Β  4 hooks Β Β·Β  ~50% cost savings Β Β·Β  Codebase map Β Β·Β  Persistent memory Β Β·Β  Integration integrity


🧬 What is Hydra?

You know how in the movies, Hydra had agents embedded everywhere, silently getting things done in the background? That's exactly what this framework does for your Claude Code sessions.

Hydra is a task-level speculative execution framework inspired by Speculative Decoding in LLM inference. Instead of making one expensive model (Opus 4.6) do everything β€” from searching files to writing entire modules β€” Hydra deploys a team of specialized "heads" running on faster, cheaper models that handle the grunt work.

The result? Opus becomes a manager, not a laborer. It classifies tasks, dispatches them to the right head, glances at the output, and moves on. The user never notices. It's invisible. It's always on.

New in v2.0.0: Every agent now has persistent memory β€” they learn
your codebase patterns, conventions, and architectural decisions across
sessions. The orchestrator (Opus) also maintains its own memory of fragile
zones and routing decisions. Plus, the new hydra-sentinel automatically
catches integration breakage after code changes β€” and code isn't presented
to you until verification completes.

New in v2.1.0: The Codebase Map gives every agent instant access to
file dependencies, blast radius, risk scores, and test coverage β€” replacing
slow grep-based scanning with instant JSON lookups. Sentinel is now 3-5Γ—
faster and 3-5Γ— cheaper per scan.

Four built-in speed optimizations reduce overhead at every stage: speculative pre-dispatch
(scout launches in parallel with task classification), session indexing (codebase context
persists across turns β€” no re-exploration), parallel dispatch (independent agents run
simultaneously β€” Opus waits for all before responding), and confidence-based auto-accept (raw
factual outputs skip Opus review entirely).

Think of it this way:

Would you hire a $500/hr architect to carry bricks? No. You'd have them design the building and let the crew handle construction. That's Hydra.


πŸš€ Installation

One command. Done.

npx hail-hydra-cc@latest

[OR]

npm i hail-hydra-cc@latest

Runs the interactive installer β€” deploys 9 agents, 9 slash commands, 4 hooks, and registers
the statusline and update checker. Done in seconds.

Manual Install

# Clone the repo
git clone https://github.com/AR6420/Hail_Hydra.git
cd hydra

# Deploy heads globally (recommended β€” always on, every project)
./scripts/install.sh --user

# πŸ‰ Hail Hydra! Framework active in all Claude Code sessions.
# βœ… 9 agents  βœ… 9 commands  βœ… 4 hooks  βœ… StatusLine  βœ… VERSION

Installation Options

# User-level β€” available in ALL your Claude Code projects
./scripts/install.sh --user

# Project-level β€” just this one project
./scripts/install.sh --project

# Both β€” maximum coverage
./scripts/install.sh --both

# Check what's deployed
./scripts/install.sh --status

# Remove everything
./scripts/install.sh --uninstall

What Gets Installed

~/.claude/
β”œβ”€β”€ agents/                      # 9 agent definitions (all with memory: project)
β”‚   β”œβ”€β”€ hydra-scout.md           # 🟒 Haiku 4.5 β€” explore codebase
β”‚   β”œβ”€β”€ hydra-runner.md          # 🟒 Haiku 4.5 β€” run tests/builds
β”‚   β”œβ”€β”€ hydra-scribe.md          # 🟒 Haiku 4.5 β€” write documentation
β”‚   β”œβ”€β”€ hydra-guard.md           # 🟒 Haiku 4.5 β€” security/quality gate
β”‚   β”œβ”€β”€ hydra-git.md             # 🟒 Haiku 4.5 β€” git operations
β”‚   β”œβ”€β”€ hydra-sentinel-scan.md   # 🟒 Haiku 4.5 β€” fast integration sweep
β”‚   β”œβ”€β”€ hydra-coder.md           # πŸ”΅ Sonnet 4.6 β€” write/edit code
β”‚   β”œβ”€β”€ hydra-analyst.md         # πŸ”΅ Sonnet 4.6 β€” debug/diagnose
β”‚   └── hydra-sentinel.md        # πŸ”΅ Sonnet 4.6 β€” deep integration analysis
β”œβ”€β”€ commands/hydra/              # 9 slash commands
β”‚   β”œβ”€β”€ help.md                  # /hydra:help
β”‚   β”œβ”€β”€ status.md                # /hydra:status
β”‚   β”œβ”€β”€ update.md                # /hydra:update
β”‚   β”œβ”€β”€ config.md                # /hydra:config
β”‚   β”œβ”€β”€ guard.md                 # /hydra:guard
β”‚   β”œβ”€β”€ quiet.md                 # /hydra:quiet
β”‚   β”œβ”€β”€ verbose.md               # /hydra:verbose
β”‚   β”œβ”€β”€ report.md                # /hydra:report
β”‚   └── map.md                   # /hydra:map
β”œβ”€β”€ hooks/                       # 4 lifecycle hooks
β”‚   β”œβ”€β”€ hydra-check-update.js    # SessionStart β€” version check (background)
β”‚   β”œβ”€β”€ hydra-statusline.js      # StatusLine β€” status bar display
β”‚   β”œβ”€β”€ hydra-auto-guard.js      # PostToolUse β€” file change tracker
β”‚   β”œβ”€β”€ hydra-notify.js          # Notification β€” task completion sound
β”‚   └── hydra-task-complete.wav  # Notification sound file
└── skills/
    └── hydra/                   # Skill (Claude Code discoverable)
        β”œβ”€β”€ SKILL.md             # Orchestrator instructions
        β”œβ”€β”€ VERSION              # Installed version number
        β”œβ”€β”€ config/
        β”‚   └── hydra.config.md  # User configuration (created by --config)
        └── references/
            β”œβ”€β”€ model-capabilities.md
            └── routing-guide.md

> **Note:** `settings.json` is at `~/.claude/settings.json` β€” hooks and statusLine are registered there.

Project-level (--project): same files written to .claude/ in your working
directory instead of ~/.claude/. Project-level takes precedence when both exist.


⚑ Slash Commands

Command Description
/hydra:help Show all commands and agents
/hydra:status Show installed agents, version, and update availability
/hydra:update Update Hydra to the latest version
/hydra:config Show current configuration
/hydra:guard [files] Run manual security & quality scan
/hydra:quiet Suppress dispatch logs for this session
/hydra:verbose Enable verbose dispatch logs with timing
/hydra:report Report a bug, request a feature, or share feedback
/hydra:map View codebase dependency map, query blast radius, rebuild

πŸ–₯️ Status Line

After installation, your Claude Code status bar shows real-time framework info:

πŸ‰ β”‚ Opus β”‚ Ctx: 37% β–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘ β”‚ $0.42 β”‚ my-project
Element What It Shows
πŸ‰ Hydra is active
Model Current Claude model (Opus, Sonnet, Haiku)
Ctx: XX% Context window usage with visual bar
$X.XX Session API cost so far
Directory Current working directory
⚠ Warning Compaction warning (only at 70%+ context usage)

Context bar colors:

  • 🟒 Green (0–49%) β€” plenty of room
  • 🟑 Yellow (50–79%) β€” getting full, consider /compact
  • πŸ”΄ Red (80%+) β€” context nearly full, /compact or /clear recommended

Compaction warnings (appended automatically at 70%+):

πŸ‰ β”‚ Opus β”‚ Ctx: 73% β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘ β”‚ $1.87 β”‚ my-project β”‚ ⚠ Auto-compact at 85%
πŸ‰ β”‚ Opus β”‚ Ctx: 83% β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘ β”‚ $3.14 β”‚ my-project β”‚ ⚠ Compacting soon!
  • ⚠ Auto-compact at 85% (70–79%) β€” heads-up that compaction is approaching
  • ⚠ Compacting soon! (80%+) β€” compaction is imminent, consider /compact now

Note: If you already have a custom statusLine configured, the installer
keeps yours and prints instructions for switching to Hydra's.


πŸ”” Task Completion Sound

Hydra plays a short notification sound when Claude Code finishes a substantial task β€” so you know it's done even if you've tabbed away.

  • Cross-platform β€” macOS (afplay), Windows (PowerShell), Linux (paplay/aplay)
  • Non-blocking β€” the sound plays detached; it never delays Claude's response
  • Smart triggers β€” only fires on substantial tasks (>~10 seconds), not quick replies
  • Controllable β€” /hydra:quiet suppresses it, /hydra:verbose re-enables it

The notification hook is registered automatically during installation.


πŸ”„ Auto-Update Notifications

Hydra checks for updates once per session in the background (never blocks startup).
When a new version is available, you'll see it in the status bar:

πŸ‰ β”‚ Opus β”‚ Ctx: 37% β–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘ β”‚ $0.42 β”‚ my-project β”‚ ⚑ v1.1.0 available

Update with:

# From within Claude Code:
/hydra:update

# Or from your terminal:
npx hail-hydra-cc@latest --global

After updating, restart Claude Code to load the new files.


✨ Features

  • Nine specialized heads β€” Haiku 4.5 (fast) and Sonnet 4.6 (capable) heads for every task type
  • Sentinel integration integrity β€” Two-tier verification (fast scan + deep analysis) catches ~72% of integration bugs before runtime
  • Persistent agent memory β€” Every agent remembers your codebase patterns, conventions, and past decisions across sessions
  • Orchestrator memory β€” Opus maintains its own notes on fragile zones, routing patterns, and known issues via CLAUDE.md
  • Quality-first pipeline β€” Code changes block until sentinel + guard verification completes; nothing reaches you unchecked
  • Auto-Guard β€” hydra-guard (Haiku 4.5) automatically scans code changes for security issues after every hydra-coder run
  • Configurable modes β€” conservative, balanced (default), or aggressive delegation via hydra.config.md
  • Slash commands β€” /hydra:help, /hydra:status, /hydra:update, /hydra:config, /hydra:guard, /hydra:quiet, /hydra:verbose, /hydra:report for full session control
  • Task completion sound β€” plays a notification when Claude finishes substantial tasks
  • Quick commands β€” natural language shortcuts: hydra status, hydra quiet, hydra verbose
  • Custom agent templates β€” Add your own heads using templates/custom-agent.md
  • Session indexing β€” Codebase context persists across turns; no re-exploration on every prompt
  • Speculative pre-dispatch β€” hydra-scout launches in parallel with task classification, saving 2–3 seconds per task
  • Dispatch log β€” Transparent audit trail showing which agents ran, what model, and outcome
  • Codebase Map β€” Persistent dependency graph built by hydra-scout. Maps every file's imports, dependents, risk score, env vars, and test coverage. Enables instant blast-radius lookups for sentinel β€” no more grepping the entire codebase.
  • Risk-Based Verification β€” Files with more dependents get more thorough verification. Critical files always trigger deep sentinel analysis. Low-risk files get fast-tracked.
  • /hydra:map β€” Inspect the dependency map, query blast radius for any file, or force a rebuild

πŸ›‘οΈ Sentinel β€” Integration Integrity

Most bugs don't come from bad code β€” they come from good code that doesn't fit together. A renamed export, a changed return type, a missing dependency after a refactor. These integration issues slip past linters, type-checkers, and even code review because no single file looks wrong.

hydra-sentinel catches them automatically.

How It Works

Code change lands (hydra-coder finishes)
    β”‚
    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  🟒 hydra-sentinel-scan (Haiku 4.5)  β”‚  ← Runs on EVERY code change (~1-2s)
β”‚  Fast sweep: imports, exports,       β”‚
β”‚  signatures, dependencies            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
          Issues found?
          β”œβ”€β”€ No:  βœ… Pass β€” code proceeds to guard
          β”‚
          └── Yes: Escalate
               β”‚
               β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  πŸ”΅ hydra-sentinel (Sonnet 4.6)      β”‚  ← Only when scan flags issues (~20-30%)
β”‚  Deep analysis: confirms real issues, β”‚
β”‚  dismisses false positives,          β”‚
β”‚  proposes fixes                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
          Fix decision:
          β”œβ”€β”€ Trivial (import typo): Auto-fix
          β”œβ”€β”€ Medium (signature mismatch): Offer fix to user
          └── Complex (architectural): Report with context

What Sentinel Catches

Check Type Priority Example
Import/export mismatches P0 Importing a function that was renamed or removed
Function signature changes P0 Caller passes 2 args, function now expects 3
Type contract violations P1 Function returns string but caller expects number
Missing dependency updates P1 New import added but package not in package.json
Cross-file rename gaps P1 Variable renamed in definition but not all call sites
Circular dependency introduction P2 New import creates A β†’ B β†’ C β†’ A cycle
Dead code from refactoring P2 Exported function no longer imported anywhere
Environment/config mismatches P2 Code references env var that isn't in .env.example

Example Output

πŸ›‘οΈ Sentinel Report
──────────────────────────────────────
βœ– P0: src/auth.js imports `validateToken` from src/utils.js
       but src/utils.js now exports `verifyToken` (renamed in this session)
       β†’ Fix: Update import to `verifyToken` [auto-fixable]

⚠ P1: src/api/routes.js calls createUser(name, email)
       but src/models/user.js:createUser now expects (name, email, role)
       β†’ Missing required parameter `role` added in this change

βœ” 6 other integration points verified clean
──────────────────────────────────────

Expected Detection Rates

Category Estimated Detection Rate Notes
Import/export mismatches ~95% Direct string matching
Signature mismatches ~80% Requires type inference
Type contract violations ~60% Limited without full type system
Missing dependencies ~90% Package.json diffing
Cross-file rename gaps ~70% Heuristic-based
Circular dependencies ~85% Import graph traversal
Dead code ~50% Conservative β€” flags only obvious cases
Config mismatches ~40% Pattern matching on env references
Weighted average ~72% Based on typical issue distribution

Memory makes it better over time. Sentinel remembers past false positives and known fragile
integration points in your project. The more you use it, the more accurate it gets.


πŸ—ΊοΈ Codebase Map

New in v2.1.0 β€” Hydra builds a persistent dependency map of your codebase,
giving every agent instant access to file relationships without scanning.

How It Works

hydra-scout builds the map on first run by extracting import statements
from every source file using grep (no external parsers required). The map
is stored at .claude/hydra/codebase-map.json.

Session 1: scout builds the full map (~10 seconds for 500 files)
Session 2: scout checks git hash β†’ nothing changed β†’ skip rebuild (instant)
Session 3: scout checks git hash β†’ 3 files changed β†’ update only those 3

What the Map Contains

Data How It's Used
File imports "auth.ts imports user.ts and env.ts"
Reverse imports "auth.ts is imported by users.ts, admin.ts, middleware.ts"
Risk score low (0-1 deps) β†’ medium (2-3) β†’ high (4-6) β†’ critical (7+)
Env var index "JWT_SECRET is used in auth.ts and middleware.ts"
Test coverage covered / partial / untested per file
Git staleness Hash comparison for instant freshness check

Why This Matters

Without map β€” When auth.ts changes, sentinel greps the ENTIRE codebase
looking for files that import it. In a 500-file project, that's 500 file reads.
Takes 5-15 seconds, costs 3,000-8,000 tokens.

With map β€” Sentinel reads the JSON, looks up auth.ts's imported_by array,
gets [users.ts, admin.ts, middleware.ts] instantly. Reads only those 3 files.
Takes <2 seconds, costs 500-1,500 tokens.

Savings: 3-5Γ— faster, 3-5Γ— fewer tokens per sentinel scan.

Risk-Based Sentinel Triggering

The map's risk scores let Opus make smarter verification decisions:

Modified File Risk What Happens
πŸ”΄ Critical (7+ deps) Sentinel-scan + deep analysis (always)
🟠 High (4-6 deps) Sentinel-scan, escalate if issues found
🟑 Medium (2-3 deps) Sentinel-scan, escalate only for P0 issues
🟒 Low (0-1 deps) Sentinel-scan, auto-accept if clean

This means Hydra spends more verification effort where it matters most
(high-risk files) and less where it doesn't (isolated utilities).

Inspect the Map

/hydra:map                       # Show summary β€” risk distribution, coverage stats
/hydra:map src/services/auth.ts  # Show blast radius for a specific file
/hydra:map rebuild               # Force a complete rebuild

Technical Notes

  • The map is built using grep + regex β€” no Tree-sitter, no AST parsing, no
    external dependencies. Works with JS/TS, Python, Go, Java, Kotlin, Ruby, Rust.
  • Supports relative import resolution (e.g., './auth' β†’ src/services/auth.ts)
  • Falls back gracefully β€” if the map doesn't exist, all agents use their
    original grep-based behavior. The map is an optimization, not a requirement.
  • Stored at .claude/hydra/codebase-map.json β€” add to .gitignore (machine-generated).

🧠 Agent Memory

Without memory, every session starts cold. Agents re-discover your conventions, re-learn your
project structure, and repeat the same questions. With memory, knowledge compounds.

Aspect Without Memory With Memory
First task Agent explores from scratch Agent recalls project patterns
Conventions May use wrong style Remembers your naming, structure, patterns
Known issues No awareness of past bugs Recalls fragile areas and past fixes
Routing accuracy Generic classification Improved by past dispatch outcomes
False positives Same false alarms repeat Sentinel suppresses known non-issues

How It Works

Every agent has memory: project in its frontmatter. Claude Code automatically manages a
per-project memory directory (.claude/memory/) where agents store and retrieve learnings.

Agent What It Remembers
hydra-scout Project structure patterns, key file locations, search shortcuts
hydra-runner Test commands, common failure patterns, build quirks
hydra-scribe Documentation style, preferred formats, terminology
hydra-guard Known false positives, project-specific security patterns
hydra-git Commit conventions, branch naming, merge preferences
hydra-sentinel-scan Known fragile integration points, past false positives
hydra-coder Coding style, architecture patterns, preferred libraries
hydra-analyst Common bug patterns, performance hotspots, review focus areas
hydra-sentinel Integration history, confirmed vs dismissed findings
Orchestrator (Opus) Fragile zones, routing accuracy, escalation patterns (via CLAUDE.md Hydra Notes)

Memory Properties

  • Automatic β€” agents read and write memory without any user action
  • Project-scoped β€” each project has its own memory; no cross-contamination
  • Persistent β€” survives across sessions; compounds over time
  • Manageable β€” stored as plain markdown in .claude/memory/; edit or delete anytime

The Compound Effect

Session 1: Agents learn your project. Session 5: They know your conventions. Session 20: They
anticipate your patterns. The framework gets more efficient the more you use it β€” not because
the models improve, but because context quality improves.


πŸ€” Why I Built This

After Opus 4.6 dropped, I noticed something frustrating β€” code execution felt slowww. Reallyyy Slow. Not because the model was worse, but because I was feeding everything through one massive model. Every file read, every grep, every test run, every docstring β€” all burning through Opus-tier tokens. The result? Frequent context compaction, more hallucinations, and an API bill that made me wince.

So I started experimenting. I switched to Haiku for the simple stuff β€” running commands, tool calls, file exploration. Sonnet for code generation, refactoring, reviews. And kept Opus only for what it's actually good at: planning, architecture, and the hard decisions. The result surprised me. Same code quality. Sometimes better β€” because each model was operating within a focused context window instead of one overloaded one.

Five agents. Five separate context windows. Each with a clearly defined job. They do the work, and only pass results back to the brain β€” Opus. The outcome:

  • Longer coding sessions (less compaction, less context blowup)
  • Drastically reduced API costs (Haiku 4.5 is 5Γ— cheaper than Opus 4.6)
  • Faster execution (Haiku 4.5 responds ~10Γ— faster)
  • Same or better code quality (focused context > bloated context)
  • Zero manual model switching (this is the big one)

Because that was the real pain β€” manually switching between models for every task tier to save costs. Every. Single. Time. So I built a framework that does it for me. And honestly? It does it better than I did. That was hard to admit, but here we are.

I also didn't want it to be boring. So I gave it teeth, heads, and a battle cry. If you prefer something more buttoned-up, the spec-exec branch has the same framework with zero theatrics.

Hail Hydra. Have fun.


πŸ’‘ The Theory (for nerds)

Speculative decoding (Chen et al., 2023) accelerates LLM inference by having a small draft model propose tokens that a large target model verifies in parallel. Since verifying K tokens costs roughly the same as generating 1 token, you get 2–2.5Γ— speedup with zero quality loss.

Hydra applies this at the task level:

                          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                          β”‚  SPECULATIVE DECODING (tokens)  β”‚
                          β”‚                                 β”‚
                          β”‚  Small model drafts K tokens    β”‚
                          β”‚  Big model verifies in parallel β”‚
                          β”‚  Accept or reject + resample    β”‚
                          β”‚  Result: 2-2.5Γ— speedup         β”‚
                          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                        β”‚
                                   Same idea,
                                  bigger scale
                                        β”‚
                                        β–Ό
                          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                          β”‚  πŸ‰ HYDRA (tasks)               β”‚
                          β”‚                                 β”‚
                          β”‚  Haiku/Sonnet drafts the task   β”‚
                          β”‚  Opus verifies (quick glance)   β”‚
                          β”‚  Accept or redo yourself        β”‚
                          β”‚  Result: 2-3Γ— speedup           β”‚
                          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The math is simple: if 70% of tasks can be handled by Haiku 4.5 (10Γ— faster, 5Γ— cheaper) and 20% by Sonnet 4.6 (3Γ— faster, ~1.7Γ— cheaper), your effective speed and cost improve dramatically β€” even accounting for the occasional rejection.


πŸ—οΈ Architecture

User Request
    β”‚
    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚                                                      β”‚
    β–Ό                                                      β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  🧠 ORCHESTRATOR (Opus)     β”‚            β”‚  🟒 hydra-scout (Haiku 4.5)  β”‚
β”‚  Classifies task            β”‚            β”‚  IMMEDIATE pre-dispatch:      β”‚
β”‚  Plans waves                β”‚            β”‚  "Find files relevant to      β”‚
β”‚  Decides blocking / not     β”‚            β”‚   [user's request]"           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚         (unless Session Index already covers)  β”‚
         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚ (scout + classification both ready)
                      [Session Index updated]
                                β”‚
    ════════════════════════════════════════════════════════
    Wave N  (parallel dispatch, index context injected)
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  SEQUENTIAL       β”‚  PARALLEL (wait for all)         β”‚
    β–Ό                   β–Ό                                  β”‚
 [coder]            [scribe] β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚
    β–Ό
 ALL agents complete (Opus waits for every dispatched agent)
    β”‚
    β”œβ”€β”€ Raw data / clean pass? β†’ AUTO-ACCEPT β†’ (updates Session Index if scout)
    └── Code / analysis / user-facing docs? β†’ Orchestrator verifies
         β”‚
         β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚  πŸ›‘οΈ QUALITY GATE (blocks until complete)            β”‚
   β”‚                                                     β”‚
   β”‚  🟒 sentinel-scan (Haiku) β€” fast integration sweep  β”‚
   β”‚       └── issues? β†’ πŸ”΅ sentinel (Sonnet) β€” deep     β”‚
   β”‚  🟒 guard (Haiku) β€” security/quality scan           β”‚
   β”‚                                                     β”‚
   β”‚  Both must pass before result reaches user           β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                          β”‚
                          β–Ό
   User gets result (single response, all agent outputs included)

🐲 The Nine Heads

Head Model Speed Role Personality
hydra-scout (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Codebase exploration, file search, reading "I've already found it."
hydra-runner (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Test execution, builds, linting, validation "47 passed, 3 failed. Here's why."
hydra-scribe (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Documentation, READMEs, comments "Documented before you finished asking."
hydra-guard (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Security/quality gate after code changes "No secrets. No injection. You're clean."
hydra-git (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Git: commit, branch, diff, stash, log "Committed. Conventional message. Clean diff."
hydra-sentinel-scan (Haiku 4.5) 🟒 Haiku 4.5 ⚑⚑⚑ Fast integration sweep after code changes "Imports check out. Signatures match. Clean."
hydra-coder (Sonnet 4.6) πŸ”΅ Sonnet 4.6 ⚑⚑ Code implementation, refactoring, features "Feature's done. Tests pass."
hydra-analyst (Sonnet 4.6) πŸ”΅ Sonnet 4.6 ⚑⚑ Code review, debugging, analysis "Found 2 critical bugs and an N+1 query."
hydra-sentinel (Sonnet 4.6) πŸ”΅ Sonnet 4.6 ⚑⚑ Deep integration analysis (when scan flags issues) "2 real issues confirmed. 1 false positive dismissed."

Task Routing Cheat Sheet

Is it read-only? ─── Yes ──→ Finding files?
    β”‚                           β”œβ”€β”€ Yes: hydra-scout (Haiku 4.5) 🟒
    β”‚                           └── No:  hydra-analyst (Sonnet 4.6) πŸ”΅
    β”‚
    No ──→ Is it a git operation? ─── Yes ──→ hydra-git (Haiku 4.5) 🟒
    β”‚
    No ──→ Is it a security scan? ─── Yes ──→ hydra-guard (Haiku 4.5) 🟒
    β”‚
    No ──→ Just running a command? ─── Yes ──→ hydra-runner (Haiku 4.5) 🟒
    β”‚
    No ──→ Writing docs only? ─── Yes ──→ hydra-scribe (Haiku 4.5) 🟒
    β”‚
    No ──→ Clear implementation approach? ─── Yes ──→ hydra-coder (Sonnet 4.6) πŸ”΅
    β”‚
    No ──→ Needs deep reasoning? ─── Yes ──→ 🧠 Opus 4.6 (handle it yourself)

    Code was just changed? ─── Yes ──→ hydra-sentinel-scan (Haiku 4.5) 🟒
        β”‚                                   β”‚
        β”‚                              Issues found?
        β”‚                              β”œβ”€β”€ No:  Done βœ…
        β”‚                              └── Yes: hydra-sentinel (Sonnet 4.6) πŸ”΅

βš™οΈ Configuration

Customize Hydra's behavior with an optional config file:

# Create a default config (user-level β€” applies to all projects)
./scripts/install.sh --config

Then edit ~/.claude/skills/hydra/config/hydra.config.md:

mode: balanced          # conservative | balanced (default) | aggressive
dispatch_log: on        # on (default) | off | verbose
auto_guard: on          # on (default) | off

Project-level config (overrides user-level):
Place at .claude/skills/hydra/config/hydra.config.md in your project root.

See config/hydra.config.md for the full reference with all options.


🧩 Extending Hydra

Add your own specialized head in three steps:

1. Copy the template:

cp templates/custom-agent.md agents/hydra-myspecialist.md

2. Customize the agent β€” edit the name, description, tools, and instructions.

3. Deploy it:

./scripts/install.sh --user   # or --project

Your new head is now discoverable by Claude Code alongside the built-in nine.
See templates/custom-agent.md for the full template with
instructions on writing effective agent descriptions, output formats, and collaboration protocols.


πŸ“‚ Repository Structure

hydra/
β”œβ”€β”€ πŸ“„ SKILL.md                          # Core framework instructions
β”œβ”€β”€ 🐲 agents/
β”‚   β”œβ”€β”€ hydra-scout.md                   # 🟒 Codebase explorer
β”‚   β”œβ”€β”€ hydra-runner.md                  # 🟒 Test & build executor
β”‚   β”œβ”€β”€ hydra-scribe.md                  # 🟒 Documentation writer
β”‚   β”œβ”€β”€ hydra-guard.md                   # 🟒 Security/quality gate
β”‚   β”œβ”€β”€ hydra-git.md                     # 🟒 Git operations
β”‚   β”œβ”€β”€ hydra-sentinel-scan.md           # 🟒 Fast integration sweep
β”‚   β”œβ”€β”€ hydra-coder.md                   # πŸ”΅ Code implementer
β”‚   β”œβ”€β”€ hydra-analyst.md                 # πŸ”΅ Code reviewer & debugger
β”‚   └── hydra-sentinel.md               # πŸ”΅ Deep integration analysis
β”œβ”€β”€ πŸ“š references/
β”‚   β”œβ”€β”€ routing-guide.md                 # 30+ task classification examples
β”‚   └── model-capabilities.md            # What each model excels at
β”œβ”€β”€ βš™οΈ config/
β”‚   └── hydra.config.md                  # User configuration template
β”œβ”€β”€ πŸ“‹ templates/
β”‚   └── custom-agent.md                  # Template for adding your own heads
└── πŸ”§ scripts/
    └── install.sh                       # One-command deployment

πŸ“Š Expected Impact

Metric Without Hydra With Hydra Improvement
Task Speed 1Γ— (Opus for everything) 2–3Γ— faster 🟒 Haiku 4.5 heads respond ~10Γ— faster
API Cost 1Γ— (Opus 4.6 for everything) ~0.5Γ— ~50% cheaper
Quality Opus-level Opus-level Zero degradation
User Experience Normal Normal Invisible β€” zero friction
Overhead per turn (Turn 2+) Full re-exploration each turn Session index reused 🟒 2-4s saved per turn
Scout/runner verification Opus reviews every output Auto-accepted for factual data 🟒 ~50-60% of outputs skip review
Integration bugs caught 0% (no verification) ~72% caught before runtime 🟒 Sentinel auto-verification
Session knowledge Starts cold every time Compounds across sessions 🟒 Persistent agent memory
Sentinel scan speed 5-15 seconds (grep) <2 seconds (map lookup) 🟒 3-5Γ— faster with codebase map
Sentinel scan tokens 3,000-8,000 per scan 500-1,500 per scan 🟒 3-5Γ— fewer tokens per scan

How the Savings Work

Task Type % of Work Model Used Input Cost vs Opus 4.6 Output Cost vs Opus 4.6
Exploration, search, tests, docs ~50% 🟒 Haiku 4.5 20% ($1 vs $5/MTok) 20% ($5 vs $25/MTok)
Implementation, review, debugging ~30% πŸ”΅ Sonnet 4.6 60% ($3 vs $5/MTok) 60% ($15 vs $25/MTok)
Architecture, hard problems ~20% 🧠 Opus 4.6 100% (no change) 100% (no change)
Sentinel scan (fast) Auto (every code change) 🟒 Haiku 4.5 20% 20%
Sentinel deep (conditional) ~20-30% of code changes πŸ”΅ Sonnet 4.6 60% 60%
Blended effective cost ~48% of all-Opus ~48% of all-Opus

Note: Blended input = (0.5Γ—$1 + 0.3Γ—$3 + 0.2Γ—$5) / $5 = $2.40/$5 β‰ˆ 48%.
Rounded to ~50% blended cost reduction overall.
Savings calculated against Opus 4.6 ($5/$25 per MTok) as of February 2026.

Measure Your Savings

The most accurate way to measure Hydra's impact β€” no estimation, real numbers:

  1. Start a Claude Code session without Hydra installed
  2. Complete a representative coding task
  3. Note the session cost from Claude Code's cost display
  4. Start a new session with Hydra installed
  5. Complete a similar task
  6. Compare the two costs

That's it. Real data beats theoretical calculations every time.

What to expect (based on February 2026 API pricing)

With a typical task distribution (50% Haiku 4.5, 30% Sonnet 4.6, 20% Opus 4.6):

  • Input tokens: ~52% cheaper ($2.40 vs $5.00 per MTok)
  • Output tokens: ~52% cheaper ($12.00 vs $25.00 per MTok)
  • Blended: ~50% cost reduction
  • Speed: 2–3Γ— faster on delegated tasks

Note: Savings calculated against Opus 4.6 pricing ($5/$25 per MTok) as of February 2026.
Savings would be significantly higher compared to Opus 4.1/4.0 ($15/$75 per MTok).

Additional Savings from Codebase Map (v2.1.0+)

The codebase map provides additional token savings on TOP of the model-routing
savings above:

Operation Without Map With Map Savings
Sentinel scan (per change) 3,000-8,000 tokens 500-1,500 tokens ~3-5Γ—
Scout exploration (repeat session) 5,000-15,000 tokens 1,000-3,000 tokens ~3-5Γ—
Blast radius computation Grep entire codebase JSON lookup Instant

These savings compound with every code change in a session. In a session with
5 code changes, the map saves roughly 10,000-30,000 tokens on sentinel scans alone.


🎯 Design Principles

πŸ«₯ Invisibility

The user should never notice Hydra operating. No announcements, no permission requests, no process narration. If a head does the work, present the output as if Opus did it.

⚑ Speed Over Ceremony

Don't overthink classification. Quick mental check: "Haiku? Sonnet? Me?" and go. If you spend 10 seconds classifying a 5-second task, you've defeated the purpose.

πŸ”€ Parallel Heads

Independent subtasks launch in parallel. "Fix the bug AND add tests" β†’ two heads working simultaneously.

⬆️ Escalate, Never Downgrade

If a head's output isn't good enough, Opus does it directly. No retries at the same tier. This mirrors speculative decoding's rejection sampling β€” when a draft token is rejected, the target model samples directly.


πŸ”¬ The Speculative Decoding Connection

For those who want to go deeper, here's how Hydra maps to the original speculative decoding concepts:

Speculative Decoding Concept Hydra Equivalent
Target model (large) 🧠 Opus 4.6 β€” the orchestrator
Draft model (small) 🟒 Haiku / πŸ”΅ Sonnet heads
Draft K tokens Heads draft the full task output
Parallel verification Opus glances at the output
Modified rejection sampling Accept β†’ ship it. Reject β†’ Opus redoes it.
Acceptance rate (~70-90%) Target: 85%+ of delegated tasks accepted as-is
Guaranteed β‰₯1 token per loop Every task produces a result β€” Opus catches failures
Temperature/nucleus compatibility Works with any coding task type or domain

Key Papers


πŸ€” FAQ

Will I notice any quality difference?
No. Hydra only delegates tasks that are within each model's capability band. If there's any doubt, the task stays with Opus. And Opus always verifies β€” if a head's output isn't up to standard, Opus redoes it before you ever see it. Is this actually speculative decoding?
Not at the token level β€” that happens inside Anthropic's servers and we can't modify it. Hydra applies the same philosophy at the task level: draft with a fast model, verify with the powerful model, accept or reject. Same goals (speed + cost), same guarantees (zero quality loss), different granularity. What if I'm not using Opus?
Hydra is designed for the Opus-as-orchestrator pattern, but the principles apply at any tier. If you're running Sonnet as your main model, you could adjust the heads to use Haiku for everything delegatable. Can I customize which models the heads use?
Absolutely. Each head is a simple Markdown file with a model: field in the frontmatter. Change model: haiku to model: sonnet (or any supported model) and you're done. Do the heads work with subagents I already have?
Yes. Hydra heads coexist with any other subagents. Claude Code discovers all agents in the .claude/agents/ directories. No conflicts. How do I uninstall?

Removes all agents, commands, hooks, and cache files. Deregisters hooks from
~/.claude/settings.json. Your other Claude Code configuration is preserved.

./scripts/install.sh --uninstall
# or: npx hail-hydra-cc --uninstall
What is Sentinel and how does it work?
Sentinel is a two-tier integration verification system. After every code change, hydra-sentinel-scan (Haiku 4.5) runs a fast sweep (~1-2s) checking imports, exports, function signatures, and dependencies. If it finds potential issues, hydra-sentinel (Sonnet 4.6) performs deep analysis to confirm real problems and dismiss false positives. The result is ~72% of integration bugs caught before they reach you. Does Sentinel slow things down?
The fast scan adds ~1-2 seconds per code change. The deep analysis only triggers when the scan flags issues (~20-30% of changes), adding another ~3-5 seconds in those cases. For the ~70-80% of changes that are clean, you'll barely notice it. The time saved debugging integration issues far outweighs the scan overhead. Will Sentinel auto-fix things without asking?
Only trivial fixes (like updating an import path after a rename). For medium-complexity fixes (signature mismatches), it offers the fix for your approval. For complex architectural issues, it reports the problem with context but doesn't attempt a fix. You stay in control. Can I disable Sentinel?
Yes. Set sentinel: off in your hydra.config.md. You can also set sentinel: scan-only to keep the fast sweep but skip deep analysis. The default is on (both tiers). Does Agent Memory use extra tokens?
Memory is loaded as part of each agent's context when it starts, so it does use some tokens β€” but agent memory files are small (typically a few hundred tokens each). The improved accuracy from having project context usually saves tokens by reducing re-exploration and misclassification. Where is agent memory stored?
In .claude/memory/ within your project directory. Each agent stores its own memory as plain markdown files. You can read, edit, or delete them anytime. Memory is project-scoped β€” each project has its own memory, no cross-contamination. Does Opus (the orchestrator) also have memory?
Yes. Opus maintains a "Hydra Notes" section in your project's CLAUDE.md file. This includes fragile integration zones, routing accuracy observations, and known issues. Unlike agent memory (which is per-agent), orchestrator memory is visible to all agents and informs dispatch decisions. What is the Codebase Map?
A persistent JSON file that maps every file's imports, dependents, risk score, env var references, and test coverage. Built by hydra-scout using grep (no external parsers). Stored at .claude/hydra/codebase-map.json. Enables instant blast-radius lookups for sentinel instead of scanning the entire codebase. Do I need to build the map manually?
No. hydra-scout builds it automatically the first time it's dispatched for exploration. After that, it updates incrementally (only changed files) using git hash comparison. You can force a rebuild with /hydra:map rebuild or inspect it with /hydra:map. Does the map work with my language?
The map extracts imports using grep patterns for JavaScript, TypeScript, Python, Go, Java, Kotlin, Ruby, and Rust. If your language isn't supported, agents fall back to their original grep-based behavior β€” the map is an optimization, not a requirement. More languages can be added in future versions. How big is the map file?
For a 500-file project, the map is typically 50-150KB. For a 5,000-file project, it's around 500KB-1.5MB. It's a single JSON file β€” no database, no external services, nothing to maintain.

πŸ’¬ Feedback

Found a bug? Have a feature idea? Want to share feedback?

From within Claude Code:

/hydra:report

Or directly on GitHub:


🀝 Contributing

Found a task type that gets misclassified? Have an idea for a new head? Contributions are welcome!

  1. Fork it
  2. Create your branch (git checkout -b feature/hydra-new-head)
  3. Commit (git commit -m 'Add hydra-optimizer head for perf tuning')
  4. Push (git push origin feature/hydra-new-head)
  5. Open a PR

πŸ“œ License

MIT β€” Use it, fork it, deploy it. Just don't use it for world domination.

...unless it's code world domination. Then go ahead.



Hail Hydra

Built with 🧠 by Claude Opus 4.6 β€” ironically, the model this framework is designed to use less of.
v2.0.4 β€” Now with memory, integration integrity, and task notifications.


Prefer a clean, technical version? See the spec-exec branch β€” same framework, zero theatrics.

Reviews (0)

No results found