ultracost
Cut Claude Code dynamic-workflow costs: pin the right model + effort on every subagent stage, estimate spend before launch, and block scripts that would silently run every stage on Opus.
Per-stage model routing for Claude Code dynamic workflows.
Stop a single ultracode fan-out from running 40 subagents on Opus by accident.
About
ultracost keeps Claude Code's ultracode mode from silently running every
subagent on Opus. When ultracode is on, the session is pinned to Opus @ xhigh
and a single dynamic workflow fans out to dozens of subagents that inherit that
session model unless every stage is pinned. ultracost makes the per-stage routing
explicit, injects the policy at the start of every session, and ships a guard that
fails any unpinned stage.
Built for ultracode (Opus @ xhigh dynamic workflows), the only place
the multi-agent fan-out it guards against happens. ultracost routes by tier
(opus/sonnet), not a pinned version, so it tracks whatever Opus your session runs.
No telemetry. No network on the hot path. MIT.
Security & trust. ultracost has zero runtime and dev dependencies, so there is no
supply chain to compromise — Snyk Open Source and npm audit report 0 vulnerabilities.
Releases publish to npm with OIDC Trusted Publishing and signed provenance, every
GitHub Action is pinned to a commit SHA, and CodeQL + OpenSSF Scorecard run in CI.
The installer touches only its own files and is fully reversible. See SECURITY.md.
Setup (Claude Code plugin):
/plugin marketplace add danielkremen818/ultracost
/plugin install ultracost@ultracost
Or via the npm CLI (CI / scripting):
npx ultracost init
First command (in Claude Code): /ultracost:check ./path/to/workflow.js — flag any agent() stage that would silently inherit Opus. The plugin ships a slash command for every verb (/ultracost:check · estimate · explain · simulate · diff · audit · usage · reconcile · calibrate · ledger · status) — no global binary needed.
Same verbs on the npm CLI (CI / scripting): ultracost init · check · audit · estimate · explain · simulate · diff · usage · reconcile · calibrate · ledger · pricing · status · doctor · uninstall.
The problem
When ultracode is on, Claude Code runs the session on Opus @ xhigh (Opus is the only model that supports xhigh) and auto-orchestrates dynamic workflows that fan out to dozens of subagents (up to 1,000). Two defaults compound:
- Subagents inherit the session model. No per-stage override → every stage runs on the session's Opus model.
- The built-in workflow guidance tells Claude to omit the per-agent model. So inheritance wins.
The documented result: one prompt spawning 46 Opus subagents and ~3M tokens with no warning. A grep sweep and a per-file verifier do not need Opus.
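The inheritance rule can be illustrated with a small stand-in. This is a sketch, not Claude Code's implementation: `resolveModel` and the stage objects are hypothetical, and the only behavior taken from the text above is that a stage without its own model falls back to the session model.

```javascript
// Hypothetical stand-in for the inheritance rule described above.
// A stage with no model of its own falls back to the session model.
function resolveModel(stage, sessionModel) {
  return stage.options?.model ?? sessionModel;
}

// An ultracode session is pinned to Opus.
const sessionModel = 'opus';

// A fan-out in which only the last stage is pinned.
const stages = [
  { name: 'grep-sweep' },                                    // no options object
  { name: 'verify-file', options: {} },                      // options, but no model
  { name: 'collect-context', options: { model: 'sonnet' } }, // pinned
];

const resolved = stages.map((s) => `${s.name}: ${resolveModel(s, sessionModel)}`);
console.log(resolved.join('\n'));
```

Only the pinned stage leaves Opus; the other two resolve to the session model without any warning.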
The evidence: nobody pins a stage
This is the default behavior, not user error. In a scan of ~22 real ultracode workflow scripts authored across ~/.claude/projects/**/workflows/scripts/, almost none pinned model: on any stage, so nearly every stage inherited the session model (Opus @ xhigh). Even Anthropic's own bundled deep-research workflow pins zero stages. Left to its defaults, Claude Code writes fan-outs that silently run everything on the most expensive model.
You can reproduce this on your own history in one command:
npx ultracost audit ~/.claude/projects
What ultracost does
ultracost makes the routing explicit, policy-driven, and verifiable — without giving up quality on the work that matters.
- A quality-first policy. Coding and reasoning stay on Opus @ xhigh. Pre-planned mechanical work and search/collection drop to Sonnet. Haiku is never used. You own the policy in one JSON file.
- Always-on routing guidance. As a plugin, a SessionStart hook injects the policy as context at the start of every session (and re-injects it after compaction), so it is present when Claude authors a workflow, without relying on the model choosing to open a skill. As the npm CLI, the same policy compiles into your ~/.claude/CLAUDE.md. A routing skill ships alongside for explicit /-reference.
- The Workflow Guard. A static analyzer that scans the workflow scripts Claude authors and flags any agent() stage missing a model: pin, so a fan-out can't silently inherit Opus. Run it by hand, via /ultracost:check, or in CI. No other tool does this.
Architecture
One shared core in src/, two delivery surfaces: a Claude Code plugin (primary) and an npm CLI (secondary). Both compile from the same policy.json.
The plan lives in data (policy.json), not in prose buried in a prompt. The guard is the enforcement layer the model can't talk its way out of. See docs/architecture.md for the full picture.
Install
Plugin (recommended)
Inside Claude Code:
/plugin marketplace add danielkremen818/ultracost
/plugin install ultracost@ultracost
Then, without leaving Claude Code, drive everything through slash commands — verify the workflow Claude just drafted, estimate it, or reconcile a finished run against what it actually cost:
/ultracost:check ./path/to/workflow.js
The plugin bundles — touching none of your own files — a SessionStart policy-injection hook, a PreToolUse cost gate on the Workflow tool (ULTRACOST_GATE=off to disable), a routing-policy skill, and a slash command for every verb (each runs the bundled CLI via ${CLAUDE_PLUGIN_ROOT}, so there's nothing to install on PATH):
| Command | What it does |
|---|---|
| /ultracost:check [path] | Flag agent() stages that don't pin a model (or pin the wrong tier). Defaults to the most recent workflow script. |
| /ultracost:estimate <script> | Agent count, model mix, tiered cost vs an all-opus baseline. |
| /ultracost:explain <script> | Per-stage rationale: model, effort, the tier the prompt reads like, est cost, check flags. |
| /ultracost:simulate <script> | Cost under all-opus vs your tiered pins vs all-sonnet. |
| /ultracost:diff <a> <b> | Cost delta between two versions of a script. |
| /ultracost:audit [dir] | Pin stats across your real workflow scripts. |
| /ultracost:usage [dir] | Real token cost from local transcripts (main vs subagents vs stages). |
| /ultracost:reconcile [--last\|<id>] | Estimate vs actual per stage for a finished run. |
| /ultracost:calibrate | Tune the estimator from your real token usage. |
| /ultracost:ledger | Cumulative savings vs all-opus across recorded runs. |
| /ultracost:status | How ultracost is delivered (plugin/cli), the policy, and the bypass caveat. |
Requires Claude Code with the /plugin command and dynamic workflows enabled.
npm CLI
npx ultracost init
This writes ~/.claude/ultracost/policy.json, injects the routing block into ~/.claude/CLAUDE.md, installs the re-inject hook (~/.claude/ultracost/reinject.mjs), and registers it on SessionStart in ~/.claude/settings.json. New sessions pick it up immediately. Paths honor CLAUDE_CONFIG_DIR if you've relocated your config. Requires Node ≥ 24.
Then verify a workflow script at any time:
ultracost check ./path/to/workflow.js
Use the npm path for CI/scripting or the CLAUDE.md-injection workflow; for day-to-day use in Claude Code, the plugin above is simpler.
Uninstall
Plugin
/plugin uninstall ultracost@ultracost
/plugin marketplace remove ultracost
The plugin touches none of your own files, so removing it removes everything ultracost added.
npm CLI
ultracost uninstall
Reverses everything init did: removes the routing block from ~/.claude/CLAUDE.md, deletes ~/.claude/ultracost/, and unregisters the hook from ~/.claude/settings.json (an invalid settings.json is reported, never overwritten).
Quickstart (npm CLI)
Inside Claude Code, every verb below is also a slash command: /ultracost:estimate, /ultracost:reconcile, etc. (see the table above). This section is the npm path for CI, scripting, or the CLAUDE.md-injection workflow.
ultracost init # install policy + rules + hook (refuses if the plugin already delivers it)
ultracost status # active policy + how it's delivered (plugin/cli) + bypass caveat
ultracost audit ~/.claude/projects # pin stats across your real workflow scripts
ultracost check ./path/to/workflow # scan a workflow script (or a directory)
ultracost check . --fix # auto-pin the default model on unpinned stages
ultracost estimate ./workflow.js # agents, model mix, and cost vs all-opus baseline
ultracost explain ./workflow.js # per-stage rationale + which checks fire
ultracost reconcile --last # estimate vs actual for your latest real run
ultracost calibrate # tune the estimator from your real token usage
ultracost ledger # cumulative savings vs all-opus
ultracost pricing refresh # update prices from Anthropic's official page
Point check at the script Claude wrote (its path is printed when a run starts, under ~/.claude/projects/), or wire it into CI.
Cost estimate + dynamic effort + pre-flight gate
Beyond routing, ultracost estimates a workflow's cost before it runs, has Claude pick a per-stage effort level (low to xhigh), and gates the launch so you can approve, cancel, or restructure it.
$ ultracost estimate ./workflow.js
agents 4 fixed + 1 fan-out group(s) x ~5 = ~9
model mix 3x opus, 6x sonnet
baseline (all opus) $0.9000
tiered (ultracost) $0.5304
savings $0.3696 (41%)
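The savings line is plain arithmetic on the two totals. A minimal sketch, assuming the percentage is rounded to the nearest whole number (`savings` is a hypothetical helper, not part of the ultracost API):

```javascript
// Hypothetical helper: reproduce the "savings" line from two cost totals.
function savings(baseline, tiered) {
  const saved = baseline - tiered;
  return {
    dollars: saved.toFixed(4),                     // four decimals, as in the output above
    percent: Math.round((saved / baseline) * 100), // share of the all-opus baseline
  };
}

const s = savings(0.9, 0.5304);
console.log(`savings $${s.dollars} (${s.percent}%)`); // savings $0.3696 (41%)
```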
- Pricing is official-sourced. Prices live in policy.json with a _source URL and _asOf date; ultracost pricing refresh re-fetches Anthropic's official pricing page and updates them. The estimate itself runs offline (no network on the hot path).
- Dynamic effort. Each stage gets the lowest effort that fits (low/medium/high/xhigh), bounded by model (sonnet up to high, opus up to xhigh). Effort feeds the estimate.
- Pre-flight gate (on by default, hard in every mode). The plugin ships a deterministic PreToolUse hook on the Workflow tool that hard-stops every dynamic-workflow launch: it runs the guard + estimate and leads with ⚠ N/M stage(s) NOT pinned -> will inherit Opus when stages are unpinned, so an accidental all-Opus fan-out can't slip by. It is mode-aware: it asks (with the estimate) in default/acceptEdits/auto, and auto-denies an unpinned workflow in bypassPermissions/dontAsk, where an ask wouldn't pause. It therefore holds in every permission mode, not just when the model chooses to ask. ULTRACOST_GATE=strict denies on any problem everywhere; =ask never escalates; =off disables it (headless/CI). On top of that, the policy has Claude run ultracost estimate and offer the Approve / Cancel / Modify menu via AskUserQuestion.
Estimates are relative (tiered vs all-opus), not a bill; fan-outs are ranges; the interactive 3-option menu needs a TUI. Full detail, assumptions, and the gate's #52343 limitation are in docs/ESTIMATES.md.
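The gate's mode-aware behavior can be read as a small decision function. The sketch below is an interpretation, not the hook's code: `gateDecision` and the `'default'` gate value are hypothetical, only unpinned stages are modeled as a "problem", and the text does not say what happens to a fully pinned workflow in bypassPermissions/dontAsk, so the sketch returns `'ask'` there as a placeholder.

```javascript
// Sketch of the mode-aware decision described above (an interpretation, not the hook).
// Mode names and ULTRACOST_GATE values come from the text; everything else is assumed.
const NON_PAUSING_MODES = new Set(['bypassPermissions', 'dontAsk']);

function gateDecision({ mode, unpinnedStages, gate = 'default' }) {
  if (gate === 'off') return 'allow';                  // gate disabled (headless/CI)
  const hasProblem = unpinnedStages > 0;
  if (gate === 'strict' && hasProblem) return 'deny';  // deny on any problem, in every mode
  if (gate === 'ask') return 'ask';                    // never escalates to a deny
  // Default behavior: deny where an ask would not pause, otherwise ask with the estimate.
  if (hasProblem && NON_PAUSING_MODES.has(mode)) return 'deny';
  return 'ask';
}

console.log(gateDecision({ mode: 'default', unpinnedStages: 2 }));           // ask
console.log(gateDecision({ mode: 'bypassPermissions', unpinnedStages: 2 })); // deny
```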
The closed loop: measure, reconcile, calibrate
ultracost doesn't just estimate — it reads its own results back and tunes itself. It parses your local Claude Code transcripts (offline; no network, no telemetry) and attributes tokens per dynamic-workflow stage via the subagents/workflows/wf_*/agent-*.jsonl files Claude Code writes. No other router does this.
ultracost usage # real token cost: main loop vs subagents vs workflow stages
ultracost reconcile --last # estimate vs ACTUAL, per stage, for your latest workflow run
ultracost calibrate # learn a token prior from your real runs (estimate uses it)
ultracost ledger # cumulative $ saved vs an all-opus baseline, persisted
In Claude Code these are /ultracost:usage, /ultracost:reconcile, /ultracost:calibrate, and /ultracost:ledger — same output, no CLI install.
- Self-calibrating. calibrate learns real per-stage token sizes (outlier-filtered) into ~/.claude/ultracost/calibration.json; estimate, explain, simulate, and the gate use it automatically, so the estimate gets closer to your reality every run.
- Savings ledger. ledger keeps a running tally of what the policy saved you versus running everything on Opus, persisted in ~/.claude/ultracost/ledger.jsonl (idempotent per run).
- Pre-flight budget guard. Set budget.perRun/budget.perDay in the policy and the cost gate denies a launch whose estimate would blow the cap, before it runs.
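A budget check of this kind might look like the following. It is a sketch under stated assumptions: `withinBudget` is a hypothetical name, the caps are assumed to be in dollars, and the way the daily total is tracked (`spentToday`) is invented for illustration. Only the field names budget.perRun and budget.perDay come from the text.

```javascript
// Hypothetical sketch of a pre-flight budget check.
// budget.perRun and budget.perDay are named in the text; the rest is assumed.
function withinBudget(policy, estimate, spentToday = 0) {
  const { perRun = Infinity, perDay = Infinity } = policy.budget ?? {};
  if (estimate > perRun) return { ok: false, reason: 'perRun' };
  if (spentToday + estimate > perDay) return { ok: false, reason: 'perDay' };
  return { ok: true };
}

const policy = { budget: { perRun: 2, perDay: 10 } }; // illustrative caps
console.log(withinBudget(policy, 0.53));   // { ok: true }
console.log(withinBudget(policy, 3));      // { ok: false, reason: 'perRun' }
console.log(withinBudget(policy, 1.5, 9)); // { ok: false, reason: 'perDay' }
```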
Understand and compare a workflow
ultracost explain ./wf.js # per-stage: tier, effort, est cost, and which UC checks fire
ultracost simulate ./wf.js # cost under all-opus vs your tiered pins vs all-sonnet
ultracost diff old.js new.js # cost delta between two versions (--ci → PR-comment table)
Or /ultracost:explain, /ultracost:simulate, /ultracost:diff inside Claude Code.
How routing is decided
| Tier | Model | Use for |
|---|---|---|
| opus | claude-opus-4-8 @ xhigh | writing/refactoring/debugging code, design & architecture, security/perf, tests that need judgment, planning, synthesis |
| sonnet | claude-sonnet-4-6 @ high | applying a decided edit across files, search/grep, running tests, git ops, docs, gathering context |
Decision rule: if the stage must decide how to change code → opus. If the how is already planned and it just executes → sonnet. When in doubt → opus. Never haiku.
This is opinionated and quality-first by design. If you want a cost-first split, edit the policy (below).
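The decision rule fits in a few lines. In this sketch `pickTier` and the `decidesHow` flag are hypothetical names; `true`, `false`, and `undefined` stand for "decides how", "only executes", and "in doubt".

```javascript
// The decision rule above as a function (hypothetical names).
function pickTier(stage) {
  // The "how" is already planned and the stage only executes it.
  if (stage.decidesHow === false) return 'sonnet';
  // The stage decides how to change code, or we are in doubt.
  return 'opus';
}

console.log(pickTier({ decidesHow: true }));  // opus
console.log(pickTier({ decidesHow: false })); // sonnet
console.log(pickTier({}));                    // opus
```

The function has no branch that returns haiku, which mirrors the "never haiku" part of the rule.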
The Workflow Guard
$ ultracost check ./wf.js
wf.js:2:15 UC001 stage has no options object — add { model: ... } so it does not inherit the session model
wf.js:3:14 UC002 stage options object has no model — will inherit the session model
wf.js:4:13 UC003 stage pins banned model "haiku" (policy.neverUse)
3 error(s), 0 warning(s) in 1 file(s).
| Code | Meaning |
|---|---|
| UC001 | agent(x) with no options object |
| UC002 | options object present, no model |
| UC003 | model resolves to a banned model (e.g. haiku) |
| UC004 | model: 'inherit' while allowInherit is false |
| UC005 | model/options is a dynamic expression, can't verify (warning) |
| UC006 | the pinned model mismatches the work the prompt describes (warning) |
| UC007 | effort exceeds the model's cap, e.g. sonnet @ xhigh (warning) |
| UC008 | an alwaysOpus role (orchestrator, consolidation, …) pins a cheaper tier (warning) |
The scanner runs on a hand-rolled, zero-dependency JS tokenizer, so it's robust to template literals, spreads, optional-call agent?.(), and dynamic model values, and an agent( inside a prompt string or comment is treated as prose, never a call. Fan-out detection covers .map/.flatMap/.forEach/for…of/Promise.all/Array.from/pipeline. Flags: --json for CI, --fix to auto-insert the default model on the unambiguous cases (UC001/UC002), --quiet to print only the problems. UC006–UC008 are advisory warnings and never fail the build on their own; the exit code is non-zero only when pin-presence errors (UC001–UC004) are found.
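To make the codes concrete, here is a sketch that maps an already-parsed stage to a subset of them. It is not the guard's implementation: the stage shape (`hasOptions`, `model`, `effort`) and the helper names are hypothetical, the real tool derives this information from source text with its tokenizer, and the exit value `1` is an assumption (the text only says non-zero).

```javascript
// Sketch: map an already-parsed stage to UC codes (hypothetical stage shape).
const EFFORT_ORDER = ['low', 'medium', 'high', 'xhigh'];
const EFFORT_CAP = { sonnet: 'high', opus: 'xhigh' };

function checkStage(stage, policy) {
  if (!stage.hasOptions) return ['UC001'];                     // no options object
  if (stage.model === undefined) return ['UC002'];             // options, but no model
  if (policy.neverUse.includes(stage.model)) return ['UC003']; // banned model
  if (stage.model === 'inherit' && !policy.allowInherit) return ['UC004'];
  const codes = [];
  const cap = EFFORT_CAP[stage.model];
  if (cap && EFFORT_ORDER.indexOf(stage.effort) > EFFORT_ORDER.indexOf(cap)) {
    codes.push('UC007');                                       // effort above the model's cap
  }
  return codes;
}

// Only pin-presence errors (UC001-UC004) make the exit code non-zero.
const ERRORS = new Set(['UC001', 'UC002', 'UC003', 'UC004']);
const exitCode = (codes) => (codes.some((c) => ERRORS.has(c)) ? 1 : 0);

const policy = { neverUse: ['haiku'], allowInherit: false };
console.log(checkStage({ hasOptions: false }, policy));                                  // [ 'UC001' ]
console.log(checkStage({ hasOptions: true, model: 'haiku' }, policy));                   // [ 'UC003' ]
console.log(checkStage({ hasOptions: true, model: 'sonnet', effort: 'xhigh' }, policy)); // [ 'UC007' ]
```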
Audit your history
ultracost audit [dir] scans <dir>/**/workflows/scripts/*.js (default ~/.claude/projects) and reports the totals — how many stages exist and how many would silently inherit the session model:
$ ultracost audit ~/.claude/projects
ultracost audit
scanned 22 script(s) under ~/.claude/projects
agent() stages 137
pinned 4
unpinned 128 (UC001/UC002 — inherit the session model)
banned 0 (UC003)
inherit 1 (UC004)
dynamic 4 (UC005 — options is a variable)
unpinned ratio 93.4%
Numbers above are illustrative; run it to see your own.
--json emits the totals for dashboards or CI.
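In the illustrative output, the ratio matches unpinned stages divided by all agent() stages (128 / 137). That reading is inferred from the numbers shown, not from documented behavior, and `unpinnedRatio` is a hypothetical helper:

```javascript
// Hypothetical helper: unpinned stages as a share of all agent() stages.
const unpinnedRatio = (unpinned, total) => ((unpinned / total) * 100).toFixed(1) + '%';

console.log(unpinnedRatio(128, 137)); // 93.4%
```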
Customizing the policy
Edit ~/.claude/ultracost/policy.json, then re-run ultracost init to recompile the rules:
{
"neverUse": ["haiku"],
"allowInherit": false,
"default": "opus",
"tieBreaker": "opus",
"tiers": {
"opus": { "model": "opus", "effort": "xhigh" },
"sonnet": { "model": "sonnet", "effort": "high" }
},
"alwaysOpus": ["orchestrator", "planner", "final-synthesis"]
}
See docs/policy.md for the full reference.
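As an illustration of how these fields interact, the sketch below turns a tier request into the options a stage would pin. `pinFor` and its `role`/`tier` arguments are hypothetical; only the field names follow the JSON above, and the treatment of alwaysOpus roles is an interpretation based on the UC008 check.

```javascript
// Sketch: resolve a tier request against the policy (hypothetical helper).
const policy = {
  neverUse: ['haiku'],
  allowInherit: false,
  default: 'opus',
  tiers: {
    opus: { model: 'opus', effort: 'xhigh' },
    sonnet: { model: 'sonnet', effort: 'high' },
  },
  alwaysOpus: ['orchestrator', 'planner', 'final-synthesis'],
};

function pinFor(policy, { role, tier } = {}) {
  // alwaysOpus roles stay on opus regardless of the requested tier.
  const chosen = policy.alwaysOpus.includes(role) ? 'opus' : (tier ?? policy.default);
  const pin = policy.tiers[chosen];
  if (!pin || policy.neverUse.includes(pin.model)) {
    throw new Error(`tier "${chosen}" is not allowed by the policy`);
  }
  return { model: pin.model, effort: pin.effort };
}

console.log(pinFor(policy, { role: 'search', tier: 'sonnet' }));  // { model: 'sonnet', effort: 'high' }
console.log(pinFor(policy, { role: 'planner', tier: 'sonnet' })); // { model: 'opus', effort: 'xhigh' }
console.log(pinFor(policy));                                      // { model: 'opus', effort: 'xhigh' }
```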
Use in CI
- run: npx ultracost check . --json
Fails the build if any committed workflow script has a stage that would inherit the session model.
How it compares
ultracost is intentionally narrow. General-purpose routers (claude-router, claude-smart-router, claude-model-changer, model-matchmaker) score every prompt and route the main loop at runtime. Linters like claudelint validate a file-based agent's model: value. ultracost targets the dynamic-workflow / ultracode path and is, as far as we can tell, the only tool that statically detects an unpinned inline agent()/pipeline() stage, flags a pin that mismatches the work the prompt describes, and reconciles its own cost estimate against real per-stage token usage. Cost tooling like ccusage, tokencast, and tokentoll informed the transcript-parsing, calibration, and cost-diff approaches (reimplemented clean-room). See NOTICE for prior-art credits.
Documentation
- Showcase: a live ultracode run (policy injection → guard → cost gate → confirm), end to end, unprompted
- Architecture
- Policy reference
- Why ultracode needs this
- Testing guide — sandbox, plugin, npm, and live Claude Code CLI checks
- Publishing & recognition — marketplaces, awesome lists, launch
Versioning & releases
Semantic versioning. See CHANGELOG.md. Tagged releases (vX.Y.Z) publish to npm and GitHub Releases via CI.
Configured for GitHub danielkremen818/ultracost. If you fork it, update the handle in the install commands and badges, package.json, CHANGELOG.md, and .claude-plugin/plugin.json. See docs/PUBLISHING.md for the full pre-publish checklist.
License
MIT © Daniel Kremen. Clean-room implementation; prior art credited in NOTICE.