Tokenomy
Health: Warn

- License — MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars

Code: Fail

- `rm -rf` — Recursive force-deletion command in `package.json`
- `rm -rf` — Recursive force-deletion command in `scripts/build.sh`
- `exec()` — Shell command execution in `src/analyze/miner.ts`

Permissions: Pass

- Permissions — No dangerous permissions requested
This tool is a hook and analysis toolkit for AI coding agents. It transparently trims bloated MCP responses, limits oversized file reads, and deduplicates repeated tool calls to help agents spend tokens on reasoning rather than parsing large amounts of unnecessary data.
Security Assessment
The overall risk is rated as High. The tool requests no dangerous permissions, but the underlying code exhibits several alarming behaviors. The package contains hardcoded `rm -rf` recursive force deletion commands in both the `package.json` file and the `scripts/build.sh` script. Additionally, shell command execution functions were detected in the source code. While the tool is designed to intercept and modify shell commands, developers should be extremely careful, as unsanitized or unexpected command execution could easily lead to accidental data loss on the host system.
Quality Assessment
The project uses the standard MIT license and was updated very recently, indicating active development. However, it currently suffers from extremely low community visibility, having accumulated only 5 GitHub stars. While the project claims to have hundreds of passing tests, the lack of community adoption means the tool has not been widely vetted by external security researchers or developers.
Verdict
Use with caution—wait for broader community adoption and further security audits before integrating this into sensitive environments.
Tokenomy — A surgical toolkit for AI coding CLIs. Cut tokens, keep context, build faster.
Stop burning tokens on tool chatter.
A surgical hook + analysis toolkit for Claude Code, Codex CLI, Cursor, Windsurf, Cline, and Gemini that transparently trims bloated MCP responses, clamps oversized file reads, bounds verbose shell commands, dedupes repeat calls, compresses agent memory files, and benchmarks waste — so your agent spends tokens on thinking, not on parsing 40 KB of Jira JSON for the third time.
🩻 The pain
Your last long coding-agent session — Claude Code or Codex — looked something like this:
```
Assistant → get_jira_issue        ← 40 KB
Assistant → search_jira_jql       ← 85 KB
Assistant → read src/server.ts    ← 18 KB (2000 lines)
Assistant → read src/server.ts    ← 18 KB (same file, again)
Assistant → read src/config.ts    ← 14 KB
...
```
200 K tokens gone before the agent did any real work. Compaction kicks in too late, and it's lossy.
Tokenomy plugs the holes the agent hook contracts let you close — with zero proxy, no monkey-patching:
| Surface | Works with | Mechanism | What it kills |
|---|---|---|---|
| `PostToolUse` on `mcp__.*` | Claude Code | `updatedMCPToolOutput` (multi-stage: redact → stacktrace collapse → schema-aware profile → byte trim) | 10–50 KB MCP responses from Atlassian, Notion, Gmail, Asana, HubSpot, Intercom… |
| `PostToolUse` on `Bash` | Claude Code | stacktrace frame compressor | Deep Jest/Pytest/Java/Rust/Go failure traces after tests/lints fail |
| `PostToolUse` on `mcp__.*` | Claude Code | duplicate-response dedup (per session) | Repeated identical tool calls — second hit returns a pointer stub, not a 30 KB refetch |
| `PreToolUse` on `Read` | Claude Code | `updatedInput` | Unbounded reads on huge source files |
| `PreToolUse` on `Bash` | Claude Code | rewrites `tool_input.command` | Unbounded verbose shell output — `git log`, `find`, `ls -R`, `ps aux`, `docker logs`, `journalctl`, `tree` |
| `PreToolUse` on `Write` | Claude Code | `additionalContext` | Reinventing existing repo work or mature utility libraries from scratch |
| `UserPromptSubmit` (alpha.22+) | Claude Code | `additionalContext` on every user turn | Agent planning without first checking the graph for existing code, blast radius, or maintained libraries |
| `SessionStart` (beta.1+, Golem) | Claude Code | `additionalContext` once per session + per-turn reinforcement | Verbose assistant replies — output tokens cost 5× input on Sonnet |
| `SessionStart` + `UserPromptSubmit` hooks | Codex CLI (experimental) | user-scoped `~/.codex/hooks.json` + `features.codex_hooks` | Golem output rules + prompt-classifier nudges in Codex sessions |
| `tokenomy-graph` MCP server | Claude Code · Codex CLI | graph + Raven tools over stdio | Brute-force Read sweeps of the codebase — agent gets focused context from a pre-built graph + repo/branch/package alternatives |
| `tokenomy raven` (beta.4+) | Claude Code primary + Codex CLI reviewer | opt-in handoff packets, review recording, deterministic compare, PR readiness | Manual copy/paste between agents — Claude briefs Codex with a compact packet; Codex records review; compare + pr-check gate merge without full transcript reads |
| `tokenomy compress` (beta.2+) | Any repo | deterministic agent-rule file cleanup + optional local Claude rewrite | Bloated CLAUDE.md, AGENTS.md, .cursor/rules, .windsurf/rules loaded every session |
| `tokenomy status-line` (beta.2+) | Claude Code | `settings.json.statusLine` command | Invisible installs — shows active state, Golem mode, graph freshness, and today's savings |
| `tokenomy bench` (beta.2+) | CLI | deterministic scenario runner | Reproducible savings tables for README/release notes |
| `tokenomy analyze` | Claude Code · Codex CLI transcripts | walks `~/.claude/projects/**/*.jsonl` + `~/.codex/sessions/**/*.jsonl`, replays Tokenomy rules with a real tokenizer | Tells you exactly how much you've been wasting, by tool, by day, by rule |
| `tokenomy diff` (beta.3+) | CLI | replay one historical call through the rule stack with per-rule attribution | Debug why a tool call saved (or didn't save) what you expected |
| `tokenomy learn` (beta.3+) | CLI | mines `savings.jsonl` → proposes config patches | Project-specific tuning without hand-writing config |
| `tokenomy budget` (beta.3+) | Claude Code `PreToolUse` | advisory `additionalContext` when a call would push the session past a token cap | Runaway session cost — warns before the bloated call, not after |
| Pre-call redact (beta.3+) | Claude Code `PreToolUse` on Bash/Write/Edit | redacts secrets in Bash headers/URLs and Write/Edit content; warns on bare-arg credentials | Closes the symmetric leak — secrets in inputs no longer reach the assistant |
| Golem auto-tune (beta.3+) | Claude Code | `tokenomy analyze --tune` picks a Golem mode from your real reply-size p95 | Right-sized terseness without tuning by hand |
| `tokenomy ci` (beta.3+) | GitHub Actions | `action.yml` + `tokenomy ci format` | Per-PR token-waste summary comment straight from CI |
Each live trim appends a row to `~/.tokenomy/savings.jsonl` with measured bytes-in / bytes-out, so you can prove the savings. Run `tokenomy report` for a TUI + HTML digest, or `tokenomy analyze` to benchmark real historical waste from session transcripts.
✨ Features
Tokenomy is organized around live hooks, graph retrieval, agent nudges, compression tools, and observability. Install once, and every Claude Code hook starts working across sessions; graph MCP registration also works across supported agents.
🔻 Live token trimming
Automatic response shrinking the moment a tool call finishes (or is about to fire). Zero config to get started.
- MCP response trimming — schema-aware profiles for Atlassian, Linear, Slack, Gmail, GitHub, Notion; generic redact + stacktrace collapse + byte-trim fallback for the long tail. Unknown top-level keys flow through untouched.
- MCP dedup — identical MCP responses within a 30-minute session window return a pointer stub instead of a full refetch.
- Read clamp — an unbounded `Read` on a large source file gets rewritten to an explicit `limit: N` with an `additionalContext` note so the agent can offset-`Read` further regions on demand. Self-contained docs (`.md`, `.rst`, `.txt`, `.adoc`) below the doc-passthrough threshold skip clamping entirely.
- Bash input-bounder — verbose shell invocations (`git log`, `find`, `ls -R`, `ps aux`, `docker logs`, `journalctl`, `kubectl logs`, `tree`) get rewritten to `set -o pipefail; <cmd> | awk 'NR<=N'`. Exit codes preserved; exit-status-sensitive commands, streaming forms, destructive `find` actions, and user-piped compound commands all pass through.
- Bash stacktrace compressor (beta.2+) — failed test/lint output from `Bash` gets semantic frame trimming when it contains at least `mcp.shell_trace_trim.min_frames_to_trigger` frames (default 6) and still passes the normal savings gate (`gate.min_saved_bytes`, default 4000). Small traces intentionally pass through unchanged so readable failures stay readable.
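The bounding rewrite is easy to sanity-check in any bash-compatible shell: awk reads the producer's entire output (no SIGPIPE kill, unlike `head`), while `pipefail` keeps the producer's exit code visible. A minimal sketch, using `seq` and a failing `sh -c` as stand-ins for verbose commands:

```shell
# Sketch of the documented rewrite: <cmd> | awk 'NR<=N' (N=5 here, 200 in the rule).
seq 1 100000 | awk 'NR<=5'   # only the first 5 lines survive; seq is never SIGPIPE-killed

# pipefail surfaces the producer's failure even though awk exits 0:
code=0
( set -o pipefail; sh -c 'echo boom; exit 3' | awk 'NR<=5' >/dev/null ) || code=$?
echo "producer exit code: $code"   # → producer exit code: 3
```

Without `pipefail`, the pipeline's status would be awk's 0 and the agent would miss the failure.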
🎯 Agent nudges (redirect waste before it happens)
Nudges cost nothing if the agent was going to do the right thing anyway — and save 10–50 k tokens per occurrence when it wasn't.
- OSS-alternatives Write nudge (alpha.18+) — when Claude creates a new file in a utility-ish path (`src/utils/**`, `src/lib/**`, `pkg/**`, `internal/**`, etc.) above 500 B, Tokenomy recommends `find_oss_alternatives` first. The agent checks this repo + local branches + npm / PyPI / pkg.go.dev / Maven Central before writing anything.
- Golem (beta.1+) — opt-in terse-output-mode plugin. Injects deterministic style rules at `SessionStart` and reinforces per-turn. Four modes: `lite` (drop hedging) → `full` (declarative sentences) → `ultra` (max 3 lines, single-word confirmations) → `grunt` (fragments, dropped articles/pronouns, occasional "ship it." / "nope." / "aye." — caveman-adjacent energy). Safety gates always preserve code, commands, warnings, and numerical results verbatim. Enable: `tokenomy golem enable [--mode=lite|full|ultra|grunt]`. Off by default; attacks the one token surface the other features leave alone — assistant output.
- Prompt-classifier nudge (alpha.22+) — fires once per user turn, before Claude plans. Classifies intent and points at the right graph tool:
  - `build | implement | add | create` → `find_oss_alternatives`
  - `refactor | rename | migrate | extract | replace` → `find_usages` + `get_impact_radius`
  - `remove | delete | drop | deprecate` → `get_impact_radius`
  - `review | audit | blast radius | what changed` → `get_review_context`

  Conservative by design: skips confirmations under 20 chars, short-circuits if the prompt already names a graph tool, and graph-dependent intents only fire when a graph snapshot exists for the repo.
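The routing reads naturally as a keyword match; this hypothetical `classify_prompt` helper sketches the mapping only (the real classifier also applies the conservative gates described above):

```shell
# Toy version of the prompt-classifier routing (illustrative, not the
# real implementation): map a user prompt to a graph-tool hint.
classify_prompt() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *build*|*implement*|*add*|*create*)                 echo "find_oss_alternatives" ;;
    *refactor*|*rename*|*migrate*|*extract*|*replace*)  echo "find_usages + get_impact_radius" ;;
    *remove*|*delete*|*drop*|*deprecate*)               echo "get_impact_radius" ;;
    *review*|*audit*|*"blast radius"*|*"what changed"*) echo "get_review_context" ;;
    *) echo "" ;;   # no match → no nudge (conservative default)
  esac
}

classify_prompt "please refactor the auth module"   # → find_usages + get_impact_radius
classify_prompt "ok"                                # → prints an empty line (no nudge)
```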
- Raven bridge (unreleased) — opt-in Claude-first bridge for teams that also have Codex CLI. Enable with `tokenomy raven enable`; Tokenomy refuses to enable it when `codex` is missing on `PATH`, so the feature cannot half-turn-on unless `raven.requires_codex` is explicitly set false. Once enabled, Claude Code gets session and per-turn guidance for review/handoff/readiness prompts, while the MCP tools create compact handoff packets, record agent reviews, compare findings, and report PR readiness. The prompt matcher is intentionally broad, so occasional false-positive review reminders cost one line and do not block tool use. No automatic agent subprocesses and no worktree dispatch in v1.
🧠 Code-graph MCP (surgical context retrieval)
tokenomy-graph — a stdio MCP server exposing graph tools that replace brute-force Read sweeps of the codebase, plus Raven handoff/review tools when Raven is enabled. Works with both Claude Code and Codex CLI.
| Tool | What it does | Budget |
|---|---|---|
| `build_or_update_graph` | Build or refresh the local code graph for the current repo | 4 KB |
| `get_minimal_context` | Smallest useful neighborhood around a file or symbol | 8 KB |
| `get_impact_radius` | Reverse deps + suggested tests for changed files or symbols | 16 KB |
| `get_review_context` | Ranked hotspots + fanout across changed files | 4 KB |
| `find_usages` | Direct callers, references, importers of a file or symbol | 16 KB |
| `find_oss_alternatives` | Repo + branch + package-registry search with distinct-token ranking | 8 KB |
| `create_handoff_packet` | Create a compact Raven packet with git diff, graph context, and session hints | 8 KB |
| `read_handoff_packet` | Read the latest or named Raven packet | 8 KB |
| `record_agent_review` | Persist Claude/Codex/human review findings for a packet | 4 KB |
| `list_agent_reviews` | List reviews recorded against a packet | 8 KB |
| `compare_agent_reviews` | Deterministically match findings and surface disagreements | 8 KB |
| `get_pr_readiness` | Apply Raven's merge verdict rules: no, risky, or yes | 8 KB |
| `record_decision` | Persist the human merge/fix/investigate decision with rationale | 4 KB |
TypeScript / JavaScript AST via the TS compiler (no type checker); `tsconfig.paths` / `jsconfig.paths` resolved (alpha.17+) so `@/hooks/foo` and friends link to real source files on Next.js, Vite, Nuxt, monorepos. Read-side auto-refresh (alpha.15+) rebuilds on demand when files change between queries. Fail-open everywhere.
📊 Observability + retrospection
- `~/.tokenomy/savings.jsonl` — every trim and nudge appends a row with measured bytes-in / bytes-out and estimated tokens saved. Tail it: `tail -f ~/.tokenomy/savings.jsonl`.
- `tokenomy report` — TUI + HTML digest of recent trims grouped by tool, by reason, by day. Answers "am I actually saving tokens?"
- `tokenomy analyze` — walks `~/.claude/projects/**/*.jsonl` and `~/.codex/sessions/**/*.jsonl`, replays Tokenomy rules against real historical transcripts with a real tokenizer, surfaces patterns like Wasted-probe incidents. Beta.3+: adds per-tool p50/p95 latency columns and writes `~/.tokenomy/analyze-cache.json` for the budget rule. Pass `--tune` to write `golem-tune.json` too. Answers "where have I been wasting tokens all along?"
- `tokenomy compress` — deterministic cleanup for agent instruction files. Preserves frontmatter, fenced/indented code, URLs, and command examples; `--in-place` writes a mandatory `.original.md` backup, and `restore` swaps it back.
- `tokenomy bench` — deterministic benchmark runner with JSON and markdown output for reproducible release/README tables.
- `tokenomy doctor` — health checks covering hook install, settings.json integrity, statusline registration, manifest drift, MCP registration, hook perf budget, and agent detection. Run anytime; runs automatically on every `tokenomy update`.
- `tokenomy update` — single-command self-update: runs `npm install -g tokenomy@latest`, re-stages the hook, re-registers the graph MCP (alpha.20+) if one was previously configured. Config and logs preserved.
🆕 Beta.3 additions
- `tokenomy diff` (beta.3+) — replay one historical tool call through the current rule stack and show the per-rule savings breakdown. Selectors: `--call-key <sha>` / `--tool <name> [--grep <str>]` / `--session <id> --index <N>`. Output includes a capped response preview so you can see exactly what the rule saw.
- `tokenomy learn` (beta.3+) — mines `~/.tokenomy/savings.jsonl` and proposes config patches (new `bash.custom_verbose` entries, raise `read.injected_limit`, enable `redact.pre_tool_use`). Read-only by default; `--apply` writes `~/.tokenomy/config.json` with a timestamped backup.
- Pre-call redact (beta.3+) — extends secret redaction to `PreToolUse` on `Bash`/`Write`/`Edit`. Bash header/URL secrets are redacted and warned; bare-arg secrets are warn-only so we don't mangle intentional keys in scripts. Multi-line headers (`\`-continued) are handled. Opt-in via `cfg.redact.pre_tool_use: true`.
- Golem auto-tune (beta.3+) — `cfg.golem.mode = "auto"` reads `~/.tokenomy/golem-tune.json` (written by `tokenomy analyze --tune`) and picks a mode from your actual p95 reply size. Thresholds: `<800 B → lite`, `<2 KB → full`, `<5 KB → ultra`, `≥ 5 KB → grunt`.
- Incremental graph updates (beta.3+) — `cfg.graph.incremental: true` enables delta rebuilds that re-parse only stale files + direct importers. Falls back to full rebuild if tsconfig/exclude fingerprints shift or >40% of files changed. Opt-in.
- `tokenomy budget` pre-flight gate (beta.3+) — advisory `PreToolUse` rule that estimates incoming-call response size from the analyze cache and warns via `additionalContext` when the call would push the session past `cfg.budget.session_cap_tokens`. Never rejects. Default off.
- `tokenomy ci` + GitHub Action (beta.3+) — `action.yml` at repo root plus a `tokenomy ci format --input=<analyze.json>` CLI that converts analyze JSON into a PR-ready markdown comment. Inputs validated + HTML-escaped.
- Statusline version + update marker (beta.3+, marker beta.5+) — `tokenomy status-line` shows the version tag inline, e.g. `[Tokenomy v0.1.1-beta.5 · GOLEM-GRUNT · 15.0k saved]`. After a `tokenomy update --check`, a `↑` is appended when a newer release exists on npm (e.g. `v0.1.1-beta.5↑`). The cache at `~/.tokenomy/update-cache.json` is ignored past 14 days.
- Per-tool p95 latency (beta.3+) — `analyze` + `report` show p50/p95 wall-clock latency per tool (from paired `tool_use`/`tool_result` timestamps).
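Read as a step function, the auto-tune thresholds look like the sketch below. The byte values for "2 KB" and "5 KB" are assumed to be 2048 and 5120; the docs don't pin the unit, so treat the boundaries as illustrative:

```shell
# Illustrative mapping of p95 assistant-reply size (bytes) to a Golem mode.
# Mirrors the documented thresholds; 2 KB/5 KB assumed binary (2048/5120).
mode_for_p95() {
  p95=$1
  if   [ "$p95" -lt 800 ];  then echo lite
  elif [ "$p95" -lt 2048 ]; then echo full
  elif [ "$p95" -lt 5120 ]; then echo ultra
  else                           echo grunt
  fi
}

mode_for_p95 500    # → lite
mode_for_p95 4000   # → ultra
mode_for_p95 9000   # → grunt
```

The logic is inverted on purpose: the bigger your historical replies, the terser the mode it picks.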
🪶 Raven — cross-agent handoff + review (beta.4+)
Raven turns Claude Code and Codex CLI into a primary/reviewer pair. Claude works; Codex independently reviews; Raven compares findings and gates merge readiness. No subprocess voodoo — packets and reviews are plain JSON/markdown under ~/.tokenomy/raven/<repo>/, and every downstream check verifies packet.head_sha against the current HEAD.
- `tokenomy raven enable` — opt-in; refuses unless the `codex` CLI is on PATH (override via `cfg.raven.requires_codex = false`). Installs hooks, ensures the Raven store, and registers the Raven MCP tools.
- `tokenomy raven brief [--goal=<text>]` — snapshots git state + graph review context + impact radius into a compact packet. Diffs ranked by hotspot score, per-file + total byte budgets enforced, hot-path files win.
- `tokenomy raven status` / `clean [--keep=<N>] [--older-than=<days>] [--dry-run]` — read-only status; TTL + count prune for old artifacts.
- `tokenomy raven compare` — deterministic matcher (file + line + severity exact, title bigram-Dice ≥ 0.85). No LLM in the loop. Outputs agreements, disagreements (risky-unique high/critical only), `recommended_action: merge | fix-first | investigate`.
- `tokenomy raven pr-check` — verdict rules: `no` on any unresolved `critical` finding / stale HEAD / zero reviews; `risky` on graph stale / dirty tree / high-severity disagreement. Exit code 2 when blocking.
- `tokenomy raven install-commands` — writes `.claude/commands/raven-brief.md` + `raven-pr-check.md`. Never clobbers.
- Agent-facing MCP tools (all budget-clipped, all refuse stale packets): `create_handoff_packet`, `read_handoff_packet`, `record_agent_review`, `list_agent_reviews`, `compare_agent_reviews`, `get_pr_readiness`, `record_decision`.
- Raven in report + analyze (beta.5+) — Raven activity (packets, reviews, comparisons, decisions, last activity, repo count) now surfaces in both `tokenomy report` (TUI + HTML) and `tokenomy analyze`, so the bridge is visible alongside token savings.
Out-of-scope for beta.4: `review --agent` auto-subprocessing (human-in-the-loop flow instead — print the packet path, run Codex in a second terminal, it calls `record_agent_review`) and `dispatch --worktree` (follow-up PR).
💸 Real savings from one dogfood session
Tokenomy, run on its own repo in a fresh Claude Code session (Bash verbose commands, Read on a large file, five graph MCP queries). `tokenomy report` output, verbatim:
```
Window: 2026-04-20T01:32:56Z → 2026-04-20T01:55:10Z (~22 minutes)
Total events: 14
Bytes trimmed: 1,374,113 → 260,000 (−1,114,113)
Tokens saved (est): 285,000
~USD saved: $0.8550

Top tools by tokens saved
  Bash    11 calls   247,500 tok   (bash-bound:git-log / :find / :ls-recursive / :ps)
  Read     3 calls    37,500 tok   (read-clamp on 89 KB package-lock.json + 2 others)
```
~20 K tokens saved per event. A dev spending 4 agent-hours a day reclaims ~$10/week. Numbers come from `~/.tokenomy/savings.jsonl` — reproduce via CONTRIBUTING.md's dogfood playbook.
Zero-touch install via Claude Code
Don't want to read the Quickstart? Paste this into Claude Code and let the agent drive the install end-to-end. It'll install the package, register the graph MCP for this repo, enable Golem in grunt mode, and verify with tokenomy doctor:
```
Install tokenomy for me.
1. Run: npm install -g tokenomy
2. Run: tokenomy init --graph-path "$PWD"
   (this patches ~/.claude/settings.json, registers the tokenomy-graph
   MCP server for this repo, and builds the code graph)
3. Run: tokenomy golem enable --mode=grunt
   (enables terse assistant-reply mode — fragments over sentences,
   safety-gated for code and commands)
4. Run: tokenomy doctor
   (confirm all checks pass)
5. Tell me to fully quit Claude Code (Cmd+Q) and reopen so the new
   hooks + MCP server + SessionStart preamble load cleanly.
After I restart, tokenomy's hooks trim MCP/Bash/Read waste,
the graph MCP answers find_usages / get_impact_radius queries,
and Golem keeps your replies terse.
```
After the agent finishes and you restart Claude Code, run any prompt — you should see `[Tokenomy: …]` in the statusline and terse replies from Golem. To tune Golem down: `tokenomy golem enable --mode=full` (more natural) or `--mode=lite` (subtle). Disable entirely: `tokenomy golem disable`.
⚡ Quickstart
Claude Code (full integration — live hooks + graph + analyze):
```shell
npm install -g tokenomy
tokenomy init     # patches ~/.claude/settings.json (backed up first)
tokenomy doctor   # all checks passing
# restart Claude Code — then use it normally
```
Codex CLI (graph MCP + transcript analysis + experimental prompt/session hooks). `tokenomy init --graph-path` auto-registers the graph MCP server with both agents when each CLI is on your PATH. For Codex, Tokenomy also writes user-scoped `~/.codex/hooks.json` for `SessionStart` and `UserPromptSubmit` and enables `features.codex_hooks`:
```shell
npm install -g tokenomy
tokenomy init --graph-path "$PWD"   # registers tokenomy-graph in both agents AND builds the graph
tokenomy analyze                    # benchmarks ~/.claude + ~/.codex transcripts
```
Codex hook support is intentionally partial: current Codex hooks can inject session/prompt context, so Golem and prompt-classifier nudges work. Claude-style MCP output mutation, Read input mutation, and Bash input rewriting remain Claude Code only until Codex exposes compatible hook contracts.
Since alpha.15, `init --graph-path` builds the graph for you in a single shot; pass `--no-build` to skip (CI, monorepos with pre-built graphs).
Codex-only / manual registration: `codex mcp add tokenomy-graph -- tokenomy graph serve --path "$PWD"`.
Raven bridge (Claude Code primary, Codex CLI reviewer):
```shell
tokenomy raven enable     # requires `codex` on PATH unless raven.requires_codex=false
tokenomy raven brief      # writes the compact packet Claude/Codex can review
tokenomy raven pr-check   # no / risky / yes merge readiness verdict
```
Raven is off by default. Enabling it registers the existing Tokenomy hook/MCP surface and adds Claude Code nudges for review/handoff/readiness turns. It does not auto-run Codex, create worktrees, or install .claude/commands unless you explicitly run tokenomy raven install-commands.
Cross-agent graph install
`tokenomy init --graph-path "$PWD"` auto-detects graph-capable agents and registers `tokenomy-graph` where possible. Force one target with `--agent`, or inspect detection first:

```shell
tokenomy init --list-agents
tokenomy init --agent cursor --graph-path "$PWD"
tokenomy uninstall --agent cursor
```
| Agent | Hooks | Graph MCP | Install target |
|---|---|---|---|
| Claude Code | ✓ | ✓ | ~/.claude/settings.json + ~/.claude.json |
| Codex CLI | partial | ✓ | codex mcp add tokenomy-graph ... + ~/.codex/hooks.json |
| Cursor | — | ✓ | ~/.cursor/mcp.json |
| Windsurf | — | ✓ | ~/.codeium/windsurf/mcp_config.json |
| Cline | — | ✓ | ~/.cline/mcp_settings.json |
| Gemini CLI | — | ✓ | ~/.gemini/settings.json |
Compress agent instruction files
```shell
tokenomy compress status
tokenomy compress CLAUDE.md --diff
tokenomy compress CLAUDE.md --in-place
tokenomy compress /path/to/CLAUDE.md --in-place --force   # explicit outside-cwd override
tokenomy compress restore CLAUDE.md
```
Pre-1.0. Every release is `-beta.N`; breaking changes may land before `1.0.0` (see CHANGELOG). Pin for stability: `npm install -g [email protected]`. Upgrade with one command — `tokenomy update` (runs `npm install -g` + re-stages the hook + is idempotent; config + logs preserved). Check without installing: `tokenomy update --check`. Pin an exact release: `tokenomy [email protected]` or `tokenomy update --version 0.1.1-beta.5`. Bleeding edge: see Development.
Watch trims live — `tail -f ~/.tokenomy/savings.jsonl`:

```
{"ts":"...","tool":"mcp__claude_ai_Atlassian__searchJiraIssuesUsingJql","bytes_in":28520,"bytes_out":2800,"tokens_saved_est":6430,"reason":"mcp-content-trim"}
{"ts":"...","tool":"Read","bytes_in":48920,"bytes_out":15000,"tokens_saved_est":8480,"reason":"read-clamp"}
```
Tally one-liner: `grep -oE '"tokens_saved_est":[0-9]+' ~/.tokenomy/savings.jsonl | awk -F: '{s+=$2;n++}END{printf "trims:%d tokens:%d\n",n,s}'`.
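A grouped variant of the same idea — summing `tokens_saved_est` per `reason` — is a short awk sketch. It assumes only the field names visible in the sample rows above; quote-splitting is a sketch-level JSON parser, fine for these flat rows but not arbitrary JSON:

```shell
# Sum tokens_saved_est per reason from savings.jsonl-style rows.
summarize_by_reason() {
  awk -F'"' '
    /tokens_saved_est/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "reason")           reason = $(i + 2)
        if ($i == "tokens_saved_est") { split($(i + 1), a, /[:,}]/); saved = a[2] }
      }
      sum[reason] += saved
    }
    END { for (r in sum) printf "%-24s %d\n", r, sum[r] }
  '
}

# Run against the live log if it exists:
[ -f ~/.tokenomy/savings.jsonl ] && summarize_by_reason < ~/.tokenomy/savings.jsonl || true
```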
🧠 How it actually works
Claude Code live hooks
```
PreToolUse (Read)      ─► clamp huge files (doc-passthrough for .md/.mdx/.rst/.txt/.adoc)
PreToolUse (Bash)      ─► rewrite `CMD` → `set -o pipefail; CMD | awk 'NR<=N'`
PreToolUse (Write)     ─► nudge toward OSS alternatives before new utility files
PostToolUse (mcp__.*)  ─► dedup → redact → stacktrace → profile → shape-trim → byte-trim
PostToolUse (Bash)     ─► collapse deep test/lint stack traces after savings gate
                          └─ `{_tokenomy:"full"}` in args = passthrough (caller opt-out)

every trim → ~/.tokenomy/savings.jsonl → `tokenomy report` (TUI + HTML)
```
Shared (Claude Code + Codex CLI, stdio)
```
tokenomy-graph MCP ─► 6 tools: build_or_update_graph, get_minimal_context,
                      get_impact_radius, get_review_context, find_usages,
                      find_oss_alternatives
                      (graph tools LRU-cached on `meta.built_at + budget`;
                       find_oss_alternatives uncached — live repo/branch state)

tokenomy analyze   ─► walks ~/.claude/projects + ~/.codex/sessions,
                      replays the full pipeline with a real tokenizer,
                      surfaces Wasted-probe incidents (over-trim failure mode)
```
Under the hood
Multi-stage MCP pipeline (PostToolUse), stage order:
1. Caller opt-out. If `tool_input._tokenomy === "full"`, skip every stage and return passthrough. Note: some strict MCP servers may reject the extra key — fallback is a per-tool `tools: {"<glob>": {disable_profiles: true}}` config override.
2. Duplicate dedup. Same `(tool, canonicalized-args)` seen earlier in-session within `cfg.dedup.window_seconds` → body replaced by a pointer stub. Session-scoped ledger at `~/.tokenomy/dedup/<session>.jsonl`.
3. Secret redactor. Regex sweep for AWS/GitHub/OpenAI/Anthropic/Slack/Stripe/Google keys, JWTs, Bearer tokens, PEM blocks → `[tokenomy: redacted <kind>]`. Force-applies regardless of gate (security > tokens).
4. Stacktrace collapser. Node/Python/Java/Ruby — keeps header + first + last 3 frames.
5. Schema-aware profiles. Keep a curated key-set (title/status/assignee/body/…), trim long strings, mark overflowing arrays. Built-ins for Atlassian Jira (issue, search, transitions, issue-types, projects) + Confluence (page, spaces) + Linear + Slack + Gmail + GitHub; users can register custom profiles. Optional `skip_when?(input)` predicate lets a profile opt out based on caller intent.
6. Shape-aware trim (fallback). If no profile matched and the payload is still over budget, detect homogeneous row arrays (top-level or wrapped in `{transitions|issues|values|results|data|entries|records: [...]}`) and compact per-row instead of blind head+tail. Protects enumeration endpoints from losing rows.
7. Byte-trim fallback. If the total still exceeds `cfg.mcp.max_text_bytes`, head+tail the remaining text with `[tokenomy: elided N bytes]` + a footer hinting Claude how to refetch full detail.
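The redactor stage can be approximated for a single pattern family. This is a sketch only — the `aws-key` kind label and the exact regex are illustrative; the real sweep covers many more key formats:

```shell
# Toy redactor: replace AWS access key IDs (AKIA + 16 uppercase/digit chars)
# with the documented placeholder shape. The kind label "aws-key" is assumed.
redact_aws_keys() {
  sed -E 's/AKIA[0-9A-Z]{16}/[tokenomy: redacted aws-key]/g'
}

echo 'creds: AKIAIOSFODNN7EXAMPLE ok' | redact_aws_keys
# → creds: [tokenomy: redacted aws-key] ok
```

Note the stage's ordering rationale: redaction runs before any trimming and ignores the savings gate, so a secret can never survive just because the response was small.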
Invariants: `content.length` never shrinks, `is_error` flows through, non-text blocks (images, resources) pass untouched, unknown top-level keys preserved. `reason` reports which stages fired (e.g. `redact:3+profile:atlassian-jira-issue`, `shape-trim+mcp-content-trim`).
Read clamp (PreToolUse). Explicit `limit`/`offset` → passthrough. Self-contained docs (`.md`/`.mdx`/`.rst`/`.txt`/`.adoc` under `doc_passthrough_max_bytes`) → passthrough unclamped. Otherwise stat; under threshold → passthrough; over → inject `limit: N` + an `additionalContext` note so the agent knows it can offset-Read more regions.
Bash input-bounder (PreToolUse). Detects unbounded verbose shell invocations (`git log`, `find`, `ls -R`, `ps aux`, `docker logs`, `journalctl`, `kubectl logs`, `tree`) and rewrites to `set -o pipefail; <cmd> | awk 'NR<=200'`. Awk consumes producer output (no SIGPIPE); pipefail preserves exit codes. Excludes exit-status-sensitive commands (`git diff --exit-code`, `npm ls`, `git status --porcelain`), streaming forms (`-f`/`--follow`/`watch`/`top`), and destructive `find` actions (`-exec`, `-delete`). User pipelines, redirections, compound commands (`;`/`&&`/`||`), subshells, and heredocs all pass through. `head_limit` is validated as an integer in [20, 10_000] before interpolation (no command injection).
Bash stacktrace compressor (PostToolUse, beta.2+). Detects Node, Python, Java, Rust, and Go stack frames in Bash output and preserves the assertion/error message, test names, diffs, summaries, first 3 frames, and last 2 frames. It only fires when the trace has at least 6 frames and the saved bytes pass the global trim gate (gate.min_saved_bytes, conservative default 8000 after aggression scaling; base default 4000). This means short or already-readable failures pass through unchanged.
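The keep-first-3/keep-last-2 frame policy is easy to picture on a toy Node-style trace. A sketch (the savings gate, multi-language frame detection, and diff/summary preservation are omitted for clarity):

```shell
# Toy frame compressor: keep the error line, the first 3 and last 2
# "    at ..." frames, and an elision marker between them.
compress_trace() {
  awk '
    /^[[:space:]]+at / { frames[++n] = $0; next }
    { emit(); print }
    END { emit() }
    function emit(  i) {
      if (n == 0) return
      head = (n <= 5) ? n : 3
      for (i = 1; i <= head; i++) print frames[i]
      if (n > 5) {
        printf "    [tokenomy: %d frames elided]\n", n - 5
        print frames[n-1]; print frames[n]
      }
      n = 0
    }
  '
}

# 8-frame trace → error line, f1–f3, "[3 frames elided]", f7–f8:
printf 'Error: boom\n    at f1\n    at f2\n    at f3\n    at f4\n    at f5\n    at f6\n    at f7\n    at f8\n' | compress_trace
```

Traces of 5 frames or fewer pass through whole, matching the documented intent that short failures stay readable.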
OSS-alternatives nudge (PreToolUse Write + MCP). When Claude Code is about to create a new utility-like file (src/utils/**, src/lib/**, pkg/**, internal/**, Java src/main/java/**, etc.) above nudge.write_intercept.min_size_bytes, Tokenomy appends additionalContext recommending mcp__tokenomy-graph__find_oss_alternatives before bespoke implementation. The Write input is unchanged; Tokenomy never blocks the file write. The MCP tool searches the current repo, other local branches, npm, PyPI, pkg.go.dev, and Maven Central based on project files or explicit ecosystems, then returns {query, ecosystems, repo_results, results, summary, hint}. It is fail-open and budget-clipped like the graph tools. Privacy note: the search description/keywords go to public package registries when registry search runs; local repo/branch search stays on the machine. Disable the proactive Write nudge with tokenomy config set nudge.write_intercept.enabled false and avoid the MCP call for sensitive proprietary descriptions.
Prompt-classifier nudge (UserPromptSubmit, alpha.22+). Fires once per user turn, BEFORE Claude plans. Classifies the prompt's intent and injects additionalContext pointing at the right tokenomy-graph MCP tool: build|implement|add|create → find_oss_alternatives; refactor|rename|migrate|extract|replace → find_usages + get_impact_radius; remove|delete|drop|deprecate → get_impact_radius; review|audit|blast radius|what changed → get_review_context. Conservative gates: skips prompts under 20 chars, skips when the prompt already mentions a tokenomy-graph tool, graph-dependent intents only fire when a graph snapshot exists for the repo. Per-intent toggles via nudge.prompt_classifier.intents.{build,change,remove,review}. Disable entirely: tokenomy config set nudge.prompt_classifier.enabled false.
Statusline badge (beta.2+). tokenomy init patches settings.json.statusLine to run tokenomy status-line on every Claude Code UI tick. The command reads today's savings.jsonl, aggregates by tool/reason, and emits a one-liner like [Tokenomy: 4.2k saved | GOLEM-GRUNT | graph fresh]. Must return in < 50 ms — uses a bounded read of the log and no external I/O. Fails open: missing config or parse error → empty string → Claude Code renders nothing, no UI breakage. Pure observability; never affects tool calls.
Benchmark harness (tokenomy bench, beta.2+). Deterministic scenario runner — no live network, all inputs are captured fixtures committed to fixtures/bench/. Six scenarios exercise the full rule set (read-clamp-large-file, bash-verbose-git-log, mcp-atlassian-search, shell-trace-trim, compress-agent-memory, golem-output-mode). Each scenario runs the real rule code against its fixture, measures bytes_in / bytes_out / tokens_saved / wall_ms with a real tokenizer, and emits JSON or markdown. tokenomy bench compare <a.json> <b.json> surfaces per-scenario regressions across runs — used in CI and for reproducible README tables. Saves 145 k tokens ($0.44) on the current fixture set, documented in docs/bench/RUN.md.
Cross-agent install (tokenomy init --agent=<name>, beta.2+). Per-agent adapter files under src/cli/agents/. Detection is non-destructive — each adapter returns {detected, detail, install()} based on CLI-binary presence + config-dir presence. tokenomy init auto-runs every detected adapter; --agent=<name> forces a single target; --list-agents prints the detection table. Every write is atomic (temp file + rename) with a .tokenomy-bak-<ts> backup next to the target. Symmetric uninstall via tokenomy uninstall --agent=<name>. Supported: claude-code (hooks + MCP), codex (MCP via codex mcp add), cursor (~/.cursor/mcp.json), windsurf (~/.codeium/windsurf/mcp_config.json), cline (~/.cline/mcp_settings.json), gemini (~/.gemini/settings.json).
Fail-open is non-negotiable. Malformed stdin / parse errors / unknown shapes → exit 0 with empty stdout. 10 MB stdin cap. 2.5 s internal timeout. Exit code 2 (blocking) never used. Breaking the agent is worse than wasting tokens.
🎛️ Configure
~/.tokenomy/config.json (per-repo override: ./.tokenomy.json). Config changes take effect immediately — only the initial init requires a Claude Code restart.
{
"aggression": "conservative", // ×2 thresholds. balanced=×1, aggressive=×0.5
"gate": {
"always_trim_above_bytes": 40000, // huge responses: always trim
"min_saved_bytes": 4000, // tiny savings: not worth the context-switch
"min_saved_pct": 0.25 // percentage floor otherwise
},
"mcp": {
"max_text_bytes": 16000,
"per_block_head": 4000,
"per_block_tail": 2000,
"profiles": [], // user-defined schema-aware trim profiles
"disabled_profiles": [], // names of built-in profiles to skip
"shape_trim": { // row-preserving fallback for unprofiled inventory responses
"enabled": true,
"max_items": 50,
"max_string_bytes": 200
},
"shell_trace_trim": {
"enabled": true,
"max_preserved_frames_head": 3,
"max_preserved_frames_tail": 2,
"min_frames_to_trigger": 6
}
},
"read": {
"enabled": true,
"clamp_above_bytes": 40000,
"injected_limit": 500,
"doc_passthrough_extensions": [".md", ".mdx", ".rst", ".txt", ".adoc"],
"doc_passthrough_max_bytes": 64000 // docs below this cap skip clamping
},
"redact": {
"enabled": true,
"disabled_patterns": [] // e.g. ["jwt"] if you carry many non-secret JWTs
},
"dedup": {
"enabled": true,
"min_bytes": 2000, // don't dedup tiny responses
"window_seconds": 1800 // dedup repeats within 30 min
},
"nudge": {
"enabled": true,
"oss_search": {
"timeout_ms": 5000, // per-registry search timeout
"min_weekly_downloads": 1000, // npm popularity proxy threshold
"max_results": 5, // hard-capped at 10
"ecosystems": ["npm"] // fallback when project files don't imply one
},
"write_intercept": {
"enabled": true,
"paths": [
"src/utils/**",
"src/util/**",
"src/lib/**",
"src/hooks/**",
"src/helpers/**",
"src/services/**",
"src/parsers/**",
"src/validators/**",
"src/formatters/**",
"src/middleware/**",
"pkg/**",
"internal/**",
"cmd/**",
"**/utils/**",
"**/util/**",
"**/helpers/**",
"**/validators/**",
"src/main/java/**",
"src/test/java/**"
],
"min_size_bytes": 500
}
},
"tools": { // per-tool overrides (glob keys, most-specific wins)
"mcp__Atlassian__*": { "aggression": "aggressive" },
"mcp__Linear__*": { "disable_profiles": true }
},
"perf": { "p95_budget_ms": 50, "sample_size": 100 },
"report": { "price_per_million": 3.0 },
"log_path": "~/.tokenomy/savings.jsonl",
"disabled_tools": []
}
Common tweaks — tokenomy config set <key> <value>:
tokenomy config set aggression aggressive # trim harder
tokenomy config set read.enabled false # leave Read alone
tokenomy config set read.clamp_above_bytes 20000 # clamp 20 KB+ files
tokenomy config set mcp.max_text_bytes 8000 # tighter MCP budget
tokenomy config set dedup.window_seconds 600 # 10-min dedup window
tokenomy config set redact.enabled false # opt out of redaction
tokenomy config set nudge.write_intercept.enabled false # disable Write nudges
tokenomy config set nudge.oss_search.max_results 8 # return more OSS candidates
Caller opt-out. An agent that knows it needs the full response can pass {"_tokenomy": "full", ...real_args} in the MCP tool input — the pipeline skips every stage. Safer fallback: set tools: {"<glob>": {"disable_profiles": true}} for the specific tool (works even with strict MCP servers that reject unknown keys).
🩺 tokenomy doctor
✓ Node >= 20
✓ ~/.claude/settings.json parses
✓ Hook entries present (PostToolUse + PreToolUse)
✓ PreToolUse matcher covers Read + Bash + Write — Read|Bash|Write
✓ Hook binary exists + executable
✓ Smoke spawn hook (empty mcp call) — exit=0 elapsed=74ms
✓ ~/.tokenomy/config.json parses
✓ Log directory writable
✓ Manifest drift — clean
✓ No overlapping mcp__ hook
✓ Graph MCP registration — tokenomy-graph configured in ~/.claude.json
✓ Statusline registered — tokenomy status-line
✓ Agent detection — claude-code, codex, cursor
✓ Graph MCP SDK available
✓ Hook perf budget — p50=5ms p95=12ms max=14ms (n=30, budget 50ms)
16/16 checks passed
Every check has an actionable remediation hint on failure. For routine repair, run tokenomy doctor --fix — it creates the log directory, chmod +x's the hook binary, and re-patches ~/.claude/settings.json on manifest drift.
📊 tokenomy report
Digest of ~/.tokenomy/savings.jsonl:
tokenomy report # TUI + writes ~/.tokenomy/report.html
tokenomy report --since 2026-04-01 # date filter
tokenomy report --top 20 # more tools in the ranking
tokenomy report --json # machine-readable
tokenomy report --out /tmp/report.html # custom HTML path
Window: 2026-04-16T10:00:00Z → 2026-04-17T12:30:00Z
Total events: 4 Bytes trimmed: 185,000 → 32,000 Tokens saved: 38,250 ~USD: $0.1147
Top tools by tokens saved
mcp__Atlassian__getJiraIssue 2× 16,750 tok
Read 1× 15,000 tok
mcp__Slack__slack_read_channel 1× 6,500 tok
By reason
profile 3× 23,250 tok
read-clamp 1× 15,000 tok
HTML variant adds a daily bar chart. Pricing: tokenomy config set report.price_per_million 15 (Opus-input rate).
🔬 tokenomy analyze
Where report shows live-hook savings, analyze walks the raw session transcripts — Claude Code (~/.claude/projects/**/*.jsonl) + Codex (~/.codex/sessions/**/*.jsonl) — replays the full pipeline over every historical tool call with a real tokenizer, and reports what Tokenomy would have saved.
tokenomy analyze # last 30d, top 10 tools
tokenomy analyze --since 7d # last week
tokenomy analyze --project Tokenomy # project-dir substring filter
tokenomy analyze --session 6689da94 # one session
tokenomy analyze --tokenizer tiktoken # accurate cl100k counts (see below)
tokenomy analyze --json # machine-readable
tokenomy analyze --verbose # per-day breakdown
Output: rounded-box header, per-rule savings bars, top-N waste leaderboard, duplicate hotspots, ⚠ Wasted-probe incidents (same tool, ≥3 distinct-arg calls within 60s — surfaces over-trim failure mode), largest individual results, by-day sparkline. --no-color to disable ANSI.
Tokenizers. heuristic (default, zero-dep, ~±10% on code/JSON); tiktoken (real cl100k_base via js-tiktoken — npm i -g js-tiktoken then --tokenizer=tiktoken); auto picks tiktoken if present.
╭──────────── tokenomy analyze ────────────╮
│ Window: 2026-04-17 → 2026-04-18 │
│ Tokenizer: heuristic (approximate) │
╰──────────────────────────────────────────╯
Summary
Files 4 Sessions 2 Tool calls 299 Duplicates 1
Observed 89,562 tok ($0.2687) Tokenomy would save 1,926 (2.2%)
Savings by rule
Duplicate-response dedup ████████████████████████ 1,926 tok
Read clamp ████░░░░░░░░░░░░░░░░░░░░ 324 tok
Duplicate hotspots (same args)
Read 2× 2,263 tok wasted
⚠ Wasted-probe incidents (≥3 distinct-arg calls / 60s)
mcp__Atlassian__getTransitionsForJiraIssue 4× 10:00:00→10:00:45 2,400 tok observed
analyze is read-only. The parser handles Claude Code's assistant.content[].tool_use → user.content[].tool_result pairing and Codex's payload.tool_call rollout shape.
🔄 Update
tokenomy update # install latest + re-stage hook in one shot
tokenomy update --check # query registry, print installed vs remote, exit 1 if out of date
tokenomy update tokenomy@0.1.1-beta.5 # npm-style pin
tokenomy update --version=0.1.1-beta.5 # same, explicit flag
tokenomy update --tag=beta # opt into a non-default dist-tag
Wraps npm install -g tokenomy@<target> and re-runs tokenomy init — the staged hook under ~/.tokenomy/bin/dist/ is a frozen copy taken at init time, so a plain npm upgrade leaves the hook running old code. tokenomy update does both in one call.
Safety:
- Refuses to install over an `npm link`-style dev checkout (your local source would be replaced by the published package). Override with `--force`.
- Refuses downgrades when the resolved version is older than what you have installed — catches cases where a pre-release dist-tag lags `latest`. Override with `--force` (warns loudly before proceeding).
Use --check in CI or a daily cron to get a non-blocking "update available" signal without touching the install.
🛑 Uninstall
tokenomy uninstall --purge
Removes both hook entries from ~/.claude/settings.json (matched by absolute command path — no brittle markers), deletes ~/.tokenomy/, and leaves your original settings.json backed up at ~/.claude/settings.json.tokenomy-bak-<timestamp>.
🧭 Roadmap
- Phase 1. `PostToolUse` MCP trim + `PreToolUse` Read clamp + CLI + 12-check doctor + savings log (Claude Code).
- Phase 2. `tokenomy analyze` — walks Claude Code + Codex CLI transcripts, replays rules with a real tokenizer, surfaces waste patterns in a fancy CLI dashboard.
- Phase 3. Local code-graph MCP server: `tokenomy-graph` stdio server + `graph build|status|serve|query|purge` CLI + doctor check. Works with both Claude Code and Codex CLI. TypeScript AST, 5 tools, hard budget caps, fail-open everywhere.
- Phase 3.5. Multi-stage PostToolUse pipeline: duplicate-response dedup, secret redaction, stacktrace collapse, schema-aware trim profiles (Atlassian/Linear/Slack/Gmail/GitHub), per-tool config overrides, `find_usages` graph tool, MCP query LRU cache, `tokenomy report` (TUI + HTML), hook perf telemetry, `doctor --fix`.
- Phase 4. `PreToolUse` Bash input-bounder — rewrites verbose unbounded shell commands (`git log`, `find`, `ls -R`, `ps aux`, `docker logs`, `journalctl`, …) to cap their output via `set -o pipefail; <cmd> | awk 'NR<=N'`. Exit status preserved, no command injection (`head_limit` validated), no rewrites for compound / subshell / heredoc / redirected / user-piped / streaming commands. Claude Code only until Codex supports input mutation.
- Phase 4.5. OSS-alternatives-first nudge — `find_oss_alternatives` MCP tool with repo/branch/package search plus conservative `PreToolUse` Write context for new utility-like files.
- Phase 5. Polish — Golem output mode, statusline with live savings counter, `UserPromptSubmit` prompt-classifier, `tokenomy compress`, deterministic `tokenomy bench`, and cross-agent graph MCP installers.
- Phase 5.5. Codex hook foothold — user-scoped Codex `SessionStart` + `UserPromptSubmit` hook installation for Golem and prompt-classifier nudges, with MCP/PostToolUse mutation explicitly deferred until Codex supports it.
- Phase 6. Language breadth — Python parser plugin for the graph, richer benchmark fixtures, and npm publish at 1.0.
- Phase 7. Agent operating layer — rule-pack generator, compaction-time memory hygiene, workflow MCP tools, session ledgers, and team-ready reports. See Tokenomy Next Features.
🤝 Contribute
Contributions welcome. Dependency-light (zero runtime deps in the hot hook path; @modelcontextprotocol/sdk loaded dynamically for the graph server; js-tiktoken optional peer for accurate analyze), test-first (409/409 currently green).
Good first issues:
| Level | What | Where |
|---|---|---|
| 🟢 | Fixture + rule-level test for an MCP tool you use (Asana, HubSpot, …) | tests/fixtures/ + tests/unit/mcp-content.test.ts |
| 🟢 | More analyze parser support (other Codex shapes, OpenCode, Aider) | src/analyze/parse.ts |
| 🟡 | Built-in trim profile for another MCP server | src/rules/profiles.ts |
| 🟡 | More benchmark scenarios / captured fixtures | src/bench/ + docs/bench/ |
| 🔴 | Python parser plugin for the graph MCP server | new src/parsers/py/ |
Architecture tour:
src/
core/ — types, config (+ per-tool overrides), paths, gate, log, dedup, recovery hint
rules/ — pure transforms: mcp-content, read-bound, bash-bound, text-trim, profiles,
shape-trim, stacktrace, redact
hook/ — entry + dispatch + pre-dispatch (stdin → rule → stdout)
analyze/ — scanner (Claude Code + Codex), parse, tokens, simulate, report, render
graph/ — schema, build, stale detection, repo-id, query/{minimal,impact,review,usages,…}
mcp/ — stdio server, tool handlers, query-cache (LRU), budget-clip
parsers/ — TS/JS AST extraction
cli/ — init, doctor (+ --fix), uninstall, config-cmd, report, analyze, graph,
compress, bench, statusline, entry
util/ — settings-patch, manifest, atomic-write, backup, json helpers
tests/
unit/ — one file per module, ≥1 trim + ≥1 passthrough per rule
integration/ — spawn compiled dist/hook/entry.js or dist/cli/entry.js
fixtures/ — synthetic graph repos + synthesized transcripts for analyze
Rules are pure: (toolName, toolInput, toolResponse, config) → { kind: "passthrough" | "trim", ... }. New rule = one-file drop-in.
Development:
git clone https://github.com/RahulDhiman93/Tokenomy && cd Tokenomy
npm install && npm run build
npm test # 409 tests
npm run coverage # c8 → coverage/lcov.info + HTML
npm run typecheck # tsc --noEmit
npm link # point `tokenomy` at your local build
tokenomy doctor # all checks passing
# revert to published version: npm unlink -g tokenomy && npm install -g tokenomy
# or just: tokenomy update --force (from a fresh npm-installed shell)
Guiding principles. (1) Fail-open always — broken hook worse than no hook; never exit 2. (2) Schema invariants over trust — outputs never fabricate keys, flip types, shrink arrays. (3) Path-match over markers — uninstall identifies entries by absolute command path. (4) Measure before bragging — no "X % savings" claims without Phase 2 benchmark data. (5) Small + legible — a dozen lines aren't worth a 200 KB install.
Before opening a PR. npm test green (including new unit + integration). Touched a rule → add passthrough and trim tests. Touched init/uninstall → extend the round-trip integration test. Added a config field → document in the Configure block above. Questions? Open a discussion — design feedback is as welcome as code.
📜 License
MIT — see LICENSE.
Save tokens. Save money. Save the rainforest — or at least your API bill.