token-ninja

mcp
Security Audit
Fail
Health Warn
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 7 GitHub stars
Code Fail
  • exec() — Shell command execution in hooks/claude-code-user-prompt.cjs
  • spawnSync — Synchronous process spawning in hooks/claude-code-user-prompt.cjs
  • process.env — Environment variable access in hooks/claude-code-user-prompt.cjs
  • fs module — File system access in hooks/claude-code-user-prompt.cjs
  • process.env — Environment variable access in scripts/post-install.mjs
Permissions Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool acts as a companion for AI coding assistants (primarily Claude Code). It intercepts deterministic shell commands—like `git status` or `npm test`—before they are sent to the AI model, executes them locally, and returns the output directly to the user to save API tokens and reduce latency.

Security Assessment
Overall Risk: High. The core functionality relies on executing shell commands directly on your local machine: its hooks use synchronous process spawning and shell command execution, and both the package and its post-install script read environment variables and access the file system. No hardcoded secrets, dangerous permissions, or external network requests were found, but the primary hooking mechanism inherently requires running arbitrary local commands.

Quality Assessment
The project is licensed under the permissive MIT license and is actively maintained, with its most recent push occurring today. It features a clear and helpful README, automated CI, and published npm packages. However, community trust and visibility are currently very low, as the project has only 7 GitHub stars, meaning it has not yet undergone widespread peer review or community testing.

Verdict
Use with caution: the tool is actively maintained and works as advertised, but its core requirement of executing local shell commands demands a thorough personal code review before integrating it into your workflow.
SUMMARY

token-ninja routes deterministic shell commands locally — zero LLM calls, ~19µs latency. Works silently inside AI tools via MCP.

README.md

token-ninja

The Claude Code companion that pays zero tokens for commands your shell already knows how to run.

Built for Claude Code first. token-ninja hooks into Claude Code's
UserPromptSubmit event, intercepts commands like git status, npm test,
or docker ps before they become an API call, runs them locally, and
hands the result back to you — in the original colors, with a single dimmed
footer line acknowledging the save. The model is never invoked. Zero input
tokens, zero output tokens. Everything conversational flows through to
Claude untouched.

Other AI tools (Codex, Cursor, Aider, Gemini, Continue) are supported via
MCP, but the deepest, most invisible experience — the one that feels like
you're just using Claude normally — is Claude Code.


Install · Quickstart · How it works · Rules · MCP · Safety · Benchmarks


# Inside a Claude Code session — just chat as usual. The output appears
# exactly as if you had typed the command in your terminal, keeping its
# native colors. A single dimmed footer line reports the save. That's the
# whole visual change. Conversational prompts flow through to Claude
# unchanged.

you › git status
On branch main
nothing to commit, working tree clean
⚡ ninja · saved ~424 tokens · git-status

you › git branch --show-current
main
⚡ ninja · saved ~416 tokens · git-branch-list

you › git log --oneline -10
f77f852 chore(main): release 0.3.0
6932f36 feat(setup): auto-register ninja mcp with Gemini CLI
1b323b2 fix(router): preserve ANSI colors when the hook short-circuits the model
89394ec feat(hook): replace PreToolUse Bash hook with UserPromptSubmit
…
⚡ ninja · saved ~611 tokens · git-log-passthrough

you › explain why this stack trace is blowing up in production
# No match (conversational) — prompt flows to Claude unchanged.

No prefix. No new commands to learn. Keep chatting with Claude Code the
way you already do — token-ninja quietly handles the boring stuff and
gets out of the way for everything else.




Why token-ninja

Every trip to an LLM costs tokens, dollars, and seconds of latency. Yet a huge
share of what we ask AI coding assistants is utterly deterministic — listing
files, checking git status, running tests, showing recent commits. Those calls
don't need a model. They need a shell.

| Request | Plain AI assistant | token-ninja |
| --- | --- | --- |
| git status | ~400 tokens | 0 tokens, ~10 ms |
| docker ps | ~450 tokens | 0 tokens, ~10 ms |
| show recent commits | ~500 tokens | 0 tokens, ~15 ms |
| build the project (auto-detects npm / cargo / go / …) | ~700 tokens | 0 tokens, ~20 ms |
| rm -rf / | runs if model agrees | blocked before exec |
| explain this stack trace: … | ~2–5 k tokens | passes straight through |

The classifier runs in ~19 µs. The safety validator runs in ~10 µs.
Neither will ever be the slow part of your day.

Install

One line. Zero config. Starts working on your next AI session.

npm install -g token-ninja

A postinstall hook registers ninja mcp as an MCP server in every AI client
it can find on your machine — Claude Code (~/.claude.json), Cursor
(~/.cursor/mcp.json), and Claude Desktop — so the next time you open your
AI tool, it already knows to consult token-ninja before spending tokens on
commands like git status, npm test, or docker ps.

For Claude Code specifically, the same postinstall also writes a
UserPromptSubmit hook into ~/.claude/settings.json. That event fires
before your prompt turns into an API call — if token-ninja recognizes it
with high confidence (exact or prefix match), the hook executes locally
and short-circuits the model entirely. The prompt is never sent, the
response is never generated: zero input tokens, zero output tokens.
Anything conversational flows through to Claude untouched.

Existing MCP entries are preserved, each file is backed up once
(*.token-ninja.bak) before the first write, and malformed configs are
skipped safely instead of failing the install.

Requirements: Node ≥ 20.

Opt out of the postinstall hook entirely:
TOKEN_NINJA_SKIP_POSTINSTALL=1 npm install -g token-ninja

Roll back any time: ninja uninstall — removes the MCP entry from
every client config it wrote to, and the UserPromptSubmit hook from
~/.claude/settings.json.

Quickstart

There's no new command to learn. After npm install -g token-ninja,
open Claude Code and start chatting. Commands render with their native
ANSI colors; the footer is dimmed so it recedes.

you › git status
On branch main
nothing to commit, working tree clean
⚡ ninja · saved ~424 tokens · git-status

you › npm test
 Test Files  21 passed (21)
      Tests  312 passed (312)
   Duration  4.16s
⚡ ninja · saved ~2,362 tokens · npm-run-known

you › docker ps
CONTAINER ID   IMAGE         STATUS         NAMES
a7f3c9e21b4d   postgres:16   Up 2 hours     db
51e2d7f0a8c6   redis:7       Up 2 hours     cache
⚡ ninja · saved ~452 tokens · docker-ps

you › git diff
(no changes)
⚡ ninja · saved ~402 tokens · git-diff

you › why is my React state not updating when I click the button?
# No match (conversational) — token-ninja passes through. The model answers normally.

Check how many tokens you've saved at any time:

ninja stats

Handy extras:

  • ninja rules test "your command" — dry-run the classifier against any
    input (no execution) and see which rule would fire.
  • ninja setup — re-run auto-setup; ninja setup --dry-run previews without
    writing; ninja setup --no-mcp skips MCP registration.
  • ninja uninstall — remove everything token-ninja added.

How it works

your input
    │
    ▼
┌────────────────────┐    blocked?    ┌──────────────────────┐
│  safety validator  │ ─────────────► │  fall back to AI     │
└────────────────────┘                │ (let a human review) │
    │ allowed                         └──────────────────────┘
    ▼
┌────────────────────┐    no match    ┌──────────────────────┐
│   classifier       │ ─────────────► │  fall back to AI     │
│ exact → prefix →   │                │  (pass unchanged)    │
│ regex → NL         │                └──────────────────────┘
└────────────────────┘
    │ match
    ▼
┌────────────────────┐    blocked?    ┌──────────────────────┐
│  safety (again, on │ ─────────────► │  fall back to AI     │
│ resolved command)  │                └──────────────────────┘
└────────────────────┘
    │ allowed
    ▼
┌────────────────────┐
│  exec in your      │  ──► stdout / stderr
│  shell, record hit │  ──► ninja saved ~N tokens
└────────────────────┘

Match order is strict: exact → prefix → regex → natural-language keywords.
The first confident match wins. Safety is checked twice — on the raw input
and on the resolved command — so template expansion can never smuggle a
dangerous command past the classifier.
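The match order can be sketched as a single dispatch function. The rule shapes below are illustrative (real rules live in YAML), but the ordering mirrors the pipeline above:

```javascript
// Strict match order: exact → prefix → regex → NL keywords.
// First confident match wins; null means "fall back to AI".
function classify(input, rules) {
  const text = input.trim();
  const byType = (t) => rules.filter((r) => r.type === t);

  for (const r of byType("exact"))
    if (r.patterns.includes(text)) return r;

  // Longest prefix wins, so "git log --oneline" beats a bare "git log" rule.
  const prefixHits = byType("prefix")
    .flatMap((r) => r.patterns.filter((p) => text.startsWith(p)).map((p) => ({ r, p })))
    .sort((a, b) => b.p.length - a.p.length);
  if (prefixHits.length) return prefixHits[0].r;

  for (const r of byType("regex"))
    if (r.pattern.test(text)) return r;

  for (const r of byType("nl"))
    if (r.keywords.some((group) => group.every((k) => text.toLowerCase().includes(k))))
      return r;

  return null; // no match → pass unchanged to the model
}
```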

Features

  • Claude-Code-native UX: the UserPromptSubmit hook fires before your
    prompt becomes an API call. On a confident match the hook short-circuits
    the model and hands back the captured output in its original ANSI
    colors, followed by a single dimmed footer line
    (⚡ ninja · saved ~N tokens · rule-id). No banner, no prefix, no new
    commands — the interaction looks exactly like you ran the command in
    your terminal.
  • Huge built-in rule set: 736 rules across 46 tool domains covering
    thousands of real commands — git (+ plumbing), GitHub CLI, npm, pnpm,
    yarn, bun, cargo, go, rust, java, kotlin, python, ruby, php, docker,
    kubernetes, database, network, filesystem, archive, process management,
    test runners, linters, text processing, build tools (make, just, task,
    mage, bazel, buck2, cmake, ninja, meson), modern CLI (ripgrep, fd, bat,
    eza, jq, yq, fzf, delta, hyperfine, tldr, btop, glow), editors (VS
    Code, Cursor, Nvim, JetBrains, Helix, Zed), system info, cloud CLIs
    (AWS, Azure, gcloud, Vercel, Netlify, Heroku, Fly, Railway, doctl),
    IaC (Terraform, Ansible, Vagrant, Pulumi, Packer, CDK), bundlers
    (Vite, Turbo, esbuild, Parcel, Rollup, Webpack, Rspack, tsup, Nx),
    container tools (docker, podman, nerdctl, buildah, skopeo, crane,
    dive, trivy, cosign, grype, syft), Kubernetes ecosystem (kubectl,
    kubectx/kubens, k9s, kind, minikube, k3d, colima, stern, velero,
    kubeseal), gitops (argocd, flux, skaffold, tilt, fastlane,
    pre-commit), helm + kustomize + helmfile, secrets / infra
    (vault, nomad, packer, sops, age, 1password, bitwarden), Deno, Elixir,
    Dart/Flutter, process supervisors, env managers, distributed systems,
    and natural-language mappings. Run ninja rules list to see everything
    loaded.
  • Fast: ~19 µs per classification, ~10 µs per safety check (warm JIT).
  • Safe by construction: layered deny-list blocks rm -rf /, sudo,
    git push --force, DROP TABLE, curl | sh, dd if=, mkfs, …
    including homoglyph, NFKC, chained, and base64-decoded evasion. Safety
    is validated twice: once on the raw prompt and once on the expanded
    command.
  • Battle-tested: 312 tests across 21 files covering classifier edge
    cases, safety bypasses, conversational look-alikes, hook safeguards, and
    ≥99% rule coverage over 1090 real-world fixture commands. v8 coverage
    holds at 90%+ lines / 95%+ functions on all router/safety/rules code.
  • Zero-setup: npm install -g is literally the whole install. A
    postinstall hook registers the UserPromptSubmit hook in Claude Code
    and the MCP server in every other AI client it detects. ninja uninstall reverses it.
  • MCP-native: exposes maybe_execute_locally over stdio so agents
    (Codex, Cursor, Aider, Gemini, Continue, any MCP client) can consult the
    router before generating tokens.
  • Pluggable: drop a .yaml into ~/.config/token-ninja/rules/ to add
    your own patterns. User rules override builtins by id.
  • Telemetry built in: ninja stats shows hit rate, top rules, and an
    estimate of the tokens you've saved to date.
  • Dry-run friendly: ninja rules test "…" shows which rule would fire
    for any input, without executing anything.
  • Escape hatches: prefix a prompt with ?, /raw, or /claude to
    bypass the hook for that message. Or set
    intercept_user_prompts: false in ~/.config/token-ninja/config.yaml
    to disable interception globally while keeping the MCP integration
    active.

Supported AI tools

First-class (zero-token interception):

| Tool | Integration |
| --- | --- |
| Claude Code | UserPromptSubmit hook in ~/.claude/settings.json + MCP in ~/.claude.json |

The hook is the headline feature. It sees every user prompt before it
becomes an API call and can short-circuit the model entirely — which is
why Claude Code is where you'll see the biggest token savings.

MCP-supported (agent opts in by calling the tool):

| Tool | Integration |
| --- | --- |
| Cursor | ~/.cursor/mcp.json |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), %APPDATA%\Claude\… (Windows), $XDG_CONFIG_HOME/Claude/… (Linux) |
| Codex / Aider / Gemini / Continue | Register ninja mcp as a stdio MCP server; ninja setup auto-detects what's installed. |

Any other MCP-capable client works too — point it at ninja mcp and
you're in. Installed a new AI tool later? Re-run ninja setup; it's
idempotent.

Write your own rules

Rules are plain YAML. Drop a file into ~/.config/token-ninja/rules/*.yaml:

domain: myteam
rules:
  - id: deploy-staging
    match:
      type: exact
      patterns:
        - "deploy staging"
        - "ship to staging"
    action:
      type: shell
      command: "./scripts/deploy.sh staging"
    safety: write-network

  - id: show-routes
    match:
      type: nl
      keywords:
        - ["show", "routes"]
        - ["list", "routes"]
    action:
      type: shell
      command: "rails routes | head -50"
    safety: read-only

  - id: run-script
    match:
      type: prefix
      patterns: ["run script"]
    action:
      type: shell
      command: "./scripts"
      args_passthrough: true
    safety: write-confined

See src/rules/builtin/*.yaml for 736 production-grade
examples across 46 domains.

| Match type | When to use |
| --- | --- |
| exact | The input must equal one of the patterns (hash-indexed, O(1)). |
| prefix | The input starts with one of the patterns; longest match wins. |
| regex | Arbitrary capture groups. Used for templated commands. |
| nl | Natural-language keyword groups, e.g. ["show", "recent", "commits"]. |

| Safety tier | Means |
| --- | --- |
| read-only | Cannot modify the user's filesystem. |
| write-confined | Writes only inside CWD / config dir / build outputs. |
| write-network | May reach the network (e.g. npm install). |
| blocked | Never execute locally; always fall back to AI. |

Template variables available in command:
{{input}}, {{args}}, {{arg1}} … {{arg9}}, {{message}}, {{branch}},
{{target}}, {{path}}, {{script}}, {{pkg}}.
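Substitution itself is simple string templating. A minimal sketch, assuming unknown variables are left in place (the real resolution logic is in the package's source):

```javascript
// Expand {{name}} placeholders in a rule's command string.
// Hypothetical helper for illustration — not the package's implementation.
function expandTemplate(command, vars) {
  return command.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in vars ? vars[name] : match // unknown variables stay untouched
  );
}
```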

Natural-language commands

Many built-in rules match plain English, not just shell syntax:

| You type | Ninja runs |
| --- | --- |
| show recent commits | git log --oneline -20 |
| what branch am I on | git branch --show-current |
| list docker containers | docker ps |
| what's using port 3000 | lsof -i :3000 |
| build the project | auto-detects npm / pnpm / cargo / go / … |
| run the tests | auto-detects the test runner |

Use ninja rules test "your command" to dry-run the classifier against any input.

MCP integration

token-ninja talks to most AI tools through the Model Context Protocol: it
exposes a single stdio tool (maybe_execute_locally) that the agent calls
on every command it's about to run. If token-ninja recognizes the command,
it answers with the output directly; if not, the agent proceeds as usual.

Claude Code: UserPromptSubmit hook (real savings)

The MCP server alone doesn't save tokens in Claude Code because the model
rarely consults MCP tools before the Bash built-in, and once a tool result
is in context it counts the same tokens whether it came from Bash or an
MCP call. To save real tokens we have to intercept before the prompt
becomes an API call. That's what the UserPromptSubmit hook does:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          { "type": "command", "command": "node /abs/path/to/hooks/claude-code-user-prompt.cjs" }
        ]
      }
    ]
  }
}

Every prompt you type passes through ninja route --strict first. If a
high-confidence rule matches, the command runs locally and the captured
output is rendered back to you — the model is never invoked, no input
tokens consumed, no output tokens generated. If nothing matches, the
prompt flows to Claude unchanged. A real turn-level save, not a cosmetic
one.

Safeguards against mis-interception. Conversational prompts should
never be hijacked, so the hook layers five filters:

  1. Strict routing — only exact and prefix matches (no NL, no regex).
  2. Length cap — prompts longer than 80 chars skip (literal commands are short).
  3. Keyword blocklist — explain, why, how, review, suggest,
    teach, help me, should i, tell me about, walk me through.
  4. Escape prefixes — start a prompt with ?, /raw, or /claude to
    force it through to the model this once.
  5. Global opt-out — set intercept_user_prompts: false in
    ~/.config/token-ninja/config.yaml.
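Filters 2–5 above compose into a single guard; filter 1 (strict exact/prefix routing) lives in the router itself. The thresholds and keywords below come from the README's list, but the guard function and its word-boundary matching are an illustrative assumption:

```javascript
// Guard against mis-interception of conversational prompts.
// Sketch only — the real hook's implementation may differ.
const BLOCKLIST = ["explain", "why", "how", "review", "suggest",
  "teach", "help me", "should i", "tell me about", "walk me through"];
const ESCAPES = ["?", "/raw", "/claude"];

function shouldIntercept(prompt, config = { intercept_user_prompts: true }) {
  if (!config.intercept_user_prompts) return false;            // 5. global opt-out
  if (ESCAPES.some((p) => prompt.startsWith(p))) return false; // 4. escape prefix
  if (prompt.length > 80) return false;                        // 2. length cap
  const lower = prompt.toLowerCase();
  // Word boundaries so "show" isn't blocked for containing "how".
  if (BLOCKLIST.some((w) => new RegExp(`\\b${w}\\b`).test(lower))) return false; // 3.
  return true; // 1. strict exact/prefix routing happens in the router
}
```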

Install control:

  • ninja setup --no-hook — skip just the prompt hook (MCP still registered).
  • ninja setup --no-mcp — skip just MCP (hook still installed).
  • ninja uninstall — removes both, plus any shell-rc shims, plus any
    legacy PreToolUse Bash entry left by older versions.

You don't need to configure this manually. ninja setup (the postinstall
hook) merges an entry like the one below into:

  • ~/.claude.json (Claude Code)
  • ~/.cursor/mcp.json (Cursor)
  • ~/Library/Application Support/Claude/claude_desktop_config.json (Claude
    Desktop, macOS — Windows and Linux paths are handled too)
{
  "mcpServers": {
    "token-ninja": {
      "command": "ninja",
      "args": ["mcp"]
    }
  }
}

Existing entries are preserved; each file is backed up once
(*.token-ninja.bak) before the first modification. To opt out:
ninja setup --no-mcp. To remove just the MCP entries: ninja uninstall.

If you still want to do it yourself — e.g. a project-local .mcp.json or an
MCP client we don't know about — the manual command is:

ninja mcp    # stdio server exposing maybe_execute_locally

Each call the model makes looks like:

// handled locally
{ "handled": true, "stdout": "…", "stderr": "…", "exit_code": 0,
  "rule_id": "git-status", "tokens_saved_estimate": 512 }

// AI should handle it
{ "handled": false, "reason": "no_match" }
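On the agent side, consuming that response is a simple branch on `handled`. The field names come from the payloads above; the dispatch helper itself is illustrative:

```javascript
// Consume a maybe_execute_locally response: use the local output when
// handled, otherwise hand the turn back to the model. Sketch only.
function handleRouterResponse(response, fallbackToModel) {
  if (response.handled) {
    return {
      output: response.stdout + (response.stderr || ""),
      tokensSaved: response.tokens_saved_estimate,
      via: response.rule_id,
    };
  }
  return fallbackToModel(response.reason); // e.g. "no_match"
}
```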

Configuration

You don't need to touch this to get started — the defaults work out of the
box. If you want to tune things:

~/.config/token-ninja/config.yaml

custom_rules_dir: ~/.config/token-ninja/rules  # where your own rules live
stats:
  enabled: true
  show_savings_on_exit: true                   # "⚡ handled by token-ninja" line
  verbose: false

Environment variables:

| Variable | Effect |
| --- | --- |
| TOKEN_NINJA_SKIP_POSTINSTALL=1 | Skip the automatic setup on npm i -g. |
| CLAUDE_CONFIG_PATH=<path> | Override the Claude Code config path used by ninja setup. |
| XDG_CONFIG_HOME | Honored for the token-ninja config dir and Claude Desktop on Linux. |

Safety model

Every input is split into pipeline segments and each segment is tested against
a hard deny-list (see src/safety/denylist.ts).

What we block:

  • rm -rf on any system path
  • privilege escalation (sudo, doas)
  • remote-code-execution pipes (curl | sh, wget | bash, curl | python)
  • disk destroyers (dd if=, mkfs, > /dev/sd*)
  • git footguns (push --force — but not --force-with-lease; reset --hard)
  • SQL footguns (DROP TABLE, DELETE / UPDATE without WHERE)
  • container / cluster footguns (docker system prune -af, kubectl delete
    without --dry-run)
  • evasion tricks: homoglyph lookalikes (ѕudo with Cyrillic ѕ), NFKC
    normalization attacks, chained && / ; / |, quoted / back-ticked
    substitution, base64 decode piped to a shell

Deny-listed inputs never execute locally. They fall back to the AI, where a
human can review the explanation before anything runs.
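The segment-split plus deny-list pass can be sketched in a few lines. The patterns below are a small illustrative subset of the list above, not the real denylist; note that NFKC normalization handles compatibility lookalikes (fullwidth letters, ligatures), while cross-script homoglyphs like Cyrillic ѕ need a confusables map, omitted here:

```javascript
// Illustrative deny-list subset; see src/safety/denylist.ts for the real one.
const DENY = [
  /\brm\s+-rf\s+\//,                              // rm -rf on a system path
  /\b(sudo|doas)\b/,                              // privilege escalation
  /\b(curl|wget)\b[^|]*\|\s*(sh|bash|python)\b/,  // remote-code-execution pipes
  /\bmkfs\b/, /\bdd\s+if=/,                       // disk destroyers
];

function isBlocked(input) {
  // NFKC collapses compatibility characters toward their ASCII forms.
  const normalized = input.normalize("NFKC");
  // Test every chained segment so "ls && sudo rm" can't hide the payload.
  return normalized
    .split(/&&|;|\|\|/)
    .some((segment) => DENY.some((re) => re.test(segment)));
}
```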

Commands

You almost never need these — setup is automatic. Kept for diagnostics and
power users.

ninja setup [--dry-run] [--no-mcp] [--tool …]
                              auto-register token-ninja with every AI client
                              it can detect (the postinstall default)
ninja uninstall               undo setup; remove MCP entries from client configs
ninja mcp                     run the stdio MCP server (what the AI tool calls)
ninja stats [--json] [--reset]
                              see tokens saved, top rules, hit rate
ninja rules list [--domain …] [--json]
ninja rules test <input…>     dry-run the classifier against an input

Benchmarks

token-ninja is a shell-adjacent tool — correctness and safety are
non-negotiable. The test suite is the safety net.

| Metric | Value |
| --- | --- |
| Test files | 17 |
| Tests | 234 (all passing) |
| Line coverage | 92.2% (threshold: 85%) |
| Branch coverage | 84.3% (threshold: 80%) |
| Function coverage | 95.2% (threshold: 95%) |
| Statement coverage | 92.2% (threshold: 85%) |
| Real-command fixture hit-rate | 100% on 657 commands (floor: 85%) |
| classify() benchmark | ~19 µs/call (10 k in < 800 ms) |
| validate() benchmark | ~10 µs/call (10 k in < 100 ms) |

Coverage is enforced by vitest + @vitest/coverage-v8 against
src/router/**, src/safety/**, and src/rules/**.

CI gates (see .github/workflows/ci.yml):

  • lint — ESLint flat config, typed rules
  • typecheck — tsc --noEmit
  • build — emits dist/, copies YAML rules, runs npm pack --dry-run
  • test — Node 20 & 22 on ubuntu-latest, plus Node 20 on macos-latest
  • coverage — uploaded as a workflow artifact

Benchmark assertions scale automatically on CI (BENCH_FACTOR auto-detected);
run BENCH_FACTOR=1 npm test locally for strict regression numbers, or set
SKIP_BENCH=1 to treat benchmarks as informational.

Development

git clone https://github.com/oanhduong/token-ninja
cd token-ninja
npm install

npm run lint             # eslint flat config
npm run typecheck        # tsc --noEmit
npm run build            # tsc + copy YAML rules to dist/
npm test                 # vitest run, 218 tests
npm run test:watch       # watch mode
npm run test:coverage    # v8 coverage, thresholds enforced

Handy development commands:

# Dry-run the classifier (no execution)
npx tsx src/cli.ts rules test "your command"

# Full router dry-run (prints the resolved command)
npx tsx src/cli.ts --dry-run "your command"

# List built-in rules
npx tsx src/cli.ts rules list --domain git

FAQ

What does the postinstall hook actually change on my machine?
It merges a single token-ninja entry into each AI client's MCP config
file — ~/.claude.json, ~/.cursor/mcp.json, and Claude Desktop's
OS-specific config. Nothing else is touched. Every file it writes is backed
up once (*.token-ninja.bak) before the first modification, and every
unrelated key is preserved. Run ninja uninstall to remove it, or set
TOKEN_NINJA_SKIP_POSTINSTALL=1 at install time to skip the hook entirely.

Does it work if I don't use Claude Code / Cursor / Claude Desktop?
Yes — any MCP-capable client works. Point its server config at
ninja mcp. Non-MCP tools can still use token-ninja as a library or via
the maybe_execute_locally stdio protocol directly.

What happens if a rule misclassifies my command?
Nothing dangerous: safety is checked twice, and any resolved command that
doesn't match its declared safety tier is blocked — the agent is told
"handle this yourself" and takes over as if token-ninja weren't there.
If the match itself is wrong (e.g. prints the wrong thing), disable that
rule by shadowing its id in a file under ~/.config/token-ninja/rules/.

How is "tokens saved" calculated?
Each rule carries a tokens_saved_estimate, or we estimate from input
length + captured output + a 400-token system-prompt overhead. See
src/telemetry/stats.ts.
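A naive version of that fallback estimate, assuming a rough 4-characters-per-token ratio (the ratio and rounding here are assumptions; the real math is in src/telemetry/stats.ts):

```javascript
// Estimate tokens saved from input + captured output lengths,
// plus the fixed system-prompt overhead the FAQ describes.
const SYSTEM_PROMPT_OVERHEAD = 400;
const CHARS_PER_TOKEN = 4; // assumed ratio for illustration

function estimateTokensSaved(input, output) {
  return Math.ceil((input.length + output.length) / CHARS_PER_TOKEN)
    + SYSTEM_PROMPT_OVERHEAD;
}
```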

Does it work on Windows?
Yes. The router and MCP server run anywhere Node 20+ runs; the auto-setup
handles the Windows Claude Desktop config path (%APPDATA%\Claude\…) out
of the box.

Contributing

See CONTRIBUTING.md. New rules are always welcome — the
fastest way to help is to browse
tests/fixtures/real-commands.txt for
commands that currently miss and add a rule covering them.

  1. Pick the narrowest match type (exact > prefix > regex > nl).
  2. Pick the right safety tier (read-only < write-confined < write-network
    < blocked).
  3. Add at least one fixture line to tests/fixtures/real-commands.txt.
  4. npm test — the coverage suite enforces a ≥ 85 % hit rate on fixtures.

Security issues: see SECURITY.md. Community norms:
CODE_OF_CONDUCT.md.

License

MIT © token-ninja contributors.


If token-ninja saved you tokens today, consider dropping a
star on GitHub — it's how the next person finds us.
