claude-cost-optimizer
Save 40-70% on Claude Code costs — proven strategies, real benchmarks, copy-paste configs
Claude Cost Optimizer
Save 30-60% on Claude Code costs with proven strategies, real benchmarks, and copy-paste configs.
Claude Code is powerful - but costs add up fast. A single afternoon of heavy coding can burn through $20-50 in tokens. Most of this spend is avoidable with the right setup.
This repo is a collection of battle-tested strategies for reducing Claude Code costs without sacrificing quality. Every technique includes expected savings percentages based on real-world benchmarks.
Quick Wins (Start Here)
Apply these 5 changes right now and cut costs immediately:
| # | Strategy | Expected Savings | Effort | Guide |
|---|---|---|---|---|
| 1 | Keep CLAUDE.md under 150 lines - every line loads on every turn | 10-20% | 5 min | Context Optimization |
| 2 | Use Haiku for simple tasks (--model haiku) - 5x cheaper than Opus |
20-40% | 1 min | Model Selection |
| 3 | Use Plan Mode before coding - prevents wasted iterative cycles | 15-25% | 0 min | Workflow Patterns |
| 4 | Add .claudeignore - stop Claude from reading node_modules, dist, lock files |
5-15% | 2 min | Context Optimization |
| 5 | Delegate to subagents - isolate expensive searches from main context | 20-40% | 5 min | Workflow Patterns |
Combined impact: 30-60% reduction in monthly Claude Code spend.
Table of Contents
How Claude Billing Works
Understanding the billing model is the foundation of optimization.
Token Pricing (as of March 2026)
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Cache Hit (per 1M) | Context Window | Max Output |
|---|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | 1M | 128K |
| Opus 4.6 (1M, >200K input) | $10.00 (2x) | $37.50 (1.5x) | $1.00 | 1M | 128K |
| Opus 4.6 Fast Mode | $30.00 (6x) | $150.00 (6x) | -- | 1M (included) | 128K |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | 64K |
| Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | 64K |
New in 2.1+:
--max-budget-usd <amount>caps spending per session.--fallback-model <model>auto-switches to a cheaper model when the primary is overloaded.
Plans: Pro $20/mo, Max 5x $100/mo, Max 20x $200/mo. Batch API: 50% discount. Cache write: 1.25x (5-min TTL), 2x (1-hour TTL).
Off-Peak 2x Usage Events
Anthropic periodically runs temporary promotions that double usage limits during off-peak hours. These are not permanent -- check the Anthropic blog and support page for current promotions.
When active, peak hours (normal limits) are typically 8 AM - 2 PM ET. Everything outside that window + all weekends = 2x usage.
Time zone breakdown (click to expand)| Time Zone | Peak (Normal Limits) | 2x Usage Window |
|---|---|---|
| US West (PT) | 5-11 AM | 11 AM - 5 AM + weekends |
| US East (ET) | 8 AM - 2 PM | 2 PM - 8 AM + weekends |
| UK (BST) | 1-7 PM | 7 PM - 1 PM + weekends |
| Central Europe (CET) | 2-8 PM | 8 PM - 2 PM + weekends |
| India (IST) | 6:30 PM - 12:30 AM | Entire workday is 2x |
| China/Singapore (SGT) | 9 PM - 3 AM | Entire workday is 2x |
| Japan/Korea (JST) | 10 PM - 4 AM | Entire workday is 2x |
| Australia (AEDT) | 12-6 AM | Entire workday is 2x |
Key insight: If you're outside the US, your entire workday typically falls in the 2x window during these promotions.
What Counts as Tokens
Every Claude Code turn = Input Tokens + Output Tokens
Input Tokens (you pay for):
├── System prompt (Claude Code's built-in instructions)
├── CLAUDE.md file (loaded every turn)
├── Conversation history (grows with each turn)
├── File contents (from Read/Grep/Glob results)
├── Tool call results
├── MCP server tool schemas (each server adds ~500-2000 tokens)
└── MCP server responses
Output Tokens (you pay more for):
├── Claude's text responses
├── Tool calls (Read, Edit, Bash, etc.)
├── Code generation
└── Plan mode analysis
The Key Insight
Input tokens are charged every turn. If your CLAUDE.md is 300 lines and you have 50 turns in a session, you're paying for those 300 lines x 50 times. Cutting it to 100 lines saves you 2/3 of that recurring cost.
Prompt Caching
Claude Code uses prompt caching to reduce costs. Cached input tokens cost significantly less (e.g., $0.50/MTok vs $5.00/MTok on Opus 4.6 - a 90% discount). Content that stays the same between turns (like CLAUDE.md and system prompt) gets cached automatically.
What this means: The first turn is expensive, subsequent turns benefit from caching. Avoid breaking the cache by keeping stable content (CLAUDE.md, file reads) consistent between turns.
Guides
Deep-dive guides for each optimization area:
| Guide | What You'll Learn |
|---|---|
| 01 - Understanding Costs | How billing works, what costs the most, where money goes |
| 02 - Context Optimization | Reduce input tokens: CLAUDE.md, .claudeignore, file reads |
| 03 - Model Selection | When to use Opus vs Sonnet vs Haiku (with decision tree) |
| 04 - Workflow Patterns | Plan mode, subagents, commands, batch operations |
| 05 - Team Budgeting | Per-developer budgets, cost tracking, ROI calculation |
| 06 - Access Methods & Pricing | Compare API vs Bedrock vs Vertex AI vs Claude Code pricing |
| 07 - MCP & Agent Cost Impact | MCP server overhead, subagent costs, Agent SDK patterns |
Benchmarks
Real-world cost measurements for common development tasks:
| Benchmark | What's Compared |
|---|---|
| Task Comparison | Same task with/without optimization - before vs after |
| Model Comparison | Opus vs Sonnet vs Haiku for different task types |
| Context Size Impact | How CLAUDE.md size and file reads affect total cost |
All benchmarks include methodology, raw numbers, and reproducible steps.
Templates
Copy-paste configs that are already optimized. Drop these into your project:
CLAUDE.md Templates
| Template | Lines | Best For | Link |
|---|---|---|---|
| Minimal | <50 | Maximum savings, solo projects | minimal.md |
| Standard | ~100 | Balanced cost/quality | standard.md |
| Comprehensive | ~150 | Full-featured, team projects | comprehensive.md |
| Monorepo | ~120 | Multi-package workspaces | monorepo.md |
Stack-Specific Templates
| Stack | Link |
|---|---|
| React + Vite | react-vite.md |
| Next.js | nextjs.md |
| FastAPI + Python | fastapi-python.md |
| MERN Stack | mern.md |
| Terraform + AWS | terraform-aws.md |
Settings Configs
| Config | Philosophy | Link |
|---|---|---|
| Cost-Conscious | Haiku default, strict permissions, turn limits | cost-conscious.json |
| Balanced | Sonnet default, sensible permissions | balanced.json |
| Performance-First | Opus default, minimal restrictions | performance-first.json |
Claude Commands
| Command | Purpose | Link |
|---|---|---|
/cost-check |
Check current session usage | cost-check.md |
/budget-mode |
Force cost-conscious behavior | budget-mode.md |
/quick-fix |
Minimal-token bug fix | quick-fix.md |
Tools
Token Estimator
Estimate how many tokens a file or prompt will cost before sending it to Claude:
python tools/token-estimator/estimate.py path/to/file.py
# Output: ~1,247 tokens | Estimated cost: $0.004 (Sonnet input)
python tools/token-estimator/estimate.py path/to/CLAUDE.md --per-turn 50
# Output: ~890 tokens per turn | 50 turns = 44,500 tokens | $0.13 (Sonnet)
Usage Analyzer
Analyze your Claude Code usage patterns to find cost hotspots:
python tools/usage-analyzer/analyze.py ~/.claude/projects/
# Output: Cost breakdown by project, session, and action type
Cheatsheet
One-page reference: cheatsheet.md
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Share a tip: Open an issue using the Tip Submission template
- Add a benchmark: Run a test and submit results using the Benchmark Result template
- Add a template: Submit a CLAUDE.md template for a new stack
- Improve guides: Fix errors, add examples, clarify explanations
- Build a tool: Add utilities that help track or reduce costs
FAQ
How much does Claude Code actually cost?
With the Pro plan ($20/mo), Max 5x ($100/mo), or Max 20x ($200/mo), you get included usage. After that, costs depend on your model and token usage. Heavy users report $3-15/day without optimization. With optimization, most drop to $1-5/day. Note that Opus 4.6 is now priced at $5/$25 - the same price Sonnet used to be - making top-tier model usage much more affordable.
Does this apply to Claude API too?
Many strategies (context optimization, model selection, prompt engineering) apply to both Claude Code and the API. The templates and commands are Claude Code-specific.
Will these optimizations reduce output quality?
No. The strategies focus on eliminating waste - duplicate context, unnecessary file reads, using expensive models for simple tasks. Quality stays the same or improves (because Claude has less noise to process).
How do I track my spending?
Use the /usage command in Claude Code to see current session stats. For historical tracking, use our Usage Analyzer tool.
What's the biggest single change I can make?
Switch to Haiku for routine tasks (formatting, simple fixes, file lookups). It's 5x cheaper than Opus and handles 70% of common coding tasks well. With Opus 4.6 now at $5/$25 (the same price Sonnet used to be), even the top model is much more accessible - but Haiku at $1/$5 still adds up to meaningful savings at scale.
Star History
If this repo helped you save money, consider giving it a star! It helps others find these resources.
License
MIT - use these strategies, templates, and tools however you want.
Yorumlar (0)
Yorum birakmak icin giris yap.
Yorum birakSonuc bulunamadi