token-reducer
Health: Warning
- License — MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code: Passed
- Code scan — Scanned 12 files during light audit; no dangerous patterns found
Permissions: Passed
- Permissions — No dangerous permissions requested
This tool is a local-first context compression pipeline designed to reduce token usage for Claude Code by up to 90%. It uses code parsing and hybrid search algorithms to analyze your codebase and extract only the relevant context before sending it to the AI.
Security Assessment
Overall Risk: Low
The automated code scan reviewed 12 files and found no dangerous patterns, hardcoded secrets, or requests for dangerous system permissions. The tool runs entirely locally, meaning it does not make external network requests or send your proprietary code to third-party servers. It relies on AST chunking and local SQLite databases to process data, keeping your codebase strictly on your machine.
Quality Assessment
The codebase is actively maintained, with the most recent push occurring today. It is properly licensed under the permissive and standard MIT license. However, the project currently suffers from extremely low community visibility and adoption, having accumulated only 5 GitHub stars. Because of this limited external scrutiny, potential bugs or edge-case vulnerabilities may not yet be identified by the wider developer community.
Verdict
Safe to use, though you should apply the standard caution expected of early-stage, low-visibility community projects.
⚡ Cut Claude token usage by 90%+ — free, open-source, local-first context compression for Claude Code. Hybrid RAG (BM25 + ONNX vectors), AST chunking, reranking. No API needed.
Token Reducer
Cut Claude API costs by 90%+ with intelligent context compression
The open-source alternative to expensive context management tools.
The Problem
Every time you use Claude with a large codebase, you're paying for thousands of tokens that aren't relevant to your query. Most context management tools either:
- Send everything (expensive)
- Truncate blindly (loses important context)
- Require heavy Language Servers (slow, resource-intensive)
The Solution
Token Reducer is a local-first, intelligent context compression pipeline that:
- Reduces tokens by 90-98% while preserving semantic relevance
- Runs entirely locally — no API calls, no data leaving your machine
- Works in milliseconds — faster than Language Server alternatives
- Understands code semantically — AST parsing, not just text matching
┌─────────────────┐ ┌───────────────┐ ┌──────────────────┐
│ Your Codebase │────▶│ Token Reducer │────▶│ Compressed │
│ (50,000 tokens)│ │ Pipeline │ │ Context (500t) │
└─────────────────┘ └───────────────┘ └──────────────────┘
│
┌─────────┴─────────┐
│ - AST Chunking │
│ - BM25 + Vector │
│ - TextRank │
│ - Import Graph │
│ - 2-Hop Symbols │
└───────────────────┘
Easy Install
Option 1 — Claude Code /plugin Command (Recommended)
Step 1: Register the marketplace (one-time setup):
/plugin marketplace add Madhan230205/token-reducer
This registers the marketplace as Madhan230205-token-reducer.
Step 2: Install:
/plugin install token-reducer@Madhan230205-token-reducer
For project-scoped install:
/plugin install token-reducer@Madhan230205-token-reducer --scope project
Already ran Step 1 before? Just run:
/plugin install token-reducer@Madhan230205-token-reducer
No need to add the marketplace again.
Option 2 — Git Clone (Manual)
# 1. Clone into your Claude plugins folder
git clone https://github.com/Madhan230205/token-reducer.git ~/.claude/plugins/token-reducer
# 2. Install dependencies (optional but recommended for best results)
pip install -r ~/.claude/plugins/token-reducer/requirements-optional.txt
Windows users: Replace ~/.claude/plugins/ with %USERPROFILE%\.claude\plugins\
Then open ~/.claude/settings.json and add:
{
"plugins": ["~/.claude/plugins/token-reducer"]
}
Restart Claude Code. Done.
What requirements-optional.txt installs:
| Package | Purpose |
|---|---|
| sentence-transformers | Neural embeddings for smarter retrieval |
| hnswlib / faiss-cpu | Fast approximate nearest-neighbor search |
| tree-sitter + language grammars | AST-based code chunking (Python, JS, TS, Go, Rust, Java, C/C++, Ruby) |
If you skip this step, Token Reducer still works using hash embeddings and regex chunking — no ML libraries required.
Option 3 — Zero-Dependency Quick Start
No pip, no ML libs — runs immediately after cloning:
git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
python scripts/context_pipeline.py run \
--inputs ./src \
--query "Find auth logic" \
--embedding-backend hash \
--db .cache/index.db
Features
Core Pipeline
- Hybrid Retrieval — BM25 + semantic vector search with intelligent fallback
- AST-Based Chunking — Tree-sitter parsing for Python, TypeScript, Go, Rust, Java, and more
- TextRank Compression — Graph-based sentence scoring for intelligent summarization
- Sub-100ms Queries — SQLite FTS5 + HNSW indexes for instant results
- Local-First — Everything runs on your machine, no external APIs
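To make the hybrid retrieval idea concrete, here is a minimal sketch of BM25 scoring with a fallback trigger. This is an illustration of the technique, not the plugin's actual implementation; in "fallback" mode, the pipeline would only consult vector search when lexical scores come up empty.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Minimal BM25: score each tokenized doc against the query terms."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for q in query_terms:
            if q not in tf:
                continue
            idf = math.log(1 + (n - df[q] + 0.5) / (df[q] + 0.5))
            s += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "def login user password check hash".split(),
    "def render template html page".split(),
]
scores = bm25_scores("user login".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
# If every BM25 score were 0, a fallback-mode pipeline would
# switch to semantic vector search instead.
```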
LSP-Killer Features
- Import Graph — Automatically maps file dependencies without Language Server
- 2-Hop Symbol Expansion — Auto "go-to-definition" for referenced functions
- Diff Protocol — SEARCH/REPLACE edit format with automatic application
- Semantic Clustering — Groups similar chunks to avoid redundancy
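The Diff Protocol feature refers to a SEARCH/REPLACE edit format. A minimal sketch of applying such a block is shown below; the marker syntax used here (`<<<<<<< SEARCH` / `=======` / `>>>>>>> REPLACE`) is an assumption for illustration and may not match what apply_diff.py actually parses.

```python
import re

def apply_search_replace(source: str, diff: str) -> str:
    """Apply each SEARCH/REPLACE block to the source text, in order."""
    pattern = re.compile(
        r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
        re.DOTALL,
    )
    for search, replace in pattern.findall(diff):
        if search not in source:
            raise ValueError(f"search block not found: {search!r}")
        source = source.replace(search, replace, 1)  # first occurrence only
    return source

code = "def add(a, b):\n    return a - b\n"
diff = (
    "<<<<<<< SEARCH\n    return a - b\n=======\n    return a + b\n>>>>>>> REPLACE"
)
patched = apply_search_replace(code, diff)
```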
Enterprise Ready
- Fully Configurable — 40+ tunable parameters in settings.json
- Embedding Flexibility — ML models or hash fallback (zero dependencies)
- Query Caching — Intelligent TTL-based caching for repeated queries
- Session Memory — Tracks context across conversation turns
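The TTL-based query cache mentioned above can be sketched in a few lines. This is a generic illustration of the technique; the plugin's real cache internals are not documented in this README.

```python
import time

class TTLCache:
    """Minimal TTL cache for query results."""
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # query -> (expiry_timestamp, result)

    def get(self, query):
        entry = self._store.get(query)
        if entry is None:
            return None
        expiry, result = entry
        if time.monotonic() > expiry:
            del self._store[query]  # expired: evict and report a miss
            return None
        return result

    def put(self, query, result):
        self._store[query] = (time.monotonic() + self.ttl, result)

cache = TTLCache(ttl_seconds=0.05)
cache.put("find auth logic", ["chunk-1", "chunk-2"])
hit = cache.get("find auth logic")    # fresh entry: cached chunks come back
time.sleep(0.06)
miss = cache.get("find auth logic")   # past the TTL: entry is evicted
```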
Documentation
How It Works
Query → FTS(BM25) → (Vector fallback if needed) → Merge → Top 5 → Compress
Full pipeline:
PREPROCESS → INDEX → RETRIEVE → RE-RANK → COMPRESS → CONTEXT PACKET
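The COMPRESS stage is built on TextRank: sentences form a graph weighted by similarity, and a PageRank-style iteration scores them. The sketch below uses plain word overlap as the similarity measure, which is a simplifying assumption; the plugin's exact weighting is an implementation detail not shown in this README.

```python
def textrank(sentences, iterations=30, damping=0.85):
    """Score sentences via a similarity graph plus power iteration."""
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i][j] = float(len(tokens[i] & tokens[j]))  # word overlap
    # Row-normalize so each row is a probability distribution.
    for i in range(n):
        total = sum(sim[i]) or 1.0
        sim[i] = [w / total for w in sim[i]]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores

sentences = [
    "The auth module hashes passwords before storage.",
    "Passwords are hashed with bcrypt in the auth module.",
    "The UI theme defaults to dark mode.",
]
scores = textrank(sentences)
# The two mutually reinforcing auth sentences outrank the off-topic one,
# so a word-budgeted summary would keep them and drop the third.
```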
Basic Usage
# Index your codebase
python scripts/context_pipeline.py index --inputs ./src --db .cache/index.db
# Query with compression
python scripts/context_pipeline.py query \
--query "How does authentication work?" \
--db .cache/index.db \
--json
# One-shot: index + query
python scripts/context_pipeline.py run \
--inputs ./src \
--query "Find the database connection logic" \
--db .cache/index.db
Configuration
All settings in settings.json:
{
"tokenReducer": {
"chunkSizeWords": 220,
"embeddingModel": "jinaai/jina-embeddings-v2-base-code",
"hybridMode": "fallback",
"astChunkingEnabled": true,
"textRankEnabled": true,
"lspFeatures": {
"importGraphEnabled": true,
"twoHopExpansionEnabled": true
}
}
}
Full Configuration Reference
| Setting | Default | Description |
|---|---|---|
| chunkSizeWords | 220 | Target words per chunk |
| embeddingBackend | "ml" | "ml" for neural, "hash" for zero-dep |
| embeddingModel | jina-v2-code | Code-optimized embeddings |
| hybridMode | "fallback" | "fallback" or "always" for vector |
| astChunkingEnabled | true | Use tree-sitter AST parsing |
| textRankEnabled | true | Graph-based sentence scoring |
| importGraphEnabled | true | Track file dependencies |
| twoHopExpansionEnabled | true | Auto-expand referenced symbols |
| compressionWordBudget | 350 | Max words in compressed output |
Zero-Dependency Mode
Run without any ML libraries:
python scripts/context_pipeline.py run \
--inputs ./src \
--query "Find auth logic" \
--embedding-backend hash \
--db .cache/index.db
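The hash embeddings behind `--embedding-backend hash` can be understood as the feature-hashing trick: each token is hashed into a fixed-size signed bucket vector, so similarity works without any ML model. The sketch below only illustrates the idea; the actual backend in context_pipeline.py may differ in hash function, dimension, and weighting.

```python
import hashlib
import math

def hash_embed(text: str, dim: int = 64):
    """Zero-dependency embedding via feature hashing."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        sign = 1.0 if (h >> 8) % 2 == 0 else -1.0  # signed buckets damp collision bias
        vec[h % dim] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so dot product = cosine

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

q = hash_embed("user login authentication")
d1 = hash_embed("handle user login and authentication checks")
d2 = hash_embed("render the html template")
# Shared tokens land in the same buckets, so d1 sits far closer to q than d2.
```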
Apply Code Edits
# Apply SEARCH/REPLACE edits from a Claude response
python scripts/apply_diff.py --input claude_response.txt --dir ./src
# Preview edits without writing any files
python scripts/apply_diff.py --input response.txt --dry-run
Architecture
Technology Stack
- Storage: SQLite with FTS5 + custom embeddings table
- Chunking: Tree-sitter AST parsing with regex fallback
- Embeddings: Jina Code v2 (or zero-dependency hash embeddings)
- ANN Search: HNSW via hnswlib (with FAISS fallback)
- Compression: TextRank + query-relevance scoring
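SQLite's FTS5 extension provides the lexical half of the stack out of the box, including a built-in bm25() ranking function. A minimal sketch, assuming a two-column schema (the real pipeline's schema lives in context_pipeline.py and may differ):

```python
import sqlite3

# In-memory FTS5 index over (path, body) text chunks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
conn.executemany(
    "INSERT INTO chunks (path, body) VALUES (?, ?)",
    [
        ("src/auth.py", "def login(user, password): verify the password hash"),
        ("src/views.py", "def render(template): build the html page"),
    ],
)
# FTS5's bm25() auxiliary function ranks matches (lower score = better).
rows = conn.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT 5",
    ("login password",),
).fetchall()
```

Because the default MATCH semantics require every query term, only src/auth.py is returned here; the hybrid layer merges such lexical hits with vector-search results.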
Repository Structure
token-reducer/
├── .claude-plugin/plugin.json
├── .mcp.json
├── .env.example
├── settings.json
├── requirements-optional.txt
├── scripts/
├── hooks/
├── commands/
├── agents/
├── skills/
└── evals/
Contributing
Contributions are welcome. See contribute.md for contribution guidelines.
git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
pip install -e ".[dev]"
python scripts/context_pipeline.py self-test
License
MIT License — see LICENSE for details.
Acknowledgments
- Tree-sitter for AST parsing
- Sentence Transformers for embeddings
- SQLite FTS5 for blazing-fast text search
- hnswlib for approximate nearest neighbors
Star this repo if Token Reducer saves you money!