token-reducer

skill
Security Audit
Warning
Health: Warning
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code: Passed
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Passed
  • Permissions — No dangerous permissions requested
Purpose
This tool is a local-first context compression pipeline designed to reduce token usage for Claude Code by up to 90%. It uses code parsing and hybrid search algorithms to analyze your codebase and extract only the relevant context before sending it to the AI.

Security Assessment
Overall Risk: Low
The automated code scan reviewed 12 files and found no dangerous patterns, hardcoded secrets, or requests for dangerous system permissions. The tool runs entirely locally, meaning it does not make external network requests or send your proprietary code to third-party servers. It relies on AST chunking and local SQLite databases to process data, keeping your codebase strictly on your machine.

Quality Assessment
The codebase is actively maintained, with the most recent push occurring today, and it is released under the permissive MIT license. However, the project currently has very low community visibility and adoption, having accumulated only 5 GitHub stars. Because of this limited external scrutiny, potential bugs or edge-case vulnerabilities may not yet have been identified by the wider developer community.

Verdict
Safe to use, though you should apply standard caution expected with early-stage, low-visibility community projects.
SUMMARY

⚡ Cut Claude token usage by 90%+ — free, open-source, local-first context compression for Claude Code. Hybrid RAG (BM25 + ONNX vectors), AST chunking, reranking. No API needed.

README.md

Token Reducer

Cut Claude API costs by 90%+ with intelligent context compression

Claude Code Plugin
License: MIT
Release
Python 3.11+
SQLite

The open-source alternative to expensive context management tools.

Easy Install · Features · Documentation · Contributing


The Problem

Every time you use Claude with a large codebase, you're paying for thousands of tokens that aren't relevant to your query. Most context management tools either:

  • Send everything (expensive)
  • Truncate blindly (loses important context)
  • Require heavy Language Servers (slow, resource-intensive)

The Solution

Token Reducer is a local-first, intelligent context compression pipeline that:

  • Reduces tokens by 90-98% while preserving semantic relevance
  • Runs entirely locally — no API calls, no data leaving your machine
  • Works in milliseconds — faster than Language Server alternatives
  • Understands code semantically — AST parsing, not just text matching
┌─────────────────┐     ┌───────────────┐     ┌──────────────────┐
│  Your Codebase  │────▶│ Token Reducer │────▶│  Compressed      │
│  (50,000 tokens)│     │   Pipeline    │     │  Context (500t)  │
└─────────────────┘     └───────────────┘     └──────────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  - AST Chunking   │
                    │  - BM25 + Vector  │
                    │  - TextRank       │
                    │  - Import Graph   │
                    │  - 2-Hop Symbols  │
                    └───────────────────┘

Easy Install

Option 1 — Claude Code /plugin Command (Recommended)

Step 1: Register the marketplace (one-time setup):

/plugin marketplace add Madhan230205/token-reducer

This registers the marketplace as Madhan230205-token-reducer.

Step 2: Install:

/plugin install token-reducer@Madhan230205-token-reducer

For project-scoped install:

/plugin install token-reducer@Madhan230205-token-reducer --scope project

Already ran Step 1 before? Just run /plugin install token-reducer@Madhan230205-token-reducer — no need to add the marketplace again.


Option 2 — Git Clone (Manual)

# 1. Clone into your Claude plugins folder
git clone https://github.com/Madhan230205/token-reducer.git ~/.claude/plugins/token-reducer

# 2. Install dependencies (optional but recommended for best results)
pip install -r ~/.claude/plugins/token-reducer/requirements-optional.txt

Windows users: Replace ~/.claude/plugins/ with %USERPROFILE%\.claude\plugins\

Then open ~/.claude/settings.json and add:

{
  "plugins": ["~/.claude/plugins/token-reducer"]
}

Restart Claude Code. Done.


What requirements-optional.txt installs:

  • sentence-transformers — Neural embeddings for smarter retrieval
  • hnswlib / faiss-cpu — Fast approximate nearest-neighbor search
  • tree-sitter + language grammars — AST-based code chunking (Python, JS, TS, Go, Rust, Java, C/C++, Ruby)

If you skip this step, Token Reducer still works using hash embeddings and regex chunking — no ML libraries required.
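
The fallback path can be sketched as feature hashing: each token is hashed into a fixed number of buckets, giving a crude but dependency-free embedding. This is an illustrative sketch of the general technique, not the plugin's actual implementation.

```python
import hashlib
import math
import re

def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Map text to a fixed-size vector via feature hashing (no ML libraries).

    Each token is hashed to a bucket with a pseudo-random sign; the resulting
    vector is L2-normalized. Hypothetical sketch, not the plugin's code.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        sign = 1.0 if (h >> 16) % 2 == 0 else -1.0
        vec[h % dim] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# Texts that share tokens produce correlated hash vectors.
sim = cosine(hash_embed("user auth login"), hash_embed("login auth handler"))
```

Hash embeddings cannot capture synonyms the way neural models do, which is why the ML path is recommended when available.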


Option 3 — Zero-Dependency Quick Start

No pip, no ML libs — runs immediately after cloning:

git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find auth logic" \
  --embedding-backend hash \
  --db .cache/index.db

Features

Core Pipeline

  • Hybrid Retrieval — BM25 + semantic vector search with intelligent fallback
  • AST-Based Chunking — Tree-sitter parsing for Python, TypeScript, Go, Rust, Java, and more
  • TextRank Compression — Graph-based sentence scoring for intelligent summarization
  • Sub-100ms Queries — SQLite FTS5 + HNSW indexes for instant results
  • Local-First — Everything runs on your machine, no external APIs
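
The keyword half of that hybrid can be sketched with SQLite alone: the FTS5 extension provides BM25 ranking natively. The table name, columns, and toy documents below are illustrative, not the plugin's actual schema.

```python
import sqlite3

# Build a tiny in-memory FTS5 index; bm25() gives BM25 ranking for free.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("auth.py", "def login(user, password): verify credentials and issue token"),
        ("db.py", "def connect(): open the sqlite database connection pool"),
        ("ui.py", "def render(): draw the settings page"),
    ],
)

# FTS5's bm25() auxiliary function returns a rank where lower is better.
rows = conn.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT 5",
    ("login credentials",),
).fetchall()
```

FTS5 joins the query terms with AND by default, so only the chunk containing both "login" and "credentials" matches here.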

LSP-Killer Features

  • Import Graph — Automatically maps file dependencies without Language Server
  • 2-Hop Symbol Expansion — Auto "go-to-definition" for referenced functions
  • Diff Protocol — SEARCH/REPLACE edit format with automatic application
  • Semantic Clustering — Groups similar chunks to avoid redundancy
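
To illustrate the import-graph idea, here is a minimal Python-only sketch using the standard-library ast module. The plugin's real extractor is tree-sitter-based and multi-language; this only shows the concept.

```python
import ast

def file_imports(source: str) -> set[str]:
    """Collect top-level module names imported by a Python source string."""
    mods: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return mods

# Edges point from each file to the modules it depends on (toy example).
files = {
    "app.py": "import db\nfrom auth import login",
    "auth.py": "import db",
    "db.py": "import sqlite3",
}
graph = {name: file_imports(src) for name, src in files.items()}
```

Once such a graph exists, retrieving a chunk from one file can pull in its direct dependencies, which is what enables 2-hop symbol expansion without a Language Server.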

Enterprise Ready

  • Fully Configurable — 40+ tunable parameters in settings.json
  • Embedding Flexibility — ML models or hash fallback (zero dependencies)
  • Query Caching — Intelligent TTL-based caching for repeated queries
  • Session Memory — Tracks context across conversation turns
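
TTL-based query caching can be sketched in a few lines: entries carry an expiry timestamp and are lazily evicted on read. A hypothetical minimal version, not the plugin's implementation:

```python
import time

class TTLCache:
    """Minimal TTL cache for repeated queries (illustrative sketch)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def put(self, key: str, value) -> None:
        # Store the value alongside its absolute expiry time.
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazily evict stale entries
            return None
        return value

cache = TTLCache(ttl_seconds=0.05)
cache.put("query:auth", ["chunk1", "chunk2"])
```

Using `time.monotonic()` rather than `time.time()` keeps expiry correct even if the wall clock is adjusted.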

Documentation

How It Works

Query → FTS(BM25) → (Vector fallback if needed) → Merge → Top 5 → Compress

Full pipeline:

PREPROCESS → INDEX → RETRIEVE → RE-RANK → COMPRESS → CONTEXT PACKET
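
The fallback step in that flow can be sketched as: run keyword search first, consult the vector index only when keyword hits are thin, then merge by score. The callables, threshold, and the assumption that both searches return normalized scores are all hypothetical, not the plugin's actual API.

```python
def hybrid_retrieve(query, fts_search, vector_search, top_k=5, min_hits=3):
    """BM25 first; fall back to vector search only on sparse keyword matches.

    fts_search and vector_search are assumed to return (doc, score) pairs
    with scores pre-normalized to a comparable range.
    """
    ranked = {doc: score for doc, score in fts_search(query)}
    if len(ranked) < min_hits:  # sparse keyword match, so fall back to vectors
        for doc, score in vector_search(query):
            ranked[doc] = max(score, ranked.get(doc, 0.0))  # merge, keep best
    return sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Toy searches: one keyword hit triggers the vector fallback.
results = hybrid_retrieve(
    "auth logic",
    fts_search=lambda q: [("auth.py", 0.9)],
    vector_search=lambda q: [("login.py", 0.7), ("db.py", 0.3)],
)
```

Merging with `max` rather than summing keeps a document that both searches find from being double-counted.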

Basic Usage

# Index your codebase
python scripts/context_pipeline.py index --inputs ./src --db .cache/index.db

# Query with compression
python scripts/context_pipeline.py query \
  --query "How does authentication work?" \
  --db .cache/index.db \
  --json

# One-shot: index + query
python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find the database connection logic" \
  --db .cache/index.db

Configuration

All settings in settings.json:

{
  "tokenReducer": {
    "chunkSizeWords": 220,
    "embeddingModel": "jinaai/jina-embeddings-v2-base-code",
    "hybridMode": "fallback",
    "astChunkingEnabled": true,
    "textRankEnabled": true,
    "lspFeatures": {
      "importGraphEnabled": true,
      "twoHopExpansionEnabled": true
    }
  }
}

Full Configuration Reference

Each entry lists the setting, its default value, and what it controls:

  • chunkSizeWords — 220 — Target words per chunk
  • embeddingBackend — "ml" — "ml" for neural embeddings, "hash" for zero-dependency mode
  • embeddingModel — jina-v2-code — Code-optimized embeddings
  • hybridMode — "fallback" — "fallback" or "always" for vector search
  • astChunkingEnabled — true — Use tree-sitter AST parsing
  • textRankEnabled — true — Graph-based sentence scoring
  • importGraphEnabled — true — Track file dependencies
  • twoHopExpansionEnabled — true — Auto-expand referenced symbols
  • compressionWordBudget — 350 — Max words in compressed output

Zero-Dependency Mode

Run without any ML libraries:

python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find auth logic" \
  --embedding-backend hash \
  --db .cache/index.db

Apply Code Edits

python scripts/apply_diff.py --input claude_response.txt --dir ./src
python scripts/apply_diff.py --input response.txt --dry-run
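
A SEARCH/REPLACE edit applier can be sketched as follows. The marker syntax shown here (`<<<<<<< SEARCH` / `=======` / `>>>>>>> REPLACE`) is an assumption for illustration; check the plugin's docs for the exact format apply_diff.py expects.

```python
import re

def apply_search_replace(source: str, block: str) -> str:
    """Apply one SEARCH/REPLACE edit block to source text (illustrative sketch)."""
    m = re.search(
        r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
        block,
        re.DOTALL,
    )
    if m is None:
        raise ValueError("not a SEARCH/REPLACE block")
    search, replace = m.group(1), m.group(2)
    if search not in source:
        raise ValueError("search text not found in source")
    return source.replace(search, replace, 1)  # apply only the first match

edit = """<<<<<<< SEARCH
def login(user):
=======
def login(user, password):
>>>>>>> REPLACE"""
patched = apply_search_replace("def login(user):\n    ...\n", edit)
```

Replacing only the first occurrence and failing loudly when the search text is missing are what make this format safer than blind patching.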

Architecture

Technology Stack

  • Storage: SQLite with FTS5 + custom embeddings table
  • Chunking: Tree-sitter AST parsing with regex fallback
  • Embeddings: Jina Code v2 (or zero-dependency hash embeddings)
  • ANN Search: HNSW via hnswlib (with FAISS fallback)
  • Compression: TextRank + query-relevance scoring
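
The TextRank stage can be sketched as power iteration over a word-overlap graph of sentences. This compact version omits the query-relevance term the stack describes; weights and constants are illustrative.

```python
import math
import re

def textrank(sentences: list[str], damping: float = 0.85, iters: int = 30) -> list[float]:
    """Score sentences by power iteration on a word-overlap similarity graph."""
    toks = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    # Edge weight: shared-word count, damped by sentence lengths.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and toks[i] and toks[j]:
                denom = math.log(len(toks[i]) + 1) + math.log(len(toks[j]) + 1)
                w[i][j] = len(toks[i] & toks[j]) / denom
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(
                w[j][i] / sum(w[j]) * scores[j]
                for j in range(n)
                if sum(w[j]) > 0 and w[j][i] > 0
            )
            for i in range(n)
        ]
    return scores

# The off-topic sentence shares no words with the rest and ranks lowest.
sentences = ["auth uses tokens", "tokens expire after login",
             "the weather is nice", "login issues tokens"]
scores = textrank(sentences)
```

Keeping only the top-scoring sentences up to the compressionWordBudget is what turns these scores into a summary.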

Repository Structure

token-reducer/
├── .claude-plugin/plugin.json
├── .mcp.json
├── .env.example
├── settings.json
├── requirements-optional.txt
├── scripts/
├── hooks/
├── commands/
├── agents/
├── skills/
└── evals/

Contributing

Contributions are welcome.
Please see contribute.md for contribution guidelines.

git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
pip install -e ".[dev]"
python scripts/context_pipeline.py self-test

License

MIT License — see LICENSE for details.




Star this repo if Token Reducer saves you money!

Report Bug · Request Feature · Discussions
