token-saver

agent
Security Audit
Warning
Health: Warning
  • License: NOASSERTION
  • Description: Repository has a description
  • Active repo: Last push 0 days ago
  • Low visibility: Only 6 GitHub stars
Code: Warning
  • Code scan incomplete: No supported source files were scanned during light audit
Permissions: Passed
  • Permissions: No dangerous permissions requested


SUMMARY

OpenClaw skill that reduces token consumption by 30-60% through context compression and smart optimization

README.md

Token Saver

Quality-aware token efficiency for AI agents

Diagnose waste. Optimize safely. Prove that each successful task costs less.

License: MIT
Project status
OpenClaw Skill
Contributions welcome

Quick start · Why Token Saver · Roadmap · Architecture · Benchmarks · Contributing

English · 中文说明


Token Saver is an open-source project for reducing avoidable LLM and AI-agent cost without hiding quality regressions behind a compression percentage.

Most token tools focus on one layer: command-output compression, usage dashboards, model routing, or code retrieval. Token Saver is being built as a quality-aware efficiency control layer that connects four steps:

Observe waste → Optimize safely → Verify task quality → Reconcile real cost

The current release is a lightweight OpenClaw skill. The next product milestone is Token Saver Doctor, a local-first CLI that identifies repeated context, cache-breaking prompt drift, oversized tool schemas, noisy logs, and other "ghost token" patterns.
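
As a rough illustration of what "repeated context" detection could look like, the sketch below hashes each paragraph of each message and reports paragraphs that appear in more than one message. It is not Token Saver Doctor's implementation: the function name `find_repeated_chunks`, the paragraph-level granularity, and the `min_chars` threshold are all assumptions made for this example.

```python
import hashlib
from collections import Counter


def find_repeated_chunks(messages: list[str], min_chars: int = 40) -> dict[str, int]:
    """Return paragraphs that occur in more than one message, with their counts.

    Hypothetical helper for illustration; not part of any published Token Saver API.
    """
    seen = Counter()
    text_by_hash: dict[str, str] = {}
    for message in messages:
        # Use a set so a paragraph is counted at most once per message.
        paragraphs = {p.strip() for p in message.split("\n\n")}
        for paragraph in paragraphs:
            if len(paragraph) < min_chars:
                continue  # Skip short fragments that are cheap to resend.
            digest = hashlib.sha256(paragraph.encode("utf-8")).hexdigest()
            seen[digest] += 1
            text_by_hash[digest] = paragraph
    return {text_by_hash[h]: n for h, n in seen.items() if n > 1}


if __name__ == "__main__":
    rules = "System rules: always answer briefly and cite the file you read."
    for text, count in find_repeated_chunks([rules + "\n\nQ1", rules + "\n\nQ2"]).items():
        print(count, text)
```

Exact-match hashing only catches verbatim repeats. Near-duplicates, such as a file resent after a one-line change, would need a diff-based or fuzzy approach.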

Why Token Saver

Token waste is not a single compression problem.

| Waste source | Typical symptom | Safer response |
| --- | --- | --- |
| Repeated context | The same files, instructions, or history are sent again | Deduplicate or send only the delta |
| Prompt-cache misses | Stable instructions move or change between requests | Normalize and stabilize the prompt prefix |
| Tool-schema bloat | Dozens of unused MCP tools enter every request | Lazy-load only relevant tools |
| Noisy tool output | Logs, test output, JSON, and file trees dominate context | Apply structure-aware, reversible compaction |
| Over-compression | The agent loses detail and repeats tools or rereads files | Detect rework and fail open to the original |
| Wrong model or effort | Routine steps use an unnecessarily expensive model | Route by task risk, not prompt length alone |
| Invisible billing drift | Local estimates do not match provider usage | Reconcile against provider-reported usage |
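
To make the prompt-cache row concrete, here is a minimal sketch that measures how much of a request repeats the start of the previous one. It assumes that prompt caches reuse a matching prefix, and it compares characters rather than tokens, so treat the ratio as a rough proxy. The function name `shared_prefix_ratio` is hypothetical.

```python
import os.path


def shared_prefix_ratio(previous_prompt: str, current_prompt: str) -> float:
    """Fraction of the current prompt that repeats the start of the previous prompt."""
    if not current_prompt:
        return 0.0
    # os.path.commonprefix compares character by character, not path by path.
    shared = os.path.commonprefix([previous_prompt, current_prompt])
    return len(shared) / len(current_prompt)


if __name__ == "__main__":
    rules = "You are a coding agent.\nRules: be concise.\n"
    # Volatile data placed after the stable rules keeps the prefix intact.
    stable = shared_prefix_ratio(rules + "Time: 10:00", rules + "Time: 10:05")
    # Volatile data placed first breaks the prefix almost immediately.
    drifting = shared_prefix_ratio("Time: 10:00\n" + rules, "Time: 10:05\n" + rules)
    print(f"stable layout: {stable:.2f}, drifting layout: {drifting:.2f}")
```

A low ratio between consecutive requests suggests that volatile content, such as a timestamp, sits ahead of the stable instructions.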

The differentiator: cost per successful task

A smaller prompt is not automatically a cheaper task. If compression removes a required detail and the agent reruns a tool, rereads a file, or produces a wrong answer, the apparent saving disappears.

Token Saver's target metric is therefore:

Cost per successful task
= total provider cost, including retries and rework
  ÷ successfully completed tasks
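
The metric can be computed in a few lines. The sketch below is only an illustration: the `TaskRun` type and the dollar figures are invented for the example and are not measured results.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    cost_usd: float   # Provider cost for the task, including retries and rework.
    succeeded: bool   # Whether the task finished successfully.


def cost_per_successful_task(runs: list[TaskRun]) -> float:
    """Total provider cost divided by the number of successful tasks."""
    total_cost = sum(run.cost_usd for run in runs)
    successes = sum(1 for run in runs if run.succeeded)
    if successes == 0:
        return float("inf")  # Money was spent but nothing succeeded.
    return total_cost / successes


if __name__ == "__main__":
    # Made-up numbers: compression lowers the cost per task but half the tasks fail.
    baseline = [TaskRun(0.10, True)] * 4
    compressed = [
        TaskRun(0.06, True),
        TaskRun(0.06, True),
        TaskRun(0.06, False),
        TaskRun(0.06, False),
    ]
    print(round(cost_per_successful_task(baseline), 2))    # 0.1
    print(round(cost_per_successful_task(compressed), 2))  # 0.12
```

In this invented case each compressed run is 40% cheaper, yet the cost per successful task rises because failed runs still incur provider cost.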

The long-term product is designed around three connected layers:

  1. Doctor: find where tokens are being wasted.
  2. Gateway: apply safe, reversible optimizations with fail-open behavior.
  3. Proof Ledger: compare original input, optimized input, provider usage, retries, latency, and task outcome.

Project status

[!IMPORTANT]
Token Saver is currently an early-stage OpenClaw skill, not yet a universal proxy or production cost-control platform.

Available now

  • Task-complexity guidance for model selection
  • Context hygiene for long conversations
  • Concise-response rules
  • Tool-call and file-read discipline
  • OpenClaw-compatible SKILL.md

In development

  • token-saver doctor for local transcript and configuration analysis
  • A unified usage ledger backed by provider receipts
  • Prompt-prefix and cache-hit diagnostics
  • Repeated-context and repeated-file-read detection
  • Reversible log and tool-output compaction
  • Quality-aware replay and regression benchmarks
  • Adapters for Claude Code, Codex, OpenCode, OpenClaw, Hermes, and other agents

See the full roadmap.
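
For a sense of what the repeated-file-read detection item could involve, the sketch below counts how often each path is read in a list of tool-call events. The event shape (dictionaries with `tool` and `path` keys) and the tool name `read_file` are assumptions for this example; real agent transcripts differ by agent and would need their own parsers.

```python
from collections import Counter


def repeated_file_reads(events: list[dict], read_tool: str = "read_file") -> dict[str, int]:
    """Return the paths that were read more than once, with their read counts.

    The event format is hypothetical and stands in for a parsed agent transcript.
    """
    reads = Counter(
        event["path"]
        for event in events
        if event.get("tool") == read_tool and "path" in event
    )
    return {path: count for path, count in reads.items() if count > 1}


if __name__ == "__main__":
    transcript = [
        {"tool": "read_file", "path": "src/app.py"},
        {"tool": "run_tests"},
        {"tool": "read_file", "path": "src/app.py"},
        {"tool": "read_file", "path": "README.md"},
    ]
    print(repeated_file_reads(transcript))  # {'src/app.py': 2}
```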

Quick start

Install as an OpenClaw skill

openclaw skills install token-saver

Or install manually:

mkdir -p ~/.openclaw/workspace/skills/token-saver
cp SKILL.md ~/.openclaw/workspace/skills/token-saver/SKILL.md

The skill acts as an advisory policy for every turn. It encourages the agent to choose an appropriate model, avoid repeated reads, compress resolved context, batch tool calls, and keep output proportional to the task.

Actual savings depend on the model, provider, agent, task mix, cache behavior, and whether the host supports model switching. Token Saver does not present estimated percentages as measured results.

What makes this different

| Product category | Usually measures | Usually misses | Token Saver direction |
| --- | --- | --- | --- |
| Output compressor | Tokens before vs. after compression | Retries, lost detail, task failure | Compression plus rework detection and replay |
| Token dashboard | Historical usage and estimated cost | Automatic remediation | Diagnosis linked to executable fixes |
| Model router | Price per request | Task quality and downstream rework | Risk-aware routing with outcome tracking |
| Code index / MCP | Fewer full-file reads | Logs, chat history, billing, non-code workflows | Cross-layer waste analysis |
| Prompt skill | Better agent behavior | Real request interception and provider receipts | Skill today; measurable runtime layer next |

Planned architecture

Claude Code / Codex / OpenCode / OpenClaw / Hermes / custom agents
                              │
                         Agent adapters
                              │
                 ┌────────────┴────────────┐
                 │      Token Saver        │
                 │                         │
                 │  Meter    Optimizer     │
                 │  Doctor   Quality Guard │
                 │  Cache    Replay        │
                 └────────────┬────────────┘
                              │
                    Local Proof Ledger
                              │
                         LLM provider

Design principles:

  • Local-first: session analysis and optimization metadata stay on the user's machine by default.
  • Fail-open: unknown formats or optimization errors forward the original request.
  • Reversible where possible: compressed content keeps a path back to the original.
  • Provider receipts win: provider-reported usage is the accounting source of truth.
  • Quality before compression ratio: no optimization is successful if task quality falls.
  • Adapter-based: agent-specific integrations remain separate from the core event model.
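
The fail-open and reversible principles can be sketched together. The code below is an illustration under stated assumptions, not the planned gateway: `collapse_repeated_lines`, `optimize_fail_open`, and the in-memory `archive` dictionary are hypothetical names, and a real system would need durable storage and token-based size checks.

```python
import hashlib
from itertools import groupby
from typing import Callable


def collapse_repeated_lines(text: str) -> str:
    """Replace each run of identical consecutive lines with one annotated line."""
    out = []
    for line, group in groupby(text.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)


def optimize_fail_open(
    original: str,
    optimizer: Callable[[str], str],
    archive: dict[str, str],
) -> str:
    """Return the optimized text, or the original if the optimizer fails or does not help."""
    try:
        optimized = optimizer(original)
    except Exception:
        return original  # Fail open: forward the request unchanged.
    if len(optimized) >= len(original):
        return original  # No saving, so there is no reason to alter the input.
    # Keep a path back to the original, keyed by a hash of the optimized text.
    key = hashlib.sha256(optimized.encode("utf-8")).hexdigest()
    archive[key] = original
    return optimized


if __name__ == "__main__":
    archive: dict[str, str] = {}
    log = "\n".join(["connection timed out, retrying"] * 5 + ["done"])
    print(optimize_fail_open(log, collapse_repeated_lines, archive))
```

Storing the original under a hash of the optimized text means that anything holding only the compacted output can still look up the full input.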

Read the architecture document.

Benchmark policy

Token Saver will publish claims only when they are reproducible. Every benchmark must report:

  • fixed model and task set;
  • baseline and optimized runs;
  • original tokens, optimized tokens, and provider-reported usage;
  • task success rate;
  • retries and repeated tool calls;
  • wall-clock time;
  • cost per successful task.

Initial benchmark tracks are defined in docs/BENCHMARKS.md:

  • codebase exploration;
  • bug fixing with executable tests;
  • long-running document conversations;
  • CI and build-log diagnosis;
  • support-ticket history analysis.

Roadmap snapshot

| Phase | Deliverable | Status |
| --- | --- | --- |
| 0 | OpenClaw token-efficiency skill | Available |
| 1 | Local token-saver doctor and waste taxonomy | Next |
| 2 | Usage ledger and provider reconciliation | Planned |
| 3 | Safe gateway: deduplication, cache alignment, reversible compaction | Planned |
| 4 | Quality replay, regression detection, and cost-per-success dashboards | Planned |
| 5 | Self-hosted CI, batch, audit, and enterprise controls | Exploring |

Target integrations

| Integration | Current | Planned role |
| --- | --- | --- |
| OpenClaw | ✅ | Skill and runtime adapter |
| Claude Code | ◻️ | Transcript analysis, hooks, gateway adapter |
| OpenAI Codex | ◻️ | Session analysis and Responses API adapter |
| OpenCode | ◻️ | Usage ingestion and runtime adapter |
| Hermes Agent | ◻️ | Plugin and session analysis |
| Cursor / Cline / Roo Code | ◻️ | Proxy and telemetry adapters |
| MCP clients | ◻️ | Tool-schema diagnosis and lazy loading |

Repository structure

.
├── SKILL.md                  # Current OpenClaw skill
├── README.md                 # Canonical English documentation
├── README_CN.md              # Short Chinese overview
├── ROADMAP.md                # Product milestones
├── CONTRIBUTING.md           # Contribution guide
├── llms.txt                  # AI-readable project index
├── docs/
│   ├── ARCHITECTURE.md       # Planned system design
│   └── BENCHMARKS.md         # Reproducible evaluation policy
└── references/
    └── token-pricing.md      # Background cost notes

Contributing

The most useful contributions now are:

  • anonymized token-waste patterns from real agent sessions;
  • parsers for Claude Code, Codex, OpenCode, OpenClaw, or Hermes logs;
  • reproducible benchmark tasks;
  • safe, deterministic compaction recipes;
  • provider usage and cache-accounting tests;
  • documentation and installation feedback.

Read CONTRIBUTING.md before opening a pull request.

Search keywords

AI agent token optimization · LLM cost optimization · context engineering · context compression · prompt caching · token usage · AI FinOps · MCP optimization · Claude Code · OpenAI Codex · OpenClaw · Hermes Agent · Cursor · RAG optimization · local-first AI · reversible compression · agent observability

License

MIT; see LICENSE.
