context-os

skill
Security Audit: Fail
Health: Warn
  • License: MIT
  • Description: repository has a description
  • Active repo: last push 0 days ago
  • Low visibility: only 6 GitHub stars
Code: Fail
  • rm -rf: recursive force deletion command in .github/workflows/ci.yml
Permissions: Pass
  • Permissions: no dangerous permissions requested
Purpose
This is an installer that applies a curated set of token-optimization configurations to your Claude Code environment. It writes various settings files, slash commands, and Python/Rust hooks directly into your project to reduce context noise and API costs.

Security Assessment
Risk Rating: Medium. The installation method explicitly requires piping a remote bash script directly into your shell (`curl ... | bash`), which is a known security risk because it executes arbitrary code without prior inspection. Once installed, it embeds configuration files, including a `settings.json` permission auto-grant allowlist for reading, grepping, and running tests. The automated scan caught a recursive force deletion command (`rm -rf`) inside the CI workflow. While this is a common CI cache-cleaning practice, it warrants a manual code review. No hardcoded secrets or dangerous runtime permissions were detected.

Quality Assessment
The project appears to be very new and has low visibility, with only 6 GitHub stars. It is actively maintained, with the most recent push on the day of the scan. It uses the permissive MIT license, and its README documents every optimization technique it applies. The codebase has minimal dependencies: the hooks require only Python 3, and the Rust binary is optional.

Verdict
Use with caution — the tool is active and well-documented, but the `curl | bash` delivery method and newness of the project mean you should read the setup script thoroughly before running it in a sensitive environment.
SUMMARY

Every proven Claude Code token optimization in one curl command — response shaping, noise filtering, Haiku subagent, output compression hooks, session memory. Zero dependencies. Reversible. Works with Pro, Max, API.

README.md

Context OS

Context OS is a curated set of token optimizations for Claude Code, packaged as an idempotent, reversible installer. It writes CLAUDE.md, .claudeignore, settings.json, slash commands, an output style, a statusLine, a Haiku subagent, and three zero-dependency Python hooks (dedup guard, loop guard, session profiler) into your project. Not a wrapper, not a proxy, no runtime dependency besides python3 for the hooks (with an optional Rust binary for two additional hook-based techniques).

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash

What it installs

Nineteen techniques, grouped by delivery mechanism. The Evidence column states where each number comes from and flags the entries that have not been measured yet.

| # | Technique | Mechanism | Evidence |
|---|-----------|-----------|----------|
| 1 | Response shaping | CLAUDE.md directives (drop preamble, recap, tool announcements) | Third-party benchmark (caveman); ablation pending |
| 2 | Output style terse | .claude/output-styles/terse.md invoked via /output-style terse | Documented behavior |
| 3 | Noise filtering | .claudeignore with 60+ patterns (node_modules, dist, .next, target) | Measured per-repo via --measure; end-to-end in METHODOLOGY.md |
| 4 | Secret exclusion | .claudeignore blocks .env, *.pem, credentials.json, SSH/AWS | Documented behavior |
| 5 | Repo map + stack hints | CLAUDE.md block generated from stack detection | Ablation pending |
| 6 | Thinking budget cap | MAX_THINKING_TOKENS=8000 in settings.json | Documented env var |
| 7 | Early compaction | CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=80 (default 95) | Documented env var |
| 8 | Prompt caching 1h TTL | ENABLE_PROMPT_CACHING_1H=1 | Documented env var |
| 9 | Non-essential traffic off | CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 | Documented env var |
| 10 | Context cap | CLAUDE_CODE_MAX_CONTEXT_TOKENS=150000 | Documented env var |
| 11 | Permission auto-grant | settings.json allowlist for Read/Glob/Grep/git/test runners | Documented behavior |
| 12 | statusLine | .claude/statusline.sh (model · branch · context-os marker) | Documented behavior |
| 13 | Slash commands | /compact, /context, /ship, /cheap in .claude/commands/ | Documented behavior |
| 14 | Haiku subagent | .claude/agents/explorer.md delegates exploration to Haiku | Model pricing ratio (Sonnet:Haiku); ablation pending |
| 15 | Dedup guard | PreToolUse hook: blocks duplicate Read/Glob/Grep within 10 min | Smoke-tested in CI; session-profile reports list how many duplicates were caught |
| 16 | Loop guard | PreToolUse hook: warns at 5 edits, blocks at 8 edits on same file per session | Smoke-tested in CI; addresses a pattern called out in Claude Code best-practices |
| 17 | Session profiler | Stop hook: writes per-session token breakdown to .context-os/session-reports/; surfaces duplicate tool calls, edit loops, oversized results | Deterministic transcript parser; no telemetry phones home |
| 18 | Output compression (Rust) | PostToolUse hook wraps test/build output through typed reducers | Measured on 50-test cargo fixture (see METHODOLOGY.md §4) |
| 19 | Session memory (Rust) | PreCompact + Stop hooks write restart packet | Measured on fail-edit-pass cycle (see METHODOLOGY.md §5) |

Techniques 1–17 install via setup.sh (shell + Python stdlib only). Techniques 18–19 require the optional Rust binary.
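
To make the dedup guard (technique 15) concrete, here is a minimal sketch of how a duplicate check over a 10-minute window could work. It is not the code shipped in the repository: the function names, the fingerprinting scheme, and the in-memory `seen` dict are assumptions made for illustration.

```python
import hashlib
import json

WINDOW_SECONDS = 600  # the 10-minute window from the technique table
GUARDED_TOOLS = {"Read", "Glob", "Grep"}


def call_key(tool_name, tool_input):
    """Stable fingerprint: same tool and same arguments give the same key."""
    payload = json.dumps([tool_name, tool_input], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def should_block(seen, tool_name, tool_input, now):
    """Return True if an identical guarded call was allowed within the window.

    `seen` (hypothetical state store) maps call keys to the timestamp of the
    last allowed call and is updated in place when a call is allowed.
    """
    if tool_name not in GUARDED_TOOLS:
        return False
    key = call_key(tool_name, tool_input)
    last = seen.get(key)
    if last is not None and now - last < WINDOW_SECONDS:
        return True
    seen[key] = now
    return False
```

A real hook would also need to persist `seen` between invocations, since each hook call runs as a separate process.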

What it doesn't do

  • No LLM routing, model swapping, or prompt rewriting.
  • No proxy in front of Claude Code. Claude Code talks to Anthropic directly.
  • No telemetry. No phone-home. No analytics. Read setup.sh.
  • No attempt to outguess Anthropic's defaults where defaults are reasonable.

Install

Per-project (recommended):

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash

Global (response shaping + env vars to ~/.claude/, applies to every project):

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --global

With the Rust binary (adds techniques 18–19, output compression + session memory):

cargo install --git https://github.com/sravan27/context-os --path apps/cli
context-os init

Stack auto-detection covers: Node/TypeScript, Next.js, Python, Rust, Go, Flutter/Dart. Stack-specific hints are appended to CLAUDE.md; generic hints otherwise.

Uninstall

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --uninstall

Removes only the <!-- context-os --> block from CLAUDE.md and files Context OS wrote. Pre-existing content is preserved. Idempotent.
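
The block removal described above can be sketched as a small idempotent text transform. This is an illustration only: the README names the `<!-- context-os -->` marker, but the closing marker `<!-- /context-os -->` used here is an assumption, as is the `strip_block` helper.

```python
import re

# The opening marker comes from the README; the closing marker is hypothetical.
BLOCK_RE = re.compile(
    r"\n?<!-- context-os -->.*?<!-- /context-os -->\n?",
    re.DOTALL,
)


def strip_block(text):
    """Remove the managed block; text without a block is returned unchanged."""
    if not BLOCK_RE.search(text):
        return text
    return BLOCK_RE.sub("\n", text, count=1)
```

Running `strip_block` on its own output changes nothing, which is the property the word "idempotent" refers to.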

Measure

Dry-run estimator (no writes, no install):

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --measure

Scans the repo, counts source vs. noise files, and estimates per-session token savings from static config (noise filtering, thinking cap, response shaping, output compression). Output is an estimate from file counts, not a measurement of a live session.
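
As a rough picture of the file-counting step, the sketch below classifies relative paths as source or noise by directory name. It uses only the four directory names listed in the technique table, not the full 60+ patterns, and it deliberately stops short of a token estimate because the README does not state the conversion the script uses. `is_noise` and `count_files` are illustrative names.

```python
from pathlib import PurePosixPath

# Directory names from the noise-filtering row; the real .claudeignore
# is described as holding 60+ patterns.
NOISE_DIRS = {"node_modules", "dist", ".next", "target"}


def is_noise(path):
    """True if any directory component of the path is a known noise dir."""
    return any(part in NOISE_DIRS for part in PurePosixPath(path).parts[:-1])


def count_files(paths):
    """Return (source_count, noise_count) for an iterable of relative paths."""
    paths = list(paths)
    noise = sum(1 for p in paths if is_noise(p))
    return len(paths) - noise, noise
```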

Status check:

curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --status

Benchmarks

Methodology: docs/METHODOLOGY.md. Raw reports: python/evals/reports/.

End-to-end measurement on a 2-file fixture (/tmp/cos-bench-test: one README and one .js file) via scripts/benchmark.sh, running the identical prompt through claude --print before and after install:

| Metric | Before | After | Delta |
|--------|--------|-------|-------|
| Input tokens | 5 | 4 | −1 |
| Cached reads | 74,064 | 48,182 | −25,882 |
| Output tokens | 466 | 294 | −172 |
| Total tokens | 79,790 | 54,036 | −32.3% |
| Cost (USD, Sonnet 4.6) | $0.049 | $0.040 | −18.8% |

This is a trivial fixture. It is the floor, not the claim. A 2-file repo has almost nothing to filter; the 32% reduction comes mostly from response shaping and the thinking cap. On real repos with node_modules, dist, lockfiles, and longer sessions, the noise filtering and prompt caching contributions grow substantially. Reproduce against any repo:

git clone https://github.com/sravan27/context-os && cd context-os
scripts/benchmark.sh /path/to/your/repo --model sonnet

Requires claude on PATH. Results written to /tmp/cos-last-benchmark.json.
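
The percentages in the benchmark table can be rechecked from the published figures. The sketch below does that arithmetic; `pct_change` is an illustrative helper, not something from the repository.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent, rounded to 0.1."""
    return round((after - before) / before * 100, 1)


total = pct_change(79_790, 54_036)  # Total tokens row
cost = pct_change(0.049, 0.040)     # Cost row, using the rounded dollar figures

# Sum of the three token rows that the table breaks out.
listed_before = 5 + 74_064 + 466
listed_after = 4 + 48_182 + 294
```

The total-token change reproduces as −32.3%. From the rounded dollar figures the cost change comes out at −18.4% rather than the reported −18.8%, which would be consistent with the report using unrounded costs. The three listed token rows also sum to less than the Total row (74,535 vs. 79,790 before, 48,480 vs. 54,036 after), so the total presumably includes a category the table does not break out.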

Component-level measurements (see METHODOLOGY.md for each):

  • Output compression: 71% token reduction on 50-test cargo fixture; 48 passing tests collapsed to one line, 2 failures preserved verbatim.
  • Protected-string recall: 100% on the reducer test corpus (paths, errors, versions).
  • Concurrent writes: 5/5 PostToolUse writes captured under lockfile.
  • Compaction survival: decisions and rationale survive a fail-edit-pass cycle across compaction boundary.
  • Command pattern coverage: 42 test/build invocations matched including cd /x && RUST_BACKTRACE=1 cargo test.
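
To illustrate what matching an invocation like `cd /x && RUST_BACKTRACE=1 cargo test` involves, here is a minimal sketch. The command list and the `is_test_command` helper are hypothetical; the actual matcher and its 42 patterns live in the repository.

```python
import re
import shlex

# Hypothetical list of recognized test/build commands, for illustration only.
TEST_COMMANDS = {("cargo", "test"), ("cargo", "build"), ("npm", "test"), ("pytest",)}
ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def is_test_command(command):
    """True if the last &&-chained segment runs a known test/build command."""
    segment = command.split("&&")[-1]
    words = shlex.split(segment)
    # Drop leading VAR=value assignments such as RUST_BACKTRACE=1.
    while words and ENV_ASSIGNMENT.match(words[0]):
        words.pop(0)
    return any(tuple(words[: len(c)]) == c for c in TEST_COMMANDS)
```

This simple version splits on every `&&`, including one inside a quoted string, so it should be read as a starting point rather than a complete shell parser.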

Architecture

setup.sh is a single shell script that writes 14 config-only techniques plus 3 Python hooks. It detects stack, generates CLAUDE.md with a <!-- context-os --> block, writes .claudeignore, merges .claude/settings.json, drops slash commands / output style / statusLine / explorer subagent into .claude/, and installs .claude/hooks/{dedup_guard,loop_guard,session_profile}.py with a merged entry in .claude/settings.local.json.
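
Merging into an existing settings file, rather than overwriting it, is what lets the installer run repeatedly without clobbering user configuration. A minimal sketch of such a merge follows. It assumes user-set scalar values win over installer defaults, which the README does not confirm, and `merge_settings` is an illustrative name.

```python
def merge_settings(existing, additions):
    """Return a new dict with `additions` merged into `existing`.

    Nested dicts merge recursively, lists gain only the missing items, and
    scalar values already present are left alone (an assumed policy).
    """
    merged = dict(existing)
    for key, value in additions.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = merge_settings(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            merged[key] = current + [v for v in value if v not in current]
        elif key not in merged:
            merged[key] = value
    return merged
```

Applying the same additions twice yields the same result as applying them once, so a repeated install leaves the file unchanged.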

The Python hooks are zero-dependency (stdlib only), fail-open on any error (never break a user session), and store per-session state under ~/.context-os/state/. Each hook is auditable — cat it and read 100 lines.
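
The fail-open behavior can be sketched with the loop guard as an example. This is not the shipped hook. It assumes the hook payload is JSON with a `tool_input.file_path` field, that exit code 2 signals a block, and that the eighth edit is the first one blocked; check the Claude Code hooks documentation and the hook source before relying on any of these.

```python
import json

WARN_AT, BLOCK_AT = 5, 8  # thresholds from the loop-guard description


def loop_guard(counts, payload_text):
    """Return (exit_code, message) for one hook invocation.

    `counts` (hypothetical session state) maps file paths to edit counts.
    Any parsing or lookup error yields (0, "") so the session continues.
    """
    try:
        payload = json.loads(payload_text)
        path = payload["tool_input"]["file_path"]
        counts[path] = counts.get(path, 0) + 1
        n = counts[path]
        if n >= BLOCK_AT:
            return 2, f"{path}: {n} edits this session, blocked"
        if n >= WARN_AT:
            return 0, f"{path}: {n} edits this session, consider stepping back"
        return 0, ""
    except Exception:
        return 0, ""  # fail open: never break the user's session
```

The broad `except` is deliberate here: a guard that crashes on unexpected input should let the tool call through rather than stop the session.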

The optional Rust binary (apps/cli) installs two additional hooks wired in hooks.json:

  • PostToolUse (hooks.json:12) for test/build output compression via reducer-engine.
  • PreCompact (hooks.json:38) and Stop (hooks.json:51) for session memory handoff.
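
The idea behind the output-compression hook can be shown in a few lines, here in Python rather than the project's Rust. The sketch collapses passing `cargo test` result lines into one summary and keeps everything else verbatim. It is an assumption-laden illustration, not the `reducer-engine` implementation, and `reduce_test_output` is a made-up name.

```python
import re

# Matches cargo's per-test result lines for passing tests, e.g.
# "test module::name ... ok".
PASS_LINE = re.compile(r"^test \S+ \.\.\. ok$")


def reduce_test_output(lines):
    """Collapse passing-test lines into one summary; keep other lines verbatim."""
    kept, passed = [], 0
    for line in lines:
        if PASS_LINE.match(line):
            passed += 1
        else:
            kept.append(line)
    if passed:
        kept.insert(0, f"{passed} tests passed (collapsed)")
    return kept
```

Failure lines and panic messages pass through untouched, which is the property the protected-string recall figure is meant to check.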

Rust crates: reducer-engine (typed output compression), session-memory (handoff writer), token-estimator, config, telemetry (local-only; writes to .context-os/, never leaves machine).

Manual techniques not automated but documented in CLAUDE.md: /clear between tasks, /btw for side questions, /compact [instructions], plan mode (Shift+Tab), specific prompting, @filename references, writer/reviewer split, explicit explorer-subagent delegation.

Contributing

To add a technique, open a PR with:

  1. The config change (or hook code).
  2. A test in python/evals/ or tests/ that exercises it.
  3. An entry in the Evidence column pointing to a measurement or documented behavior. "Ablation pending" is acceptable for new entries; "this seems like it should help" is not.

Tests:

cargo test
python3 python/evals/runners/safe_mode_runner.py
python3 python/evals/runners/compaction_survival_runner.py
scripts/benchmark.sh /tmp/cos-bench-test --model sonnet

Limitations

  • Does not bypass usage limits.
  • Response shaping effectiveness varies by task: reductions of 40–65% on explanation-heavy tasks and 13–21% on structured code generation (third-party measurement; our ablation pending).
  • Hook-based techniques depend on Claude Code exposing PreToolUse, PostToolUse, PreCompact, SessionStart, Stop.
  • The <!-- context-os --> CLAUDE.md block costs ~12–15% input overhead per turn. Amortized across a session, it pays back in 1–2 turns on non-trivial repos.

License

MIT. See LICENSE.

For the Claude Code team

If you work on Claude Code at Anthropic: see docs/FOR-CLAUDE-CODE-TEAM.md for three findings and recommendations.
