context-os
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code Fail
- rm -rf — Recursive force deletion command in .github/workflows/ci.yml
Permissions Pass
- Permissions — No dangerous permissions requested
This is an installer that applies a curated set of token-optimization configurations to your Claude Code environment. It writes various settings files, slash commands, and Python/Rust hooks directly into your project to reduce context noise and API costs.
Security Assessment
Risk Rating: Medium. The installation method explicitly requires piping a remote bash script directly into your shell (`curl ... | bash`), which is a known security risk because it executes arbitrary code without prior inspection. Once installed, it embeds configuration files, including a `settings.json` permission auto-grant allowlist for reading, grepping, and running tests. The automated scan caught a recursive force deletion command (`rm -rf`) inside the CI workflow. While this is a common CI cache-cleaning practice, it warrants a manual code review. No hardcoded secrets or dangerous runtime permissions were detected.
Quality Assessment
The project is very new, evidenced by a low community footprint of only 6 stars, making it a low-visibility tool. However, it is actively maintained, with the most recent push occurring today. It uses the highly permissive MIT license and clearly documents every optimization technique it applies in its README. The codebase relies on minimal dependencies, requiring only Python 3 for its hooks.
Verdict
Use with caution — the tool is active and well-documented, but the `curl | bash` delivery method and newness of the project mean you should read the setup script thoroughly before running it in a sensitive environment.
Every proven Claude Code token optimization in one curl command — response shaping, noise filtering, Haiku subagent, output compression hooks, session memory. Zero dependencies. Reversible. Works with Pro, Max, API.
Context OS
Context OS is a curated set of token optimizations for Claude Code, packaged as an idempotent, reversible installer. It writes CLAUDE.md, .claudeignore, settings.json, slash commands, an output style, a statusLine, a Haiku subagent, and three zero-dependency Python hooks (dedup guard, loop guard, session profiler) into your project. Not a wrapper, not a proxy, no runtime dependency besides python3 for the hooks (with an optional Rust binary for two additional hook-based techniques).
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash
What it installs
Nineteen techniques, grouped by delivery mechanism. The Evidence column states where each number comes from.
| # | Technique | Mechanism | Evidence |
|---|---|---|---|
| 1 | Response shaping | CLAUDE.md directives (drop preamble, recap, tool announcements) | Third-party benchmark (caveman); ablation pending |
| 2 | Output style terse | .claude/output-styles/terse.md invoked via /output-style terse | Documented behavior |
| 3 | Noise filtering | .claudeignore with 60+ patterns (node_modules, dist, .next, target) | Measured per-repo via --measure; end-to-end in METHODOLOGY.md |
| 4 | Secret exclusion | .claudeignore blocks .env, *.pem, credentials.json, SSH/AWS | Documented behavior |
| 5 | Repo map + stack hints | CLAUDE.md block generated from stack detection | Ablation pending |
| 6 | Thinking budget cap | MAX_THINKING_TOKENS=8000 in settings.json | Documented env var |
| 7 | Early compaction | CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=80 (default 95) | Documented env var |
| 8 | Prompt caching 1h TTL | ENABLE_PROMPT_CACHING_1H=1 | Documented env var |
| 9 | Non-essential traffic off | CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 | Documented env var |
| 10 | Context cap | CLAUDE_CODE_MAX_CONTEXT_TOKENS=150000 | Documented env var |
| 11 | Permission auto-grant | settings.json allowlist for Read/Glob/Grep/git/test runners | Documented behavior |
| 12 | statusLine | .claude/statusline.sh (model · branch · context-os marker) | Documented behavior |
| 13 | Slash commands | /compact, /context, /ship, /cheap in .claude/commands/ | Documented behavior |
| 14 | Haiku subagent | .claude/agents/explorer.md delegates exploration to Haiku | Model pricing ratio (Sonnet:Haiku); ablation pending |
| 15 | Dedup guard | PreToolUse hook: blocks duplicate Read/Glob/Grep within 10 min | Smoke-tested in CI; session-profile reports list how many duplicates were caught |
| 16 | Loop guard | PreToolUse hook: warns at 5 edits, blocks at 8 edits on same file per session | Smoke-tested in CI; addresses a pattern called out in Claude Code best-practices |
| 17 | Session profiler | Stop hook: writes per-session token breakdown to .context-os/session-reports/; surfaces duplicate tool calls, edit loops, oversized results | Deterministic transcript parser; no telemetry phones home |
| 18 | Output compression (Rust) | PostToolUse hook wraps test/build output through typed reducers | Measured on 50-test cargo fixture (see METHODOLOGY.md §4) |
| 19 | Session memory (Rust) | PreCompact + Stop hooks write restart packet | Measured on fail-edit-pass cycle (see METHODOLOGY.md §5) |
Techniques 1–17 install via setup.sh (shell + Python stdlib only). Techniques 18–19 require the optional Rust binary.
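The loop guard thresholds in row 16 can be illustrated with a small counter. This is a minimal sketch, not the shipped hook: the function name `record_edit` is hypothetical, and it assumes the fifth edit to a file is the first one that warns and the eighth is the first one that is blocked.

```python
from collections import Counter

WARN_AT = 5   # threshold from row 16 of the table
BLOCK_AT = 8  # threshold from row 16 of the table


def record_edit(counts: Counter, file_path: str) -> str:
    """Count one edit to file_path and return 'allow', 'warn', or 'block'.

    Hypothetical helper: the real hook's interface is not shown in this README.
    """
    counts[file_path] += 1
    n = counts[file_path]
    if n >= BLOCK_AT:
        return "block"
    if n >= WARN_AT:
        return "warn"
    return "allow"


# Eight consecutive edits to the same file within one session.
counts = Counter()
decisions = [record_edit(counts, "src/app.py") for _ in range(8)]
# decisions: four "allow", then three "warn", then one "block"
```

Counts are keyed by file path, so edits to other files in the same session do not move a given file toward the limit.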
What it doesn't do
- No LLM routing, model swapping, or prompt rewriting.
- No proxy in front of Claude Code. Claude Code talks to Anthropic directly.
- No telemetry. No phone-home. No analytics. Read `setup.sh`.
- No attempt to outguess Anthropic's defaults where defaults are reasonable.
Install
Per-project (recommended):
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash
Global (response shaping + env vars to ~/.claude/, applies to every project):
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --global
With the Rust binary (adds techniques 18–19, output compression + session memory):
cargo install --git https://github.com/sravan27/context-os --path apps/cli
context-os init
Stack auto-detection covers: Node/TypeScript, Next.js, Python, Rust, Go, Flutter/Dart. Stack-specific hints are appended to CLAUDE.md; generic hints otherwise.
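Stack detection of this kind is commonly done by looking for well-known marker files. The sketch below is an illustration only: the marker file lists and the `detect_stacks` name are assumptions, not taken from setup.sh, which may use different signals.

```python
import tempfile
from pathlib import Path

# Assumed marker files per stack; the real installer's rules are not documented here.
MARKERS = [
    ("Next.js", ["next.config.js", "next.config.mjs", "next.config.ts"]),
    ("Node/TypeScript", ["package.json", "tsconfig.json"]),
    ("Python", ["pyproject.toml", "requirements.txt", "setup.py"]),
    ("Rust", ["Cargo.toml"]),
    ("Go", ["go.mod"]),
    ("Flutter/Dart", ["pubspec.yaml"]),
]


def detect_stacks(root) -> list:
    """Return the stacks whose marker files exist directly under root."""
    root = Path(root)
    return [
        name
        for name, files in MARKERS
        if any((root / f).exists() for f in files)
    ]


# Demo on a throwaway directory containing a Rust and a Go marker.
with tempfile.TemporaryDirectory() as tmp:
    empty = detect_stacks(tmp)           # no markers yet
    (Path(tmp) / "Cargo.toml").write_text("[package]\n")
    (Path(tmp) / "go.mod").write_text("module example\n")
    found = detect_stacks(tmp)
# empty is [] (the generic-hints case); found is ["Rust", "Go"]
```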
Uninstall
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --uninstall
Removes only the <!-- context-os --> block from CLAUDE.md and files Context OS wrote. Pre-existing content is preserved. Idempotent.
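Removing a marked block while leaving everything else untouched can be sketched as follows. The closing marker `<!-- /context-os -->` is an assumption made for this example; the README only names the `<!-- context-os -->` marker, and the actual uninstall logic lives in setup.sh.

```python
import re

START = "<!-- context-os -->"
END = "<!-- /context-os -->"  # hypothetical closing marker, assumed for this sketch

# Non-greedy match from START to the nearest END, plus one trailing newline.
_BLOCK = re.compile(re.escape(START) + r".*?" + re.escape(END) + r"\n?", re.DOTALL)


def strip_block(text: str) -> str:
    """Remove every START..END block; text outside the markers is unchanged."""
    return _BLOCK.sub("", text)


original = "# My project\n\nNotes I wrote.\n"
installed = original + START + "\n- be terse\n" + END + "\n"

once = strip_block(installed)
twice = strip_block(once)
# once == original: pre-existing content is preserved
# twice == once: running the removal again is a no-op (idempotent)
```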
Measure
Dry-run estimator (no writes, no install):
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --measure
Scans the repo, counts source vs. noise files, and estimates per-session token savings from static config (noise filtering, thinking cap, response shaping, output compression). Output is an estimate from file counts, not a measurement of a live session.
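The source-versus-noise count can be approximated with a directory walk. This is a sketch under assumptions: `NOISE_DIRS` holds only the four directory names the techniques table lists for .claudeignore, whereas the real file has 60+ patterns, and the conversion from file counts to token estimates is not reproduced.

```python
import os
import tempfile
from pathlib import Path

# Subset of noise directories named in the techniques table.
NOISE_DIRS = {"node_modules", "dist", ".next", "target"}


def count_files(root):
    """Return (source_count, noise_count) for all files under root."""
    source = noise = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = Path(dirpath).relative_to(root).parts
        if any(part in NOISE_DIRS for part in parts):
            noise += len(filenames)
        else:
            source += len(filenames)
    return source, noise


# Demo: one source file, two files inside node_modules.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "index.js").write_text("console.log('hi')\n")
    (root / "node_modules" / "left-pad").mkdir(parents=True)
    (root / "node_modules" / "left-pad" / "index.js").write_text("\n")
    (root / "node_modules" / "left-pad" / "package.json").write_text("{}\n")
    counts = count_files(root)
# counts == (1, 2)
```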
Status check:
curl -fsSL https://raw.githubusercontent.com/sravan27/context-os/main/setup.sh | bash -s -- --status
Benchmarks
Methodology: docs/METHODOLOGY.md. Raw reports: python/evals/reports/.
End-to-end measurement on a 2-file fixture (/tmp/cos-bench-test: one README and one .js file) via scripts/benchmark.sh, running the identical prompt through claude --print before and after install:
| Metric | Before | After | Delta |
|---|---|---|---|
| Input tokens | 5 | 4 | −1 |
| Cached reads | 74,064 | 48,182 | −25,882 |
| Output tokens | 466 | 294 | −172 |
| Total tokens | 79,790 | 54,036 | −32.3% |
| Cost (USD, Sonnet 4.6) | $0.049 | $0.040 | −18.8% |
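The percentage deltas can be rechecked from the Before and After columns. The sketch below uses only the figures printed in the table; `pct_change` is a helper introduced for this example.

```python
def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after, rounded to one decimal place."""
    return round((after - before) / before * 100, 1)


rows = {
    "Cached reads": (74_064, 48_182),
    "Output tokens": (466, 294),
    "Total tokens": (79_790, 54_036),
}
changes = {name: pct_change(b, a) for name, (b, a) in rows.items()}
# changes == {"Cached reads": -34.9, "Output tokens": -36.9, "Total tokens": -32.3}

cost_change = pct_change(0.049, 0.040)
# cost_change == -18.4 when computed from the rounded dollar figures
```

The total-token figure matches the table. The rounded dollar values give about −18.4% rather than −18.8%, so the table's cost delta was presumably computed from unrounded costs.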
This is a trivial fixture. It is the floor, not the claim. A 2-file repo has almost nothing to filter; the 32% reduction comes mostly from response shaping and the thinking cap. On real repos with node_modules, dist, lockfiles, and longer sessions, the noise filtering and prompt caching contributions grow substantially. Reproduce against any repo:
git clone https://github.com/sravan27/context-os && cd context-os
scripts/benchmark.sh /path/to/your/repo --model sonnet
Requires claude on PATH. Results written to /tmp/cos-last-benchmark.json.
Component-level measurements (see METHODOLOGY.md for each):
- Output compression: 71% token reduction on 50-test cargo fixture; 48 passing tests collapsed to one line, 2 failures preserved verbatim.
- Protected-string recall: 100% on the reducer test corpus (paths, errors, versions).
- Concurrent writes: 5/5 PostToolUse writes captured under lockfile.
- Compaction survival: decisions and rationale survive a fail-edit-pass cycle across compaction boundary.
- Command pattern coverage: 42 test/build invocations matched, including `cd /x && RUST_BACKTRACE=1 cargo test`.
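The output compression result in the first bullet (passing tests collapsed, failures kept verbatim) can be illustrated in a few lines. The real reducer is the Rust reducer-engine crate; this Python version is a simplified stand-in, and the summary wording and `reduce_cargo_test` name are invented for the example.

```python
import re

# Matches libtest lines such as "test module::name ... ok".
_OK_LINE = re.compile(r"^test \S+ \.\.\. ok$")


def reduce_cargo_test(output: str) -> str:
    """Collapse passing-test lines into one summary line; keep all other lines verbatim."""
    kept, passed = [], 0
    for line in output.splitlines():
        if _OK_LINE.match(line):
            passed += 1
        else:
            kept.append(line)
    if passed:
        kept.insert(0, f"{passed} tests passed (collapsed)")
    return "\n".join(kept)


# Synthetic run shaped like the fixture described above: 48 passes, 2 failures.
sample = "\n".join(
    [f"test suite::case_{i} ... ok" for i in range(48)]
    + ["test suite::bad_a ... FAILED", "test suite::bad_b ... FAILED"]
)
reduced = reduce_cargo_test(sample)
# reduced has three lines: the summary, then both FAILED lines unchanged
```

Any line the pattern does not recognize is kept as-is, so the sketch errs toward preserving output rather than dropping it.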
Architecture
setup.sh is a single shell script that writes 14 config-only techniques plus 3 Python hooks. It detects stack, generates CLAUDE.md with a <!-- context-os --> block, writes .claudeignore, merges .claude/settings.json, drops slash commands / output style / statusLine / explorer subagent into .claude/, and installs .claude/hooks/{dedup_guard,loop_guard,session_profile}.py with a merged entry in .claude/settings.local.json.
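Merging into an existing settings.json, rather than overwriting it, might look like the sketch below. The env var names and values come from the techniques table. The placement under a top-level `env` key and the rule that values the user already set take precedence are assumptions made for this example, not confirmed behavior of setup.sh.

```python
import json

# Values from rows 6-10 of the techniques table.
CONTEXT_OS_ENV = {
    "MAX_THINKING_TOKENS": "8000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "80",
    "ENABLE_PROMPT_CACHING_1H": "1",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_MAX_CONTEXT_TOKENS": "150000",
}


def merge_env(existing: dict, additions: dict) -> dict:
    """Return a copy of existing with additions merged under 'env'.

    Assumed policy: keys the user already set are left untouched.
    """
    merged = dict(existing)
    env = dict(additions)
    env.update(existing.get("env", {}))
    merged["env"] = env
    return merged


# A user who already lowered the thinking budget keeps their value.
user = {"env": {"MAX_THINKING_TOKENS": "4000"}, "model": "sonnet"}
merged = merge_env(user, CONTEXT_OS_ENV)
rendered = json.dumps(merged, indent=2)  # text that would be written back to disk
```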
The Python hooks are zero-dependency (stdlib only), fail-open on any error (never break a user session), and store per-session state under ~/.context-os/state/. Each hook is auditable — cat it and read 100 lines.
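The fail-open pattern combined with a 10-minute dedup window can be sketched with the standard library alone. Several things here are assumptions for illustration: the `tool_name` and `tool_input` argument shapes, the in-memory `seen` dict standing in for state persisted under ~/.context-os/state/, and all function names. The shipped hooks may differ.

```python
import hashlib
import json
import time

WINDOW_SECONDS = 10 * 60  # dedup window from the techniques table


def call_key(tool_name, tool_input) -> str:
    """Stable fingerprint for a tool call (assumed payload shape)."""
    blob = json.dumps([tool_name, tool_input], sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()


def is_duplicate(seen: dict, tool_name, tool_input, now=None) -> bool:
    """True if the same call was recorded less than WINDOW_SECONDS ago."""
    now = time.time() if now is None else now
    key = call_key(tool_name, tool_input)
    last = seen.get(key)
    if last is not None and now - last < WINDOW_SECONDS:
        return True
    seen[key] = now  # first sighting, or the window has expired
    return False


def fail_open(check, *args, **kwargs) -> bool:
    """Run check; on any error answer 'not a duplicate' so the session continues."""
    try:
        return check(*args, **kwargs)
    except Exception:
        return False


seen = {}
first = fail_open(is_duplicate, seen, "Read", {"file_path": "a.py"}, now=0)
repeat = fail_open(is_duplicate, seen, "Read", {"file_path": "a.py"}, now=60)
expired = fail_open(is_duplicate, seen, "Read", {"file_path": "a.py"}, now=601)
broken = fail_open(is_duplicate, None, "Read", {})  # corrupt state: seen is None
# first is False, repeat is True, expired is False, broken is False
```

Returning "not a duplicate" on error means a bug in the guard costs some extra tokens at worst, instead of blocking a legitimate tool call.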
The optional Rust binary (apps/cli) installs two additional hooks wired in hooks.json:
- PostToolUse (hooks.json:12) for test/build output compression via reducer-engine.
- PreCompact (hooks.json:38) and Stop (hooks.json:51) for session memory handoff.
Rust crates: reducer-engine (typed output compression), session-memory (handoff writer), token-estimator, config, telemetry (local-only; writes to .context-os/, never leaves machine).
Manual techniques not automated but documented in CLAUDE.md: /clear between tasks, /btw for side questions, /compact [instructions], plan mode (Shift+Tab), specific prompting, @filename references, writer/reviewer split, explicit explorer-subagent delegation.
Contributing
To add a technique, open a PR with:
- The config change (or hook code).
- A test in `python/evals/` or `tests/` that exercises it.
- An entry in the Evidence column pointing to a measurement or documented behavior. "Ablation pending" is acceptable for new entries; "this seems like it should help" is not.
Tests:
cargo test
python3 python/evals/runners/safe_mode_runner.py
python3 python/evals/runners/compaction_survival_runner.py
scripts/benchmark.sh /tmp/cos-bench-test --model sonnet
Limitations
- Does not bypass usage limits.
- Response shaping effectiveness varies by task: 40–65% on explanation-heavy, 13–21% on structured code generation (third-party measurement; our ablation pending).
- Hook-based techniques depend on Claude Code exposing PreToolUse, PostToolUse, PreCompact, SessionStart, Stop.
- The `<!-- context-os -->` CLAUDE.md block costs ~12–15% input overhead per turn. Amortized across a session, it pays back in 1–2 turns on non-trivial repos.
Acknowledgments
- JuliusBrussee/caveman — response shaping benchmark.
- Anthropic Claude Code documentation — env vars, hooks, output styles, subagents.
License
MIT. See LICENSE.
For the Claude Code team
If you work on Claude Code at Anthropic: see docs/FOR-CLAUDE-CODE-TEAM.md for three findings and recommendations.