engram

Security Audit

Overall: Warn

Health: Warn
  • License — NOASSERTION
  • Description — Repository has a description
  • Active repo — Last push today
  • Low visibility — Only 6 GitHub stars
Code: Warn
  • process.env — Environment variable access in plugins/claude-code/hooks/sessionstart.mjs
Permissions: Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool runs as a local daemon and compresses context data (such as system prompts, project instructions, and conversation history) before sending it to LLMs. Its goal is to reduce redundant tokens, saving API costs and freeing up context window space.

Security Assessment
Overall Risk: Low
The tool inherently handles sensitive data: it intercepts and processes your project code, instructions, and LLM conversation history. It runs locally and requests no dangerous system permissions. There is one minor code flag: the Claude Code integration hook accesses environment variables (`process.env`). Because Engram acts as an intermediary for your prompts, you should verify that your compressed context is not being routed to any unexpected external servers.

Quality Assessment
The project is new and lacks widespread community validation, as its low star count shows. It does appear to be actively maintained, with updates pushed as recently as today. The code is open source, but the license is reported as NOASSERTION, so verify the repository's license file before adopting it in a commercial setting.

Verdict
Use with caution: the underlying concept is useful and appears safe, but the lack of community maturity and an unasserted license warrant a manual code review before relying on it in sensitive workflows.
SUMMARY

Local-first context compression for AI coding tools. One binary saves 85-93% of redundant tokens across every LLM call.

README.md

Engram

Local-first context compression for AI coding tools. One binary saves 85-93% of redundant tokens across every LLM call.


What is Engram?

Every time an AI coding tool sends a request to an LLM, it re-sends the same context: who you are, what you're working on, your preferences, your project structure. This redundancy costs real money and eats into context windows.

Engram eliminates it. It runs locally as a lightweight daemon and compresses both your identity (CLAUDE.md, system prompts, project instructions) and your conversation context (message history, tool results, responses) across every LLM call. The result: dramatically smaller prompts, lower costs, and more room in the context window for what actually matters.

How It Works

Engram applies three compression stages:

  1. Identity compression — Verbose CLAUDE.md prose and project instructions are reduced to compact key=value codebook entries. Definitions are sent once on the first turn; subsequent turns reference keys only.
  2. Context compression — Conversation history is serialized using a learned codebook that strips JSON overhead from message objects (role=user content=... instead of full JSON).
  3. Response compression — LLM responses are compressed using provider-specific codebooks tuned to Anthropic and OpenAI output patterns.
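The codebook idea behind stages 1 and 2 can be sketched in miniature. The key names (`k0`, `k1`, …) and the serialization format below are illustrative assumptions, not Engram's actual wire format:

```python
# Toy sketch of codebook-style compression, analogous to the stages above.
# Key names and formats are hypothetical, not Engram's real protocol.

def build_codebook(identity_lines):
    """Stage 1: reduce verbose identity prose to compact key=value entries."""
    return {f"k{i}": line.strip() for i, line in enumerate(identity_lines)}

def serialize_history(messages):
    """Stage 2: strip JSON overhead from message objects."""
    return "\n".join(f"role={m['role']} content={m['content']}" for m in messages)

def first_turn_prompt(codebook, history):
    # Definitions are sent in full once, on the first turn...
    defs = "\n".join(f"{k}={v}" for k, v in codebook.items())
    return defs + "\n" + serialize_history(history)

def later_turn_prompt(codebook, history):
    # ...while subsequent turns reference the keys only.
    refs = " ".join(codebook)
    return refs + "\n" + serialize_history(history)
```

After the first turn, the identity payload shrinks from the full definitions to a short run of key references, which is where the bulk of the per-session savings would come from.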

Key Numbers

Metric                 Value
Identity compression   ~96-98% token reduction
Context compression    ~40-60% token reduction
Overall savings        85-93% per session
Startup overhead       <50ms
Memory footprint       ~30MB resident
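As a sanity check on how the per-stage figures combine into the headline number, here is a back-of-the-envelope calculation. The per-session token counts are hypothetical:

```python
# How per-stage reductions combine into an overall savings figure.
# Token counts below are hypothetical, chosen only for illustration.

def overall_savings(identity_tokens, context_tokens,
                    identity_reduction, context_reduction):
    before = identity_tokens + context_tokens
    after = (identity_tokens * (1 - identity_reduction)
             + context_tokens * (1 - context_reduction))
    return 1 - after / before

# A session dominated by re-sent identity context, e.g. 4,000 identity
# tokens vs 1,000 conversation tokens per call, at ~97% and ~50% reduction:
s = overall_savings(4000, 1000, 0.97, 0.50)
print(f"{s:.0%}")  # → 88%
```

The more a session is dominated by re-sent identity context, the closer the overall figure sits to the top of the quoted 85-93% range.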

Quick Start

# Install via Homebrew (macOS/Linux)
brew install pythondatascrape/tap/engram

# Or download a release binary
curl -fsSL https://github.com/pythondatascrape/engram/releases/latest/download/engram_$(uname -s | tr A-Z a-z)_$(uname -m | sed 's/x86_64/amd64/').tar.gz | tar xz
sudo mv engram /usr/local/bin/

# Or install from source
go install github.com/pythondatascrape/engram/cmd/engram@latest

# Set up Engram for your project
cd your-project
engram install

# See what Engram found
engram analyze

# Start the compression daemon
engram serve

CLI Reference

Command          Description
engram install   Interactive setup — detects your tools, configures integration
engram analyze   Analyze your project and show compression opportunities
engram advisor   Show optimization recommendations based on session data
engram serve     Start the compression daemon
engram status    Show daemon status, active sessions, and savings

Every command supports --help for detailed usage.

Integrations

Engram works as a plugin for AI coding tools:

Claude Code

engram install
# Engram auto-detects Claude Code and registers as an MCP plugin

Once installed, Engram compresses context automatically — no workflow changes needed.

OpenClaw

engram install
# Engram auto-detects OpenClaw and configures the integration

SDKs

For custom integrations, Engram provides thin client SDKs in three languages. All connect to the local daemon over a Unix socket.

Python:

from engram import Engram

async with await Engram.connect() as client:
    result = await client.compress({"identity": "...", "history": [], "query": "..."})

Go:

client, _ := engram.Connect(ctx, "")
defer client.Close()
result, _ := client.Compress(ctx, map[string]any{...})

Node.js:

import { Engram } from "engram";

const client = await Engram.connect();
const result = await client.compress({identity: "...", history: [], query: "..."});

See the Integration Guide for details.
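Since all three SDKs speak to the daemon over a Unix domain socket, a raw client is only a few lines. The newline-delimited JSON framing below is an assumption for illustration, not Engram's documented wire protocol:

```python
# Minimal raw Unix-socket client sketch. The framing (one JSON object per
# line) is a hypothetical example, not Engram's documented protocol.
import json
import socket

def compress_request(sock_path, payload):
    """Send one JSON request to a daemon listening on sock_path and
    return the parsed JSON response."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(json.dumps(payload).encode() + b"\n")
        # Read a single newline-terminated JSON response.
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = s.recv(4096)
            if not chunk:
                break
            buf += chunk
        return json.loads(buf)
```

A Unix socket keeps the daemon local-only by construction: nothing is exposed on a network port, and filesystem permissions on the socket path control access.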

Demo

See the Travelbound demo project for a working example that shows Engram compressing a real project's context from ~4,200 tokens to ~380 tokens.

Documentation

License

Apache 2.0 — see LICENSE.
