graymatter
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Pass
- Code scan — Scanned 5 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
This tool provides an embeddable, offline memory layer for AI agents written in Go. It allows developers to store and retrieve persistent memories using a simple three-function API, aiming to drastically reduce token usage without requiring external databases or servers.
Security Assessment
Overall risk: Low. The static code scan evaluated 5 files and found no dangerous patterns or hardcoded secrets, and the tool does not request any dangerous system permissions. It is designed to work offline and explicitly states that zero infrastructure, API keys, or accounts are required for storage, which means it should not make external network requests. However, users should be aware that the project relies on piping web requests directly into shell commands for its binary installation method, which is a common but inherently risky practice if the repository were ever compromised.
Quality Assessment
This is a very new and early-stage project with low community visibility, currently sitting at only 5 GitHub stars. That said, the underlying code quality appears solid. It is actively maintained (last pushed today), covered by a continuous integration pipeline, has good test coverage (73.5%), and uses the permissive MIT license. Developers should feel confident in the code's cleanliness but must accept the risks of relying on a single-maintainer library.
Verdict
Use with caution: the code is clean, safe, and well-structured, but the project's extremely low community adoption means it lacks the vetting and longevity typically expected for production environments.
Three lines of code to give your AI agents persistent memory. Reduce 90% token consumption while also maintaining quality.
GrayMatter
Single Go binary. Zero infra. Works with Claude Code, or any tool that calls the Anthropic Messages API.
Free, offline, no account required.
go get github.com/angelnicolasc/graymatter
mem := graymatter.New(".graymatter")
mem.Remember("agent", "user prefers bullet points, hates long intros")
ctx := mem.Recall("agent", "how should I format this response?")
// ["user prefers bullet points, hates long intros"]
Why
Every AI agent today is stateless by default. Every run starts from zero.
Mem0, Zep, Supermemory solve this — but in Python or TypeScript, and they
require a server. Go has zero production-ready, embeddable, zero-deps
memory layer for agents. That gap is GrayMatter.
~90% token reduction at 100 sessions versus full-history injection.
No Docker. No Redis. No Python. No API key required for storage.
Install
Binary (recommended):
# macOS (Apple Silicon)
curl -sSL -o graymatter.tar.gz https://github.com/angelnicolasc/graymatter/releases/download/v0.2.1/graymatter_0.2.1_darwin_arm64.tar.gz
tar -xzf graymatter.tar.gz
sudo mv graymatter /usr/local/bin/
# Windows (PowerShell)
iwr https://github.com/angelnicolasc/graymatter/releases/download/v0.2.1/graymatter_0.2.1_windows_amd64.zip -OutFile graymatter.zip
Expand-Archive graymatter.zip -DestinationPath .\graymatter_cli
Go install:
go install github.com/angelnicolasc/graymatter/cmd/graymatter@latest
Library:
go get github.com/angelnicolasc/graymatter
Library usage
Three functions. That's the entire API surface.
import "github.com/angelnicolasc/graymatter"
// Open (or create) a memory store in the given directory.
mem := graymatter.New(".graymatter")
defer mem.Close()
// Store an observation.
mem.Remember("sales-closer", "Maria didn't reply Wednesday. Third touchpoint due Friday.")
// Retrieve relevant context for a query.
ctx := mem.Recall("sales-closer", "follow up Maria")
// ctx is a []string ready to inject into a system prompt:
// ["Maria didn't reply Wednesday. Third touchpoint due Friday."]
Every method has a context-aware variant that respects deadlines and cancellation signals end-to-end — no wrappers needed:
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
if err := mem.RememberCtx(ctx, "agent", "observation"); err != nil { ... }
results, err := mem.RecallCtx(ctx, "agent", "query")
Full agent pattern
mem := graymatter.New(project.Root + "/.graymatter")
defer mem.Close()
// 1. Recall before calling the LLM.
memCtx, _ := mem.Recall(skill.Name, task.Description)
messages := []anthropic.MessageParam{
{Role: "system", Content: skill.Identity + "\n\n## Memory\n" + strings.Join(memCtx, "\n")},
{Role: "user", Content: task.Description},
}
// 2. Call your LLM.
response, _ := client.Messages.New(ctx, anthropic.MessageNewParams{...})
// 3. Remember after the run.
mem.Remember(skill.Name, extractKeyFacts(response))
Config
mem, err := graymatter.NewWithConfig(graymatter.Config{
DataDir: ".graymatter",
TopK: 8,
EmbeddingMode: graymatter.EmbeddingAuto, // Ollama → OpenAI → Anthropic → keyword
OllamaURL: "http://localhost:11434",
OllamaModel: "nomic-embed-text",
AnthropicAPIKey: os.Getenv("ANTHROPIC_API_KEY"),
OpenAIAPIKey: os.Getenv("OPENAI_API_KEY"),
DecayHalfLife: 30 * 24 * time.Hour, // 30 days
AsyncConsolidate: true,
})
CLI
graymatter init # create .graymatter/ + .mcp.json
graymatter remember "agent" "text to remember" # store a fact
graymatter remember --shared "text" # store in shared namespace (all agents)
graymatter recall "agent" "query" # print context
graymatter recall --all "agent" "query" # merge agent + shared memory
graymatter checkpoint list "agent" # show saved checkpoints
graymatter checkpoint resume "agent" # print latest checkpoint as JSON
graymatter mcp serve # start MCP server (Claude Code / Cursor)
graymatter mcp serve --http :8080 # HTTP transport
graymatter export --format obsidian --out ~/vault # dump to Obsidian vault
graymatter tui # 4-view terminal UI
graymatter run agent.md [--background] # run a SKILL.md agent file
graymatter sessions list # list managed agent sessions
graymatter plugin install manifest.json # install a plugin
graymatter server --addr :8080 # REST API server
Global flags: --dir (data dir), --quiet, --json
Observability
The REST server (graymatter server) exposes a /metrics endpoint powered by Go's standard expvar package — zero extra dependencies.
GET /metrics
{
"requests_total": {"remember": 120, "recall": 340, "healthz": 5},
"request_latency_us": {"remember": 4200, "recall": 1800},
"facts_total": {"stored": 120},
"recall_total": {"served": 340}
}
For library users, memory.StoreConfig exposes hooks for APM integration:
store, err := memory.Open(memory.StoreConfig{
DataDir: ".graymatter",
DecayHalfLife: 30 * 24 * time.Hour,
// Called after every Recall with agent ID, query, result count, and latency.
OnRecall: func(agentID, query string, n int, d time.Duration) {
metrics.RecordHistogram("graymatter.recall.latency", d.Seconds())
},
// Called after every successful Put with agent ID, fact ID, and latency.
OnPut: func(agentID, factID string, d time.Duration) {
metrics.Increment("graymatter.facts.stored")
},
// Routes internal log events to any standard logger.
Logger: slog.NewLogLogger(slog.Default().Handler(), slog.LevelDebug),
// Swap the vector backend entirely — bring your own Qdrant, pgvector, etc.
VectorBackend: myQdrantAdapter,
})
Claude Code / Cursor (MCP)
graymatter init # creates .mcp.json automatically
Claude Code detects .mcp.json automatically. Five tools become available:
| Tool | What it does |
|---|---|
memory_search |
Recall facts for a query |
memory_add |
Store a new fact |
checkpoint_save |
Snapshot current session |
checkpoint_resume |
Restore last checkpoint |
memory_reflect |
Add / update / forget / link memories (agent self-edit) |
Or add manually to your project's .mcp.json:
{
"mcpServers": {
"graymatter": {
"command": "graymatter",
"args": ["mcp", "serve"]
}
}
}
Storage
| Layer | Tech | What it holds |
|---|---|---|
| KV store | bbolt (pure Go, ACID) | Sessions, checkpoints, facts, metadata, KG |
| Vector index | chromem-go (pure Go) | Semantic embeddings, hybrid retrieval |
| Export | Markdown files | Human-readable, git-friendly, Obsidian-compatible |
Single file: ~/.graymatter/gray.db
Single folder: .graymatter/vectors/
No migrations. No schema versions. Append-only with decay-based eviction.
Embeddings
GrayMatter degrades gracefully. It works without any embedding model.
| Mode | When |
|---|---|
| Ollama (default) | Machine has Ollama running with nomic-embed-text |
| OpenAI | OPENAI_API_KEY set, Ollama not available |
| Anthropic | ANTHROPIC_API_KEY set, Ollama and OpenAI not available |
| Keyword-only | No embedding available — TF-IDF + recency, zero deps |
Auto-detection order in EmbeddingAuto mode: Ollama → OpenAI → Anthropic → keyword.
# Pull the embedding model once (Ollama):
ollama pull nomic-embed-text
# Or set an API key (OpenAI or Anthropic):
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
Memory lifecycle
Recall(agent, task) ← hybrid: vector + keyword + recency → top-8 facts
↓
Inject into system prompt ← your 3 lines of code
↓
Agent runs
↓
Remember(agent, observation) ← store key facts during/after run
↓
Consolidate() [async] ← summarise + decay + prune (LLM optional)
Consolidation is the only "smart" step. Everything else is deterministic.
Without consolidation, GrayMatter still works — it just doesn't compress over time.
Consolidation auto-enables when ANTHROPIC_API_KEY is set. To use Ollama:
cfg := graymatter.DefaultConfig()
cfg.ConsolidateLLM = "ollama"
Token efficiency
Numbers produced by go run ./benchmarks/token_count — real Recall calls,
keyword embedder, no LLM required:
| Sessions | Full injection | GrayMatter | Reduction |
|---|---|---|---|
| 1 | ~80 tokens | ~80 tokens | 0% |
| 10 | ~630 tokens | ~550 tokens | 12% |
| 30 | ~1,880 tokens | ~550 tokens | 71% |
| 100 | ~6,960 tokens | ~670 tokens | 90% |
Each "session" = one paragraph-length agent observation (~60 words).
GrayMatter always injects only the top-8 most relevant observations for the query.
With vector embeddings the recall precision improves, maintaining similar reduction ratios.
Reproduce locally:
go run ./benchmarks/token_count
Build from source
git clone https://github.com/angelnicolasc/graymatter
cd graymatter
CGO_ENABLED=0 go build -ldflags="-s -w -X main.version=dev" -o graymatter ./cmd/graymatter
Output: single static binary, ~10 MB, no runtime dependencies.
Testing
The full test suite requires no LLM and no network — every test usest.TempDir() with a keyword embedder or injected stubs. Runs clean on
Linux, macOS, and Windows in CI.
# Core library
go test -count=1 -timeout=120s ./pkg/memory/...
# CLI / server / plugins
cd cmd/graymatter && go test -count=1 -timeout=120s ./internal/...
| Package | Tests | What's covered |
|---|---|---|
pkg/memory |
42 unit tests + 3 fuzz targets | Store lifecycle, hybrid recall, RRF fusion, decay math, semaphore, concurrent writes, vector paths, dimension guard |
internal/harness |
21 | Agent file parsing, retry/backoff, session recovery |
internal/kg |
21 | Graph CRUD, entity extraction, weight decay, Obsidian export |
internal/server |
11 | All REST endpoints, concurrent remember/recall, cancelled-context requests |
internal/plugin |
10 | Install, list, remove, E2E echo plugin binary |
Fuzz targets (pkg/memory): FuzzTokenize, FuzzUnmarshalFact, FuzzKeywordScore — each with a seeded corpus so they run deterministically in CI and can be extended with go test -fuzz.
Core library coverage: 73.5% (CI gate: ≥ 70%). Measured without mocks — real bbolt + chromem-go instances in a temp directory.
Token-reduction benchmark (also zero deps):
go run ./benchmarks/token_count
What GrayMatter is NOT
- Not a framework. Not an agent runner. Not a replacement for your existing tooling.
- Not a hosted service. Not a SaaS. Not a cloud product.
- Not a knowledge base UI. Not Notion. Not Obsidian.
- Not trying to win the enterprise memory market.
It is exactly one thing: the missing stateful layer for Go CLI agents,
packaged as a library you import in two lines.
Roadmap
- Library:
Remember/Recall/Consolidate - bbolt + chromem-go storage
- Ollama + OpenAI + Anthropic + keyword-only embedding
- Hybrid retrieval (vector + keyword + recency, RRF fusion)
- CLI:
init remember recall checkpoint export run sessions plugin server - MCP server (Claude Code / Cursor) +
memory_reflectself-edit tool - Knowledge graph (entity extraction, node/edge linking, Obsidian export)
- Shared memory across agents (
--shared,--allflags,__shared__namespace) - REST API server mode (
graymatter server --addr :8080) - Plugin system (JSON line protocol,
graymatter plugin install/list/remove) - 4-view Bubble Tea TUI (Memory / Sessions / Knowledge Graph / Stats)
- Context-propagation API (
RememberCtx,RecallCtx,RecallAllCtx, …) - Pluggable
VectorStoreinterface (swap chromem-go for Qdrant, pgvector, etc.) - expvar
/metricsendpoint — zero-dep, stdlib-only observability -
OnRecall/OnPut/Loggerhooks for APM integration - Embedding dimension guard — warns on provider switch instead of silent corruption
- go.work workspace — core library imports zero TUI/CLI dependencies
- Three-platform CI (Linux, macOS, Windows) + 73.5% coverage gate
- Fuzz testing:
FuzzTokenize,FuzzUnmarshalFact,FuzzKeywordScore - Ollama-backed consolidation LLM (Ollama as summariser, not just embedder)
- WebSocket streaming for REST API
GrayMatter — v0.2.1 — April 2026
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found