MemoryPilot
Health: Passed
- License — NOASSERTION
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 12 GitHub stars
Code: Failed
- child_process — Shell command execution capability in push_transcript.js
- fs module — File system access in push_transcript.js
Permissions: Passed
- Permissions — No dangerous permissions requested
This MCP server provides AI coding assistants with persistent, searchable memory across sessions. It uses a Temporal Knowledge Graph and hybrid search to organize project knowledge while compressing context to save on API token costs.
Security Assessment
Overall Risk: Medium. The tool claims to be a pure Rust application with zero dependencies and a single binary, but rule-based scans flagged a JavaScript file (`push_transcript.js`) containing file system access and shell command execution capabilities. This discrepancy warrants caution. No hardcoded secrets were detected, and the project does not request inherently dangerous permissions. You should manually verify how this script interacts with the rest of the application and whether it introduces unintended vulnerabilities.
Quality Assessment
The project is very new and actively maintained, with repository activity as recent as today. However, community trust is currently minimal: only 12 GitHub stars. The license is labeled "Source Available" in the documentation but registered as NOASSERTION by automated scans; it is not a standard open-source license and carries usage restrictions (no forking or modification).
Verdict
Use with caution — verify the purpose and safety of the included JavaScript files before integrating this into your workflow.
The most advanced AI memory server in the world. Hybrid search, Temporal Knowledge Graph, transformer embeddings, AAAK compression (3x token savings) — pure Rust, single binary, zero dependencies.
The most advanced MCP memory server. Period.
Hybrid search (BM25 + multilingual-e5-small RRF) · 100+ languages · Temporal Knowledge Graph · AAAK compression (3x token savings) · GraphRAG · Chunked RAG · Auto-Linting · Project brain · HTTP API · Single binary · Zero API calls
Why
AI coding assistants forget everything between sessions. MemoryPilot gives them persistent, searchable memory with project awareness, semantic understanding, and automatic knowledge organization. Built-in AAAK compression reduces token consumption by 3x when loading context, saving you money on every API call.
Benchmarks
Search Quality — Real-World (500 memories, 30 scenarios)
| Metric | MemoryPilot v4.0 | MemPalace v3.1 (raw) | Quantum Memory Graph |
|---|---|---|---|
| R@5 | 100% | 96.6%¹ | 93.4% |
| R@10 | 100% | N/A | 93.4% |
| NDCG@10 | 95.6% | 88.9%¹ | 90.8% |
| Cluster Coherence | 96.7% | N/A | N/A |
| Multilingual | 100+ languages | English only | English only |
| AAAK Compression | 3x (no recall loss) | 30x (recall drops to 84.2%) | N/A |
| Avg Search Latency | ~69 ms | N/A | ~80 ms |
| Binary Size | 22 MB | ~500 MB (Python+ChromaDB) | 1.5 GB |
| Dependencies | 0 (single binary) | Python + ChromaDB + SQLite | Python + ONNX |
¹ MemPalace's 96.6% R@5 is measured on LongMemEval-s (~50 sessions per haystack, session-level retrieval). Their AAAK compression mode drops recall to 84.2%. Their benchmark tests raw ChromaDB retrieval — none of the Palace architecture (wings, rooms, closets) is exercised in the benchmark (source). MemoryPilot's scores are measured on a real multi-project memory base (500 memories across 6 projects) with all features active (GraphRAG, KG expansion, combinatorial reranker, importance scoring).
vs the best MCP memory servers:
| Feature | MemoryPilot v4.0 | MemPalace v3.1 | Mem0 |
|---|---|---|---|
| Search | Hybrid BM25 + multilingual-e5-small RRF (384-dim) | ChromaDB cosine (all-MiniLM-L6-v2) | Vector search (cloud API) |
| Embeddings | multilingual-e5-small (100+ languages, local ONNX) | all-MiniLM-L6-v2 (English only) | OpenAI API calls (external) |
| Multilingual | 100+ languages native (FR, EN, ES, DE, JA, ZH...) | English only | Depends on API |
| Knowledge Graph | Temporal triples with validity + confidence | Temporal triples (SQLite) | Basic graph (no temporal) |
| GraphRAG | Auto entity extraction + graph traversal + combinatorial reranker | No | No |
| Chunked RAG | Transcript auto-chunking + auto-distillation (8 types) | Conversation chunking by exchange | No |
| Compression | AAAK compact dialect (~3x token savings) | AAAK dialect (experimental, regresses recall to 84.2%) | No |
| Person detection | Auto-detects team members from text | No | No |
| Self-Healing | Background auto-linting loop | No | No |
| Garbage collection | Heuristic merge + scoring + orphan cleanup | No | Basic TTL |
| Project brain | Yes, with team members (<1500 tokens) | No | No |
| File watcher | Context boost from recent edits | No | No |
| Deduplication | Content hash (exact) + Jaccard 85% (fuzzy) | Basic hash | Embedding similarity |
| HTTP API | Multi-threaded REST server (optional) | No | Cloud hosted |
| Memory types | 13 types, importance 1-5 | Wings/Rooms hierarchy | 1 type |
| MCP tools | 29 tools | 19 tools | N/A |
| Privacy | 100% local, zero API calls | 100% local | Cloud dependent |
| Language | Rust (single binary, zero deps) | Python (pip install) | SaaS |
| Startup | 1-2 ms | ~5 ms | N/A (cloud) |
| Binary | 22 MB single binary | Python + ChromaDB (~500 MB installed) | SaaS |
| Storage | SQLite WAL + FTS5 + connection pool | ChromaDB | Cloud DB |
| Concurrency | Lazy embedding thread + read pool + debounced cleanup | Single-threaded | Single-threaded |
The 8 Pillars
1. Hybrid Search (BM25 + fastembed RRF)
Every memory gets a 384-dimension transformer embedding on insert via fastembed (multilingual-e5-small, local ONNX inference — supports 100+ languages including French, English, Spanish, German, Japanese, Chinese — no API calls, no external services). Search runs both BM25 full-text and cosine similarity in parallel, then merges results with Reciprocal Rank Fusion.
Results are boosted by importance weighting, knowledge graph link density, file watcher context, and penalized for expired knowledge triples.
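As a rough illustration of the fusion step (a sketch, not the project's actual Rust code), Reciprocal Rank Fusion can be expressed in a few lines; the ranked lists here are hypothetical, and k=40 mirrors the tuned value mentioned in the optimizations section below:

```python
def rrf_merge(bm25_ranked, vector_ranked, k=40):
    """Merge two ranked lists of memory IDs with Reciprocal Rank Fusion.

    Each ID scores 1 / (k + rank) per list it appears in; higher is better.
    """
    scores = {}
    for ranked in (bm25_ranked, vector_ranked):
        for rank, mem_id in enumerate(ranked, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists: BM25 favors m1, the vector search favors m3.
print(rrf_merge(["m1", "m2", "m3"], ["m3", "m1", "m4"]))
```

Items that rank well in both lists (here m1 and m3) float to the top even when neither list alone agrees on the ordering.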
Performance optimizations:
- Lazy embedding: `add_memory` returns instantly; embeddings are computed in a background thread
- LRU cache (64 entries): repeated search queries skip embedding computation
- Read connection pool (4 connections): concurrent vector searches don't block writes
- Content hashing (FNV-1a): backfill skips unchanged memories
2. Temporal Knowledge Graph
A full knowledge graph with temporal validity. Facts have valid_from / valid_to dates and confidence scores. When facts become outdated, they are invalidated rather than deleted — giving the AI a timeline of how knowledge evolved.
Entities (technologies, files, components, people) are automatically extracted from memory content and linked bidirectionally. Search results from memories with all-expired triples are penalized.
5 dedicated KG tools: kg_add, kg_invalidate, kg_query, kg_timeline, kg_stats
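The invalidate-rather-than-delete idea can be pictured with a minimal sketch (a hypothetical in-memory triple list, not the real SQLite schema):

```python
from datetime import date

# Hypothetical triples: each fact carries valid_from / valid_to dates.
triples = [
    {"s": "MyApp", "p": "uses", "o": "Auth0",
     "valid_from": date(2023, 1, 1), "valid_to": date(2024, 6, 1)},
    {"s": "MyApp", "p": "uses", "o": "Clerk",
     "valid_from": date(2024, 6, 1), "valid_to": None},
]

def active_triples(store, on):
    """Facts valid on a given date; expired facts remain for the timeline."""
    return [t for t in store
            if t["valid_from"] <= on and (t["valid_to"] is None or on < t["valid_to"])]

print([t["o"] for t in active_triples(triples, date(2025, 1, 1))])
```

Querying an earlier date would instead return the Auth0 fact, which is how a timeline view of evolving knowledge falls out of the same data.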
3. GraphRAG
Every memory is automatically analyzed for entities: technologies, file paths, components, projects, and people. Entities are stored in a dedicated table. Memories sharing entities are auto-linked with inferred relationship types (resolves, implements, depends_on, deprecates...).
When searching, MemoryPilot traverses the knowledge graph from the top matches to pull in related context — e.g., finding the architecture decision that led to a specific bug fix. A combinatorial reranker then selects the best cluster of connected memories rather than independent top-K results, producing cohesive context (94% cluster coherence). Tuned RRF fusion (k=40), exact term coverage boost, smart FTS tokenization, query-time KG expansion, temporal recency, and importance tiebreakers push NDCG@10 to 94% with perfect R@5/R@10.
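One way to picture the combinatorial reranker (a sketch under assumed data, not the shipped algorithm): start from the top match and greedily prefer candidates connected to what has already been selected, with a small capped boost per connection, echoing the +5%-per-connection / 15%-cap figures from the optimizations section. All IDs and scores below are hypothetical:

```python
def rerank(candidates, links, top_k=3, boost=0.05, cap=0.15):
    """Greedy cluster selection: each graph link to an already-picked
    memory adds `boost` to a candidate's base score, capped at `cap`."""
    picked = []
    pool = dict(candidates)  # id -> base relevance score
    while pool and len(picked) < top_k:
        def score(mid):
            conn = sum(1 for p in picked if (mid, p) in links or (p, mid) in links)
            return pool[mid] * (1 + min(conn * boost, cap))
        best = max(pool, key=score)
        picked.append(best)
        del pool[best]
    return picked

scores = {"a": 0.90, "b": 0.80, "c": 0.82}
links = {("a", "b")}  # b is linked to the top hit a
print(rerank(scores, links))
```

With the link present, b overtakes the slightly higher-scoring but unconnected c, which is the "cohesive cluster over independent top-K" behavior described above.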
4. Chunked RAG (Transcripts)
Save full conversation transcripts without polluting the LLM context window. The add_transcript tool automatically chunks large texts into ~2000 character blocks and links them together. Chunks are excluded from recall but fully searchable.
Auto-distillation extracts structured memories from transcripts: decision, preference, todo, bug, milestone, problem, and note. Smart disambiguation: a segment mentioning both a bug and its resolution is classified as milestone, not bug.
Supports session_id, thread_id, window_id for multi-window memory scoping.
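A naive sketch of the chunking step (the real boundary handling and linking scheme are assumptions):

```python
def chunk_transcript(text, size=2000):
    """Split a transcript into ~`size`-character blocks, chaining each
    chunk to the next by index (prev/next IDs in a real store)."""
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    return [{"index": i, "content": c,
             "next": i + 1 if i + 1 < len(chunks) else None}
            for i, c in enumerate(chunks)]

parts = chunk_transcript("x" * 4500)
print(len(parts), [p["next"] for p in parts])
```

Each chunk would then be indexed for search but excluded from `recall`, as described above.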
5. AAAK Compression
Inspired by MemPalace's symbolic memory language. When compact: true is passed to recall or get_project_brain, output is compressed ~3x using a terse, pipe-separated format:
[DEC:5] Use Clerk over Auth0 | tags:auth,stack | proj:MyApp
[PREF:4] Always use TypeScript strict mode | tags:typescript
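The compact dialect above can be approximated with a small formatter. The kind codes (DEC, PREF, ...) are taken from the examples shown; the full grammar is an assumption:

```python
KIND_CODES = {"decision": "DEC", "preference": "PREF", "bug": "BUG", "todo": "TODO"}

def to_compact(memory):
    """Render a memory record in the pipe-separated AAAK-style format."""
    parts = [f"[{KIND_CODES[memory['kind']]}:{memory['importance']}] {memory['content']}"]
    if memory.get("tags"):
        parts.append("tags:" + ",".join(memory["tags"]))
    if memory.get("project"):
        parts.append("proj:" + memory["project"])
    return " | ".join(parts)

m = {"kind": "decision", "importance": 5, "content": "Use Clerk over Auth0",
     "tags": ["auth", "stack"], "project": "MyApp"}
print(to_compact(m))  # [DEC:5] Use Clerk over Auth0 | tags:auth,stack | proj:MyApp
```

Dropping field names and whitespace is where the ~3x token saving comes from: the LLM reads the same facts in a fraction of the characters.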
6. Self-Healing (Auto-Linter)
MemoryPilot watches your files. When you save a Rust, Svelte, or TypeScript file, it lints in the background. Compilation errors are automatically stored as bug memories with the exact stack trace. When the error is fixed, the memory is auto-deleted.
The linter thread reuses a single DB connection for its entire lifetime.
7. Garbage Collection
Old, low-importance memories are scored for cleanup candidacy. Groups of related stale memories are merged into condensed summaries using heuristic keyword extraction. Orphaned links and entities are cleaned. DB is vacuumed after significant deletions.
8. Project Brain
One tool call returns a dense JSON snapshot of a project under 1500 tokens: tech stack, architecture decisions, active bugs, recent changes, key components, and team members (auto-detected person entities). Supports compact: true for AAAK compression.
Install
One-liner (recommended)
git clone https://github.com/Soflution1/MemoryPilot.git && cd MemoryPilot && ./install.sh
The installer builds MemoryPilot, installs the binary to ~/.local/bin/, detects your IDEs, and configures each one automatically.
Supported IDEs:
| IDE | Config file | Auto-configured |
|---|---|---|
| Cursor | `~/.cursor/mcp.json` | ✓ (stdio) |
| VS Code | `~/.vscode/mcp.json` | ✓ (stdio) |
| Claude Desktop | `~/Library/Application Support/Claude/claude_desktop_config.json` | ✓ (stdio) |
| Windsurf | `~/.codeium/windsurf/mcp_config.json` | ✓ (stdio) |
| Claude Code | `claude mcp add` | ✓ (CLI) |
| Codex | `codex mcp add` | ✓ (CLI) |
| ChatGPT Desktop | Settings → Apps → Create | via HTTP (see below) |
The script is idempotent — run it again to update without breaking existing MCP configs.
ChatGPT Desktop
ChatGPT requires a remote MCP endpoint. Start the HTTP server, then add it as a custom connector:
MemoryPilot --http 7437
In ChatGPT: Settings → Apps → Create → URL: http://localhost:7437/mcp
Manual install
git clone https://github.com/Soflution1/MemoryPilot.git
cd MemoryPilot
cargo build --release --features http
cp target/release/MemoryPilot ~/.local/bin/
chmod +x ~/.local/bin/MemoryPilot
xattr -cr ~/.local/bin/MemoryPilot # macOS only
Then add MemoryPilot to your IDE's MCP config manually (see table above for file paths).
How it works
MemoryPilot automatically injects a dynamic system prompt into your IDE on startup. The AI will proactively call `add_memory` in the background to store your architecture decisions, API keys, and bug fixes without manual intervention. All configured IDEs share the same memory database.
For ChatGPT or any MCP client that needs HTTP: run MemoryPilot --http to expose the Streamable HTTP endpoint at /mcp.
Or use via McpHub for SSE transport with all your other MCP servers.
First run
# If upgrading from v1 (JSON files):
MemoryPilot --migrate
# Compute embeddings for existing memories:
MemoryPilot --backfill
# Force re-embed all (skips unchanged via content hash):
MemoryPilot --backfill-force
MCP Tools (29)
Core
| Tool | Description |
|---|---|
| `recall` | Start here. Loads all context in one shot: project memories, scoped thread/window memories, preferences, critical facts, patterns, decisions, global prompt. Supports `mode` = safe/default/full and `compact` = true for AAAK compression. |
| `get_project_brain` | Instant project summary (<1500 tokens): tech stack, architecture, bugs, recent changes, components, team members. Supports `compact` = true. |
| `search_memory` | Hybrid BM25 + fastembed RRF search, boosted by importance, graph links, and file watcher context. Batched triple scoring. |
| `get_file_context` | Memories related to recently modified files in the working directory. |
Memory CRUD
| Tool | Description |
|---|---|
| `add_memory` | Store with lazy embedding, auto-dedup (exact hash + Jaccard 85%), auto entity extraction, auto graph linking. Importance 1-5, TTL. |
| `add_memories` | Bulk add multiple memories in one call with per-item dedup. |
| `add_transcript` | Store a long transcript as a chunked archive; auto-distill structured memories (decision, preference, todo, bug, milestone, problem, note). |
| `get_memory` | Retrieve by ID. |
| `update_memory` | Update content, kind, tags, importance, TTL. Skips re-embedding if content is unchanged (hash check). |
| `delete_memory` | Delete by ID (cascades to entities and links). |
| `list_memories` | List with project/kind filters and pagination. |
Knowledge Graph
| Tool | Description |
|---|---|
| `kg_add` | Add a fact triple (subject → predicate → object) with optional validity period and confidence score. |
| `kg_invalidate` | Mark a triple as expired (sets `valid_to`), preserving history. |
| `kg_query` | Query all triples related to an entity, with temporal filtering and direction control. |
| `kg_timeline` | Chronological history of all triples involving an entity. |
| `kg_stats` | Summary statistics: total triples, active, expired, unique subjects/objects. |
Project & Config
| Tool | Description |
|---|---|
| `get_project_context` | Full project context with preferences and patterns. |
| `register_project` | Register a project with its filesystem path for auto-detection. |
| `list_projects` | List projects with memory counts. |
| `get_stats` | DB statistics: totals, by kind, by project, DB size, hygiene signals. |
| `get_global_prompt` | Auto-discover GLOBAL_PROMPT.md from `~/.MemoryPilot/` or the project root. |
| `export_memories` | Export as JSON or Markdown with importance stars. |
| `set_config` | Set config values (e.g. `global_prompt_path`). |
Maintenance
| Tool | Description |
|---|---|
| `run_gc` | Garbage collection: merge old memories, clean orphans, vacuum. Supports `dry_run`. |
| `cleanup_expired` | Remove expired TTL memories (debounced; runs at most once per 60 s). |
| `benchmark_recall` | Recall quality benchmark with golden scenarios. |
| `benchmark_search` | Search quality benchmark: R@5, R@10, NDCG@10, cluster coherence, latency. |
| `migrate_v1` | Import from v1 JSON files. |
Memory Types
fact · preference · decision · pattern · snippet · bug · credential · todo · note · milestone · architecture · problem · transcript_chunk
Each memory has importance (1-5), optional TTL, tags, project scope, content hash, and auto-generated embedding + entity links.
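The dedup combination described in the tool table (exact content hash plus fuzzy Jaccard at 85%) can be sketched as follows. Token-level Jaccard and the SHA-256 hash are assumptions for illustration; the server itself hashes with FNV-1a:

```python
import hashlib

def is_duplicate(new, existing, threshold=0.85):
    """Exact match via content hash, fuzzy match via Jaccard word overlap."""
    if hashlib.sha256(new.encode()).digest() == hashlib.sha256(existing.encode()).digest():
        return True  # byte-identical content
    a, b = set(new.lower().split()), set(existing.lower().split())
    return len(a & b) / len(a | b) >= threshold if a | b else True

# Near-duplicate: 6 shared words out of 7 total -> Jaccard ~0.857 >= 0.85
print(is_duplicate("use Clerk for auth in MyApp",
                   "use Clerk for auth in MyApp today"))
```

The exact-hash check is cheap and catches re-submissions; the Jaccard pass catches lightly reworded repeats that a hash would miss.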
CLI
MemoryPilot # Start MCP stdio server
MemoryPilot --backfill # Compute missing embeddings
MemoryPilot --backfill-force # Re-embed all (skips unchanged via hash)
MemoryPilot --benchmark-recall # Run recall quality benchmark
MemoryPilot --benchmark-search # Search quality: R@5, R@10, NDCG@10, cluster coherence
MemoryPilot --http 7437 # Start HTTP REST server (requires --features http)
MemoryPilot --migrate # Import v1 JSON data
MemoryPilot --version # Show version
MemoryPilot --help # Show help
HTTP API
When built with --features http, MemoryPilot exposes a multi-threaded REST API (4 worker threads, each with its own DB connection):
# Health check
curl http://localhost:7437/health
# Call any MCP tool
curl -X POST http://localhost:7437/tools/call \
-H 'Content-Type: application/json' \
-d '{"name": "search_memory", "arguments": {"query": "auth setup", "limit": 5}}'
Architecture
src/main.rs — CLI + MCP stdio server + file watcher init + HTTP server init
src/db.rs — SQLite engine: hybrid search, CRUD, KG, GC, brain, recall, lazy embed, connection pool
src/tools.rs — 29 MCP tool definitions + handlers
src/protocol.rs — JSON-RPC types
src/embedding.rs — fastembed (multilingual-e5-small) transformer embeddings, LRU cache
src/graph.rs — Entity extraction (tech, files, components, people) + relation inference + graph traversal
src/gc.rs — GC scoring, heuristic memory merging, stopwords
src/watcher.rs — File system watcher + auto-linter with persistent DB connection
src/http.rs — Optional multi-threaded HTTP REST server (feature-gated)
Database Schema
memories — id, content, kind, project, tags, importance, embedding (BLOB),
content_hash, expires_at, last_accessed_at, access_count, metadata
memories_fts — FTS5 virtual table (content, tags, kind, project)
memory_entities — memory_id, entity_kind, entity_value, valid_from, valid_to
memory_links — source_id, target_id, relation_type, valid_from, valid_to, confidence
knowledge_triples — id, subject, predicate, object, valid_from, valid_to, confidence, source_memory_id
projects — name, path, description
config — key/value store
Performance
| Metric | Value |
|---|---|
| Binary size | 22 MB |
| Startup | 1-2 ms |
| Search (hybrid RRF + reranker) | ~10 ms (500 memories) |
| `add_memory` latency | <1 ms (lazy embed) |
| Embedding quality | Transformer 384-dim (multilingual-e5-small, 100+ languages) |
| Backfill (1000 memories) | ~30s (skips unchanged via hash) |
| RAM | ~15 MB |
| Read concurrency | 4 pooled connections |
| Runtime dependencies | None (ONNX bundled) |
Optimizations
- Lazy embedding: `add_memory` inserts with a `NULL` embedding; a background thread computes and updates it asynchronously
- Content hashing (FNV-1a): `--backfill-force` skips memories whose content hasn't changed
- LRU embedding cache (64 entries): repeated search queries reuse cached embeddings
- Read connection pool (4 connections): concurrent vector searches don't block writes
- WAL mode: SQLite Write-Ahead Logging for concurrent read/write
- Batched scoring: knowledge triple counts and link boosts fetched in single queries, not N+1
- Debounced cleanup: expired memory cleanup runs max once per 60 seconds
- Prepared statements: graph traversal prepares SQL once, not per node
- Tuned RRF fusion: k=40 for sharper top-K discrimination vs standard k=60
- Exact term coverage boost: +10% when 80%+ of query terms appear in memory content
- Combinatorial reranker: greedy subgraph selection, conservative +5% per connection (cap 15%)
- KG query expansion: post-retrieval scoring boost from knowledge graph related terms (+4% per entity, cap 15%)
- Temporal recency: gentle +5% for memories from last 3 days, decaying over 30 days
- Importance tiebreaker: ±3% per level — never overrides relevance signal
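The recency and importance tiebreakers can be illustrated numerically. This is a sketch matching the stated magnitudes (+5% up to 3 days, decaying to zero at 30 days; ±3% per importance level), not the actual scoring code:

```python
def boost(score, age_days, importance):
    """Apply the recency and importance multipliers to a relevance score.
    Importance 3 is treated as the neutral midpoint (an assumption)."""
    if age_days <= 3:
        recency = 0.05
    elif age_days < 30:
        recency = 0.05 * (30 - age_days) / 27  # linear decay over days 3..30
    else:
        recency = 0.0
    return score * (1 + recency) * (1 + 0.03 * (importance - 3))

print(round(boost(1.0, 1, 5), 4))   # fresh, high-importance memory
print(round(boost(1.0, 60, 3), 4))  # old, average-importance memory
```

Both multipliers stay within a few percent of 1.0, which is why they break ties without overriding the relevance signal.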
Run Benchmarks Yourself
MemoryPilot --benchmark-search --scenario-limit 30 # R@5, R@10, NDCG@10, cluster coherence, latency
MemoryPilot --benchmark-recall --scenario-limit 12 # top1/top5 hit rate, cross-project leak, credential safety
Storage
- Database: `~/.MemoryPilot/memory.db`
- Global prompt: `~/.MemoryPilot/GLOBAL_PROMPT.md`
- Fastembed model cache: `~/.fastembed_cache/` (downloaded on first run)
License
Soflution Source Available License — free to use, not to fork or modify. See LICENSE for details.
Built by SOFLUTION LTD