Midas
Health: Warning
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code: Passed
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Passed
- Permissions — No dangerous permissions requested
Local-first, eval-first memory for long-horizon AI agents — no LLM at ingest. Python SDK + MCP server with source-traceable recall, belief revision, selective forgetting, and reproducible benchmarks.
Midas is a small Python SDK (and an MCP server) that gives AI agents durable memory across long,
multi-session work — coding agents, research agents, assistants — without sending every turn through
an LLM to "extract" facts. It runs on your machine, costs nothing per message, and every recalled memory
is traceable to its source.
- No LLM at ingest or query → $0 API spend, zero data egress, fast local ops (no per-turn network round-trip; ingest is embed-bound, ~tens of ms).
- Auditable provenance → recall returns the source turns, not LLM-rewritten facts.
- Stays current and bounded → belief revision, selective forgetting + tiers, dedup — all no-LLM.
- Embeddable + store-agnostic → a library, not a SaaS. Bring your own embedder/store.
- Eval-first → every claim has a reproducible benchmark (BENCHMARKS.md).
Status: early. The API may change. Built narrow and measured-first.
How it works (in plain English)
Your AI assistant forgets everything between sessions — every new chat starts from zero. Midas is a
memory that lives next to your AI, on your computer. It does four simple things:
- Notices what matters. As you work, Midas saves the durable stuff (a decision, a fact about you, a
  preference, a deadline) and ignores small talk. It judges "does this matter?" by reading the words
  (names, numbers, dates make a turn important), without calling another AI.
- Hands the right notes back. Before the AI answers, Midas finds the handful of past notes related
  to your question (by meaning, not exact keywords) and slips them into the prompt.
- Keeps the notebook honest and tidy. When something changes ("actually, use Postgres now") it
  updates the old note instead of keeping both; it merges duplicates; and it forgets old,
  unimportant trivia so memory never bloats.
- Stays yours. Everything is a local file, with no cloud and no per-message AI bill, and every note links
  back to the exact moment it came from, so you can always check why the AI "knows" something.
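The "does this matter?" judgment in the first bullet can be pictured as a handful of cheap text checks. The following is a minimal sketch, not Midas's actual scorer: the cue lists, the 1 to 5 scale, and the names `has_proper_noun` and `toy_importance` are assumptions made for illustration.

```python
import re

# Hypothetical cues for illustration; the real ContentImportance scorer may use other signals.
# "May" is left out of the month list so the verb "may" is not mistaken for a date.
DATE = re.compile(
    r"\b\d{4}-\d{2}-\d{2}\b"
    r"|\b(?:January|February|March|April|June|July|August|September|October|November|December)\b"
)
DECISION_CUE = re.compile(r"\b(?:decision|decided|deadline|prefer|prefers|must|never|always)\b", re.I)


def has_proper_noun(text: str) -> bool:
    """True if any word after the first starts with a capital letter (a rough stand-in for names)."""
    words = text.split()[1:]
    return any(len(w) > 1 and w[0].isupper() for w in words)


def toy_importance(text: str) -> int:
    """Score a turn from 1 (filler) to 5 using only local string checks, no model calls."""
    score = 1
    score += any(ch.isdigit() for ch in text)   # numbers
    score += bool(DATE.search(text))            # dates
    score += has_proper_noun(text)              # names
    score += bool(DECISION_CUE.search(text))    # decision or preference wording
    return score


for turn in (
    "lol ok cool",
    "The launch date moved to September 14.",
    "Decision: the primary database is PostgreSQL.",
):
    print(toy_importance(turn), turn)
```

Small talk stays at the bottom of the scale, while turns carrying dates, numbers, names, or decision wording climb above it, which is the property an importance floor relies on.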
The trick that makes it cheap, private, and local: Midas never sends your conversation to an AI to
"process" it. It uses fast local math (embeddings — turning text into vectors and comparing them). The
only AI involved is the one you're already talking to.
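To make "turning text into vectors and comparing them" concrete, here is a minimal sketch of vector recall. It assumes a toy hashed bag-of-words embedder, and the names `toy_embed`, `cosine`, and `top_k` are hypothetical. A real embedding model captures meaning, whereas this toy only captures shared words, so treat it as an illustration of the mechanics rather than of Midas's retrieval quality.

```python
import math
import re
import zlib

DIM = 1024  # hypothetical vector size for the toy embedder


def toy_embed(text: str) -> list[float]:
    """Hash each word into one of DIM buckets, then scale the vector to unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; for unit-length vectors this is just the dot product."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: str, notes: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k notes whose vectors are closest to the query vector."""
    q = toy_embed(query)
    scored = sorted(((cosine(q, toy_embed(n)), n) for n in notes), reverse=True)
    return scored[:k]


notes = [
    "The primary database is PostgreSQL.",
    "The launch date moved to September 14.",
    "haha yeah sounds good",
]
for score, note in top_k("which database is primary?", notes, k=1):
    print(f"{score:.2f} {note}")
```

Everything here is arithmetic on local lists, which is why this style of recall needs no network call per turn.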
Why "no LLM at ingest" matters: other memory tools call an LLM to summarize every session — you pay
in tokens forever, in latency, and by sending every turn to a provider. Midas trades that for cheap,
local, auditable retrieval.
See it remember across sessions — session 1 stores decisions; a fresh session 2 recalls them by
meaning:
Claude Code-style demo — the recalled lines (in green) are the real output Midas returned across two separate processes sharing one on-disk store.
Install
You need Python 3.11+. Check with python --version (or python3 --version). If you don't have it:
python.org/downloads, or winget install Python.Python.3.12
(Windows) · brew install python@3.12 (macOS) · your package manager (Linux). The easiest installer for
everything below is uv (one line: see its site), but pip/pipx work
too.
A) To plug Midas into an AI tool (Claude Code, Cursor, …) — install the midas-mcp command
This puts a midas-mcp program on your PATH that any MCP client can launch — one line, no clone:
uv tool install "midas-memory[mcp,local]" # recommended (Windows, macOS, Linux)
# …or: pipx install "midas-memory[mcp,local]"
Where the command lands (you'll need this path for some clients):
| OS | midas-mcp location | Find it with |
|---|---|---|
| Linux / macOS | ~/.local/bin/midas-mcp | which midas-mcp |
| Windows | %USERPROFILE%\.local\bin\midas-mcp.exe | where midas-mcp |
B) To use Midas as a Python library
pip install "midas-memory[all]" # SDK + local embeddings + MCP + LangGraph
# smaller: `pip install midas-memory` (core, zero deps) · `"…[local]"` (embeddings) · `"…[mcp]"`
(Want the source / to contribute? git clone https://github.com/vornicx/Midas && cd Midas && pip install -e ".[all,dev]".)
First run downloads the embedding model once (~90 MB,
bge-base ONNX), then works fully
offline. No API key, ever.
Verify:
which midas-mcp || where midas-mcp # the server command is installed
python -c "import midas; print('Midas', midas.__version__, 'OK')"
python quickstart.py # tiny end-to-end demo: remember → recall
Connect it to your coding agent
Midas is a standard MCP server. Every MCP client launches the same command — midas-mcp — and
passes a few environment variables. The only thing that differs between tools is where you put the
config. Use this block everywhere (swap in your real home path):
{
"mcpServers": {
"midas": {
"command": "midas-mcp",
"env": {
"MIDAS_MCP_EMBEDDER": "local",
"MIDAS_MCP_DB": "/home/you/.midas/memory.sqlite3",
"MIDAS_MCP_MAX_RECORDS": "50000",
"MIDAS_MCP_MIN_IMPORTANCE": "2"
}
}
}
}
⚠️ The #1 gotcha: GUI apps don't share your terminal's PATH, so they may not find midas-mcp.
If a client says "command not found", replace "command": "midas-mcp" with the absolute path
from which midas-mcp (macOS/Linux) or where midas-mcp (Windows, e.g. "C:/Users/you/.local/bin/midas-mcp.exe"; use forward slashes or \\ in JSON). On Windows, write the
DB path with forward slashes too: C:/Users/you/.midas/memory.sqlite3.
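One way to avoid hand-editing paths is to let Python resolve the command and print the block for you. This is a minimal sketch, assuming you run it in the same terminal where midas-mcp already works; the helper names `to_json_path` and `midas_server_entry` are hypothetical, and the env values are simply the ones from the block above.

```python
import json
import shutil
from pathlib import Path


def to_json_path(path) -> str:
    """Render a filesystem path with forward slashes so it needs no escaping in JSON."""
    return str(path).replace("\\", "/")


def midas_server_entry(command, db_path) -> dict:
    """Build the "midas" server entry using the env values shown in the config block."""
    return {
        "command": to_json_path(command),
        "env": {
            "MIDAS_MCP_EMBEDDER": "local",
            "MIDAS_MCP_DB": to_json_path(db_path),
            "MIDAS_MCP_MAX_RECORDS": "50000",
            "MIDAS_MCP_MIN_IMPORTANCE": "2",
        },
    }


# shutil.which searches PATH the way `which` / `where` do and returns None if nothing is found.
resolved = shutil.which("midas-mcp")
if resolved is None:
    print("midas-mcp is not on this PATH; falling back to the bare command name")

entry = midas_server_entry(resolved or "midas-mcp", Path.home() / ".midas" / "memory.sqlite3")
print(json.dumps({"mcpServers": {"midas": entry}}, indent=2))
```

The printed JSON can be pasted into any client that accepts the block above; because the command is already absolute, a GUI app's shorter PATH no longer matters.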
Claude Code
Use the CLI (no file editing) — this is the exact command, verified:
claude mcp add midas -s user \
-e MIDAS_MCP_EMBEDDER=local \
-e MIDAS_MCP_DB="$HOME/.midas/memory.sqlite3" \
-e MIDAS_MCP_MAX_RECORDS=50000 \
-e MIDAS_MCP_MIN_IMPORTANCE=2 \
-- midas-mcp
claude mcp list # → midas: midas-mcp - ✓ Connected
-s user = available in all your projects · -s project = writes a shareable .mcp.json in the
repo · -s local = just you, this project. Remove with claude mcp remove midas -s user.
Cursor
Edit ~/.cursor/mcp.json (all projects) or .cursor/mcp.json (this project) and paste the JSON
block above. Then Cursor → Settings → MCP should show midas. Restart Cursor after changing env.
Claude Desktop
Settings → Developer → Edit Config opens the file (or edit it directly):
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
Paste the JSON block, save, and restart Claude Desktop.
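If the config file already lists other MCP servers, pasting the block over it would drop them. Below is a minimal sketch of merging the midas entry into an existing file instead. It assumes the client keeps its servers under a top-level `mcpServers` key, as in the JSON block above; the helper names are hypothetical, and backing up the file first is a sensible precaution.

```python
import json
from pathlib import Path


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry stored under mcpServers[name]; other servers are kept."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Read a JSON config (or start empty), merge one server entry, and write it back."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_server(config, name, entry), indent=2) + "\n", encoding="utf-8")


existing = {"mcpServers": {"other-tool": {"command": "other-tool-mcp"}}}
midas_entry = {"command": "midas-mcp", "env": {"MIDAS_MCP_EMBEDDER": "local"}}
print(json.dumps(merge_server(existing, "midas", midas_entry), indent=2))
```

`update_config_file` is defined but not called here, so running the sketch does not touch any real config; point it at the path for your client when you are ready.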
Codex CLI
Codex uses TOML, not JSON. Either run codex mcp add midas -- midas-mcp, or add this to ~/.codex/config.toml:
[mcp_servers.midas]
command = "midas-mcp"
args = []
env = { MIDAS_MCP_EMBEDDER = "local", MIDAS_MCP_DB = "/home/you/.midas/memory.sqlite3", MIDAS_MCP_MAX_RECORDS = "50000", MIDAS_MCP_MIN_IMPORTANCE = "2" }
Start a session and run /mcp to confirm it's connected.
Windsurf
Edit the config (Cascade → MCP icon → Configure opens it), paste the JSON block, refresh:
| OS | Path |
|---|---|
| macOS / Linux | ~/.codeium/windsurf/mcp_config.json |
| Windows | %USERPROFILE%\.codeium\windsurf\mcp_config.json |
Anything else (VS Code, Cline, Zed, OpenAI Agents SDK…)
Same pattern: point it at command midas-mcp with those env vars (JSON clients reuse the block above).
What happens once it's connected
On connect, Midas injects a short memory policy into the agent (via the MCP instructions): recall
relevant memory first, then capture durable facts / decisions / preferences / constraints /
corrections as they come up. Every captured memory is tagged with provenance: planning, action, observation, or user_confirmation. The agent captures freely; Midas decides
what's actually kept: it scores importance (no LLM), drops trivia below MIDAS_MCP_MIN_IMPORTANCE
and skips duplicates, keeps memory current via typed belief revision, and keeps memory bounded via MIDAS_MCP_MAX_RECORDS (forgetting low-value items, protecting durable facts). Restart the client (or
run /mcp) after editing config so it picks up the server.
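The "agent captures freely, Midas decides what is kept" split can be sketched as a small gate in front of the store. This is an illustration under stated assumptions, not the server's code: `ToyStore` is hypothetical, word-overlap (Jaccard) similarity stands in for embedding similarity, and the 0.95 duplicate threshold is borrowed from the consolidate example elsewhere in this README rather than from the capture path.

```python
import re
from dataclasses import dataclass, field


def token_set(text: str) -> set[str]:
    """Lowercased alphanumeric words of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in for cosine similarity of embeddings."""
    ta, tb = token_set(a), token_set(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


@dataclass
class ToyStore:
    min_importance: int = 2            # mirrors the role of MIDAS_MCP_MIN_IMPORTANCE
    duplicate_threshold: float = 0.95  # assumed value for this sketch
    records: list[str] = field(default_factory=list)

    def capture(self, text: str, importance: int) -> str:
        """Store the turn only if it clears the floor and is not a near-duplicate."""
        if importance < self.min_importance:
            return "skipped: below importance floor"
        if any(jaccard(text, kept) >= self.duplicate_threshold for kept in self.records):
            return "skipped: duplicate"
        self.records.append(text)
        return "stored"


store = ToyStore()
print(store.capture("lol ok cool", importance=1))
print(store.capture("Use Postgres for the primary database.", importance=4))
print(store.capture("use postgres for the primary database", importance=4))
```

The caller never has to decide what is worth keeping; it forwards every turn and reads back the verdict.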
Guard boundary: memory can guide planning, but it cannot by itself authorize external or destructive
actions. Before relying on memory to act outside the chat, call check_memory_use with intended_use="external_action" or "destructive_action". Those actions require user_confirmation provenance; otherwise the agent must ask the user to confirm in the current turn.
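The rule above reduces to a small decision table. Here is a minimal sketch of that table as a pure function; `toy_check_memory_use` is a hypothetical stand-in for illustration and does not reproduce the real tool's signature or return type. The provenance names and the two gated uses come from the text above, while `"planning"` as an `intended_use` value is an assumption.

```python
PROVENANCES = {"planning", "action", "observation", "user_confirmation"}
GATED_USES = {"external_action", "destructive_action"}


def toy_check_memory_use(provenance: str, intended_use: str) -> tuple[bool, str]:
    """Decide whether a memory with this provenance may back the intended use."""
    if provenance not in PROVENANCES:
        raise ValueError(f"unknown provenance: {provenance!r}")
    if intended_use in GATED_USES and provenance != "user_confirmation":
        return False, "ask the user to confirm in the current turn"
    return True, "allowed"


for prov, use in [
    ("observation", "planning"),
    ("observation", "external_action"),
    ("user_confirmation", "destructive_action"),
]:
    print(prov, use, toy_check_memory_use(prov, use))
```

Keeping the check separate from recall means an agent can read any memory while planning, yet still has to stop and ask before memory alone triggers something outside the chat.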
Tools it exposes: remember, capture (policy-gated auto-store), recall (source-traceable), build_context (budgeted prompt block), check_memory_use (Guard provenance boundary), memory_policy (exact injected policy text), maintain (dedup + forgetting, returns a deletion
audit), stats (counts + provenance + short/medium/long tiers), forget / forget_all. Env knobs: MIDAS_MCP_DB (persist to a SQLite file), MIDAS_MCP_EMBEDDER (local or hashing), MIDAS_MCP_MAX_RECORDS, MIDAS_MCP_MIN_IMPORTANCE, MIDAS_MCP_SUPERSEDE=0 to disable typed belief
revision, MIDAS_MCP_SUPERSEDE_CONVO=1 to allow strict-cue chat revision, MIDAS_MCP_NLI=1 to gate
revision with the local NLI model.
Use it from Python (the SDK)
from midas import Memory, LocalEmbedder, ContentImportance
# Real semantic memory, fully local. (Or just `Memory()` for a zero-setup offline hashing embedder.)
mem = Memory(embedder=LocalEmbedder(), importance_scorer=ContentImportance())
mem.remember("Decision: the primary database is PostgreSQL.", kind="constraint", importance=5)
mem.remember("The launch date moved to September 14.", kind="fact", importance=5)
mem.remember("haha yeah sounds good") # filler — auto-scored low-importance, first to be forgotten
# Budgeted, prompt-ready context — highest-value first, dated, source-traceable:
print(mem.assemble("When do we launch?", token_budget=128))
# Or structured, ranked hits, each traceable to its source:
for hit in mem.recall("which database did we pick?", limit=3):
    print(f"{hit.score:.2f} {hit.record.content}")
# Auto-capture: forward a turn; Midas keeps it only if it clears the relevance policy (no LLM).
mem.capture("My deploy key expires on 2027-03-01.", kind="fact") # -> stored
mem.capture("lol ok cool") # -> skipped (below the floor)
# Provenance guard: observed memory is fine for planning, but not enough to deploy.
mem.remember("Deploy target is staging.", kind="constraint", provenance="observation")
decision = mem.guard_reliance("deploy target", intended_use="external_action")
assert not decision.allowed # ask the user to confirm before acting
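The "budgeted, prompt-ready context" idea in the example above can be pictured as greedy packing: take notes in descending value and keep each one that still fits. This is a minimal sketch under stated assumptions, not how `assemble` is implemented: `pack_context` and `Note` are hypothetical, the dates are made up, and tokens are estimated by whitespace splitting where a real tokenizer would count differently.

```python
from typing import NamedTuple


class Note(NamedTuple):
    value: float  # higher means more worth including
    date: str     # when the note was recorded
    text: str


def estimate_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(notes: list[Note], token_budget: int) -> str:
    """Greedily add dated notes, highest value first, skipping any that would exceed the budget."""
    lines, used = [], 0
    for note in sorted(notes, key=lambda n: n.value, reverse=True):
        line = f"[{note.date}] {note.text}"
        cost = estimate_tokens(line)
        if used + cost <= token_budget:
            lines.append(line)
            used += cost
    return "\n".join(lines)


notes = [
    Note(0.8, "2025-08-20", "The primary database is PostgreSQL."),
    Note(0.9, "2025-09-01", "The launch date moved to September 14."),
    Note(0.1, "2025-09-02", "haha yeah sounds good"),
]
print(pack_context(notes, token_budget=14))
```

A fixed budget is what keeps the prompt size flat no matter how many notes the store holds.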
Staying current and bounded — the long-horizon core
A multi-day agent's memory must stay current (no stale beliefs) and bounded (can't grow forever):
from midas.nli import LocalNLI
# Belief revision — a turn that CONTRADICTS an old belief supersedes it (local NLI, not keywords):
mem = Memory(embedder=LocalEmbedder(), supersede=True, supersede_conversational=True, nli=LocalNLI())
mem.forget_decayed(max_records=50_000) # evict lowest value (importance × recency); protects facts
mem.consolidate(similarity_threshold=0.95) # collapse near-duplicate restatements (keeps provenance)
mem.tier(record) # 'short' (≤1d) | 'medium' (≤1w) | 'long'
Forgetting returns the removed ids as a deletion audit trail and never drops the durable tier
(facts/preferences/constraints, high importance). Durable storage: Memory(store=SQLiteStore("memory.db"), embedder=LocalEmbedder()), a local file with no native extension.
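A rough picture of bounded forgetting, as a minimal sketch rather than the library's implementation: value is importance times a recency factor, the lowest-value notes are evicted until the cap is met, and durable notes are never candidates. The exponential half-life, the "importance of 4 or more" cutoff for durability, the age-based reading of the tiers, and all names below are assumptions for illustration.

```python
from dataclasses import dataclass

DAY = 86_400.0  # seconds
DURABLE_KINDS = {"fact", "preference", "constraint"}


@dataclass(frozen=True)
class Note:
    id: str
    kind: str
    importance: int
    age_seconds: float


def tier(note: Note) -> str:
    """Age bucket: 'short' up to one day, 'medium' up to one week, otherwise 'long'."""
    if note.age_seconds <= DAY:
        return "short"
    if note.age_seconds <= 7 * DAY:
        return "medium"
    return "long"


def value(note: Note, half_life_days: float = 7.0) -> float:
    """Importance times recency, where recency halves every half_life_days (assumed decay)."""
    return note.importance * 0.5 ** (note.age_seconds / (half_life_days * DAY))


def is_durable(note: Note) -> bool:
    """Durable notes are protected from eviction (assumed cutoff: importance >= 4)."""
    return note.kind in DURABLE_KINDS and note.importance >= 4


def toy_forget_decayed(notes: list[Note], max_records: int) -> tuple[list[Note], list[str]]:
    """Evict the lowest-value non-durable notes; return (kept notes, removed ids in eviction order)."""
    excess = len(notes) - max_records
    if excess <= 0:
        return list(notes), []
    candidates = sorted((n for n in notes if not is_durable(n)), key=value)
    removed_ids = [n.id for n in candidates[:excess]]
    kept = [n for n in notes if n.id not in set(removed_ids)]
    return kept, removed_ids


notes = [
    Note("a", "fact", 5, 30 * DAY),
    Note("b", "chat", 1, 10 * DAY),
    Note("c", "chat", 2, 3600.0),
    Note("d", "chat", 3, 20 * DAY),
]
kept, removed = toy_forget_decayed(notes, max_records=2)
print([n.id for n in kept], removed)
```

Note that the store can stay above `max_records` if everything left is durable; in this sketch protection wins over the cap, matching the "never drops the durable tier" statement above.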
Use with LangGraph
Back LangGraph's long-term memory with Midas (pip install ".[langgraph]"):
from midas.integrations.langgraph_store import MidasStore
store = MidasStore() # offline by default; pass Memory(embedder=LocalEmbedder(), ...) for semantic
store.put(("user", "123"), "pref", {"text": "prefers dark mode and concise answers"})
hits = store.search(("user", "123"), query="ui preferences")
Benchmarks
Midas leads on the reader-independent axes that isolate a memory layer's quality (full methodology +
reproduce commands in BENCHMARKS.md; anti-cheating checklist, failure cases, and
verbatim MCP policy in docs/methodology.md):
| | baseline (recency window) | Midas |
|---|---|---|
| Retrieval: LongMemEval-s recall@k (evidence buried among distractors, n=40) | 0.03 | 0.95 |
| Retrieval: LoCoMo recall@k (5 conversations, n=50) | 0.02 | 0.85 |
| Answer: LongMemEval-s correctness (reader = gpt-4.1-mini, n=40) | 0.05 | 0.82 |
| Ingest cost | n/a | 0 LLM calls · $0 API · 0 data egress |
We lead with retrieval and cost (deterministic, reader-independent) because end-to-end correctness on
these benchmarks is dominated by the reader LLM, not the memory layer. Head-to-head, same reader:
with gpt-4o, Midas scores 0.84 on LongMemEval-s — matching the LLM-ingest SOTA (Observational
Memory) while doing no LLM at ingest — and on a ~500-session haystack (~4,944 turns) it assembles a
bounded ~480-token context (recall@k 0.78), where keep-every-observation-in-context designs do not fit
by construction. (Same-reader, within-harness comparison — not a leaderboard rank; see BENCHMARKS.md.)
The eval harness
eval/ (dev-only) runs Midas and competitors through LoCoMo / LongMemEval / multiday /
conflicts-v1 with deterministic recall@k and precision@k, cost/latency instrumentation, an
optional local-or-hosted LLM judge, a deterministic dumb-reader ablation (--dumb-reader — proves
the numbers aren't reader-inflated), an adversarial conflicts benchmark (near-duplicates +
temporal conflicts), and a retention/forgetting measure with per-question success/failure traces:
python -m eval.runner --dataset longmemeval --variant s --local --midas-no-rerank --max-questions 40
python -m eval.runner --dataset longmemeval --variant s --local --dumb-reader --max-questions 40
python -m eval.runner --dataset multiday --dumb-reader # ctx_stale on leaderboard
python -m eval.runner --dataset conflicts --dumb-reader --midas-supersede
python -m eval.multiday --dataset conflicts --context-only --ab-supersede --midas-only
python -m eval.retention --dataset multiday --trace
python -m eval.retention --dataset multiday --trace --value-rank-only # forgetting failure mode
How the eval avoids the usual memory-stack cheats (no query rewriting, no LLM at ingest, no gold
leakage, seeded sampling), how conflicting memories are handled, and the exact MCP-injected policy
text — with real failure cases — are documented in docs/methodology.md.
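For readers new to the retrieval metrics named above, here is a minimal sketch of one common definition of recall@k and precision@k. The harness may define them differently (for example, counting a question as a hit if any gold turn is retrieved), so treat the formulas, function names, and turn ids below as assumptions and check docs/methodology.md for the definitions actually used.

```python
def recall_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """Fraction of gold evidence ids that appear in the top-k retrieved ids."""
    if not gold:
        raise ValueError("gold must contain at least one id")
    return len(set(retrieved[:k]) & gold) / len(gold)


def precision_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """Fraction of the top-k slots filled by gold evidence ids."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in retrieved[:k] if item in gold) / k


retrieved = ["t7", "t2", "t9", "t4"]  # hypothetical turn ids, best match first
gold = {"t2", "t4"}                   # turns that actually contain the answer
print(recall_at_k(retrieved, gold, k=3), precision_at_k(retrieved, gold, k=3))
```

Both functions depend only on ids and ranks, so the same inputs always give the same score, which is what makes these numbers independent of any reader LLM.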
Design concept
docs/long-horizon-memory.md — the north-star: the 4 C's
(Complete · Clean · Current · Calibrated), why multi-day accuracy is a belief-management problem, and
the honest, measured state of each piece (including the open frontiers).
docs/methodology.md — how the eval avoids the usual memory-stack cheats,
the dumb-reader ablation, conflicts-v1 stress tests, forgetting failure traces, supersession mechanics,
and the exact MCP-injected policy text (for external review / Reddit-style scrutiny).
Layout
midas/ # the SDK (importable; zero core dependencies)
  memory.py # Memory: remember / capture / recall / build_context · forget_decayed · consolidate · tier
  guard.py # Guard + Armorer: provenance tags · check_memory_use policy boundary
  importance.py # ContentImportance: no-LLM per-turn salience · policy.py: MemoryPolicy + auto-memory prompt
  nli.py # LocalNLI: local entailment/contradiction (belief revision + abstention)
  embeddings.py # Hashing / Local (bge) / OpenAI · DiskCachedEmbedder · LocalReranker
  store.py · sqlite_store.py · ann.py # in-memory cosine · persistent SQLite · IVF index
  mcp_server.py # the MCP server
eval/ # dev-only benchmark harness (datasets · adapters · metrics · runner · multiday · retention)
docs/ # long-horizon-memory.md (design) · methodology.md (eval anti-cheating) · research-notes.md
Privacy
Midas is local-first: every memory lives in a SQLite file on your own machine, recall returns the
exact stored text, and capture/recall/forget make no network calls — your memories never leave
your computer. The developer collects no data; there is no account, API key, or telemetry. The only
outbound traffic is infrastructure (a one-time embedding-model download for the local backend, and
package install from PyPI), never your data. Full details: PRIVACY.md.
License
MIT.