Waggle-mcp
Your AI forgets everything between sessions. Waggle gives it a graph-backed brain.
Persistent, structured memory for AI agents. It typically uses fewer tokens than chunk-based retrieval, often 2-4× fewer on factual lookups.
Waggle is not a code indexer. It's a conversational memory engine — it remembers what you decided, why, and what changed, across every session.
What's New — v0.1.7
- Benchmark harness: end-to-end `WaggleAdapter` connecting the graph engine to ConvoMem / MemBench runners, with automated exact-match scoring and latency logging.
- LongMemEval integration: CLI-driven retrieval evaluation against the official LongMemEval split (97.4% R@5 / 88.2% Exact@5 in `graph_raw`, 96.4% / 85.6% in `graph_hybrid`).
- Logging utilities: structured log helpers (`logging_utils`) for consistent, level-aware output across all subsystems.
- Evidence tracking: new `evidence.py` module records source provenance on stored nodes so reasoning chains are fully traceable.
- Observability stack: Grafana dashboard, Prometheus config, and Docker Compose overlay in `deploy/observability/`.
- Kubernetes manifests: production-grade `deployment.yaml`, network policy, external-secret, and certificate templates under `deploy/kubernetes/`.
- Operational runbooks: incident response, secret management, API-key rotation, and onboarding guides added to `docs/runbooks/`.
Who is this for?
→ Individual developer extending Claude, Codex, Cursor, or Antigravity with persistent memory:
Use Python 3.11+ and install via pipx (no venv activation needed): `brew install pipx && pipx ensurepath && pipx install waggle-mcp && waggle-mcp init`.
SQLite + local embeddings, zero infra.
→ Team running a shared memory service: Waggle ships with a Docker image, Kubernetes manifests, Prometheus metrics, and multi-tenant auth. See deploy/kubernetes/ and docs/runbooks/.
Both paths share the same MCP tool surface — the difference is only the backend and transport.
Why waggle-mcp?
waggle-mcp is a local-first memory layer for MCP-compatible AI clients, built on a persistent knowledge graph.
| Stuffed context | Structured retrieval |
|---|---|
| Huge prompts every session | Compact subgraph retrieved at query time |
| Session-local memory | Persistent multi-session memory |
| Flat notes and chunks | Typed nodes and edges: decisions, reasons, contradictions |
| "What changed?" requires replaying logs | Temporal queries and diffs are first-class |
Waggle often uses materially fewer tokens than naive chunked retrieval on factual lookups, while graph-traversal queries intentionally spend more context to include reasoning chains such as updates, contradictions, and dependencies.
Architecture
flowchart LR
C["MCP Client\n(Claude/Cursor/Codex/Antigravity)"] --> S["waggle.server\nMCP tool surface"]
S --> G["Graph Engine\nMemoryGraph / Neo4jMemoryGraph"]
G --> DB["SQLite (local default)\nor Neo4j (service mode)"]
G --> E["Embeddings\n(sentence-transformers or deterministic fallback)"]
Quick start (Recommended)
The simplest way to use Waggle is via pipx. This installs the package in an isolated environment and makes the waggle-mcp command available globally without needing to manage a virtual environment (.venv) manually.
# 1. Install waggle globally
pipx install waggle-mcp
# 2. Run the interactive setup
waggle-mcp init
(If you don't have pipx, install it via brew install pipx && pipx ensurepath.)
Running init will detect your MCP client (Codex, Claude, Cursor, or Antigravity), write the necessary configuration, and initialize your local database. Restart your client, and you're ready to go.
Manual MCP setup examples for Codex, Claude Code, Cursor, and Antigravity are in docs/reference.md.
Comprehensive live feature run (full tool surface, multi-query graph tests, export/import validation): tests/artifacts/test-run/comprehensive_feature_demo.md
⚠️ Edges are what make graph memory work.
`observe_conversation` and `decompose_and_store` create edges automatically.
If you only call `store_node`, you get isolated facts — not a connected graph.
Always prefer `observe_conversation` for conversational ingestion.
Setting Up waggle as an MCP Server
One-time install:
`pipx install waggle-mcp` (requires Python 3.11+; recommended on macOS/Homebrew Python) — no API key, no cloud account, no Docker required for local use.
Use this shared JSON config shape for clients that accept mcpServers JSON (recommended when installed via pipx):
{
"mcpServers": {
"waggle": {
"command": "waggle-mcp",
"args": ["serve"],
"env": {
"WAGGLE_TRANSPORT": "stdio",
"WAGGLE_BACKEND": "sqlite",
"WAGGLE_DB_PATH": "~/.waggle/memory.db",
"WAGGLE_DEFAULT_TENANT_ID": "local-default",
"WAGGLE_MODEL": "all-MiniLM-L6-v2"
}
}
}
}
Claude Desktop / Antigravity / Cursor / Claude Code setup details
Claude Desktop config file location
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Antigravity
- Open agent panel → `···` → Manage MCP Servers → View raw config
- Paste the same JSON block above.
Cursor
- Cursor Settings → Features → MCP Servers → + Add
- Command: `waggle-mcp`
- Args: `serve`
- Env vars: same keys as the JSON block above.
Claude Code
claude mcp add waggle \
--env WAGGLE_TRANSPORT=stdio \
--env WAGGLE_BACKEND=sqlite \
--env WAGGLE_DB_PATH=~/.waggle/memory.db \
--env WAGGLE_DEFAULT_TENANT_ID=local-default \
--env WAGGLE_MODEL=all-MiniLM-L6-v2 \
-- waggle-mcp serve
Codex
Add to ~/.codex/config.toml:
[mcp_servers.waggle]
command = "waggle-mcp"
args = ["serve"]
[mcp_servers.waggle.env]
WAGGLE_TRANSPORT = "stdio"
WAGGLE_BACKEND = "sqlite"
WAGGLE_DB_PATH = "~/.waggle/memory.db"
WAGGLE_DEFAULT_TENANT_ID = "local-default"
WAGGLE_MODEL = "all-MiniLM-L6-v2"
waggle-mcp not on PATH?
If you installed with pipx, ensure its bin path is available:
pipx ensurepath
Then restart your terminal/client. If you're using a venv-based install, use the venv interpreter path instead of waggle-mcp:
which python3 # macOS / Linux
where python # Windows
e.g. /usr/local/bin/python3 or C:\Python311\python.exe.
Verify it works
After restarting your client, ask the agent:
"Store a note: we're using PostgreSQL for this project."
Then open a fresh session and ask:
"What database are we using?"
Expected result (example):
You're using PostgreSQL for this project.
If you see that kind of recall in a new session, you're live.
Quick-reference tool table
| Ask the agent… | Tool called |
|---|---|
| "Remember that…" | observe_conversation |
| "What do you know about X?" | query_graph |
| "What changed recently?" | graph_diff |
| "Summarize context for a new session" | prime_context |
| "Show all stored topics" | get_topics |
| "Export my memory to a file" | export_graph_backup |
For the full tool surface and environment variable reference see docs/reference.md.
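Under the hood, MCP clients issue these tools as JSON-RPC 2.0 `tools/call` requests over the stdio transport. A minimal sketch of what such a request might look like on the wire — the `query` argument name is a hypothetical placeholder; consult docs/reference.md for each tool's actual parameter schema:

```python
import json

def make_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as an MCP client would.

    Argument names are illustrative, not Waggle's actual schema.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)

# Example: what a "What do you know about X?" query might look like.
payload = make_tool_call("query_graph", {"query": "database decision"})
print(payload)
```

In practice your client constructs these requests for you; the sketch is only to show what "the agent calls `query_graph`" means at the protocol level.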
CLI Command Reference
Waggle includes a built-in CLI for setup, maintenance, and learning the memory system.
| Command | Description |
|---|---|
| `waggle-mcp --help` | Show all available commands, options, and usage examples. |
| `waggle-mcp features` | Recommended — explain the main tools, graph workflows, and how connected context reaches the model. |
| `waggle-mcp init` | Interactive setup wizard to configure Codex, Claude, Cursor, or Antigravity. |
| `waggle-mcp serve` | Run the MCP server (usually started automatically by your client). |
| `waggle-mcp export-context-bundle` | Export a portable Markdown/JSON context pack for another AI. |
| `waggle-mcp export-markdown-vault` | Export your memory graph as an Obsidian-style vault. |
For advanced commands (tenant management, API keys, Neo4j migration), see the full help output:
waggle-mcp --help
Cross-Client Handoffs & Migration
Waggle is designed to be a "portable brain" for your AI sessions. Whether you are switching editors (e.g., Antigravity to Codex) or moving across machines, your memory can follow you.
1. Automatic Sharing (Same Machine)
If you run multiple MCP clients (like Codex and Antigravity) on the same machine, they can share a single "brain" automatically.
- How: ensure both clients use the same `WAGGLE_DB_PATH` in their environment configuration (default is `~/.waggle/memory.db`).
- Result: a decision made in one editor is immediately known by the agent in the other.
2. Session Handoffs (Context Bundles)
If you hit a session limit or want to jump to a fresh context while keeping important facts:
# Export a condensed, AI-ready summary of your current project context
waggle-mcp export-context-bundle --format markdown --output-path ./handoff.md
Paste the contents of handoff.md into your new session to "re-prime" the AI with your project's history.
3. Full Memory Migration (Backup/Import)
To move your entire memory history to a new machine:
- Export: `waggle-mcp export-graph-backup --output-path my_memory.json`
- Import: `waggle-mcp import-graph-backup --input-path my_memory.json`
Using It In MCP Clients
Once installed, you usually do not run waggle-mcp commands by hand during daily work. Talk to the agent normally, and it calls Waggle MCP tools to store and retrieve memory.
- Codex / Claude Code: `observe_conversation`, `query_graph`, and `prime_context` are called automatically during normal threads.
- Cursor: decisions and facts can be persisted as graph memory instead of getting lost in old chat windows.
- Antigravity: conversation turns can be extracted via `observe_conversation`; context can be exported with `export_context_bundle`.
See it in action
How It Works (Interaction Flow)
User -> Agent -> observe_conversation(...) -> Graph stores typed nodes + edges
User -> Agent -> query_graph("database") -> Subgraph returned -> Agent answers with linked rationale
Session 1 — April 10
User: Let's use PostgreSQL. MySQL replication has been painful.
Agent: [calls observe_conversation()]
→ stores decision node: "Chose PostgreSQL over MySQL"
→ stores reason node: "MySQL replication painful"
→ links them with a depends_on edge
Session 2 — April 12 (fresh context window, no history)
User: What did we decide about the database?
Agent: [calls query_graph("database decision")]
→ retrieves the decision node + linked reason from April 10
"You decided on PostgreSQL on April 10. The reason recorded was
that MySQL replication had been painful."
Session 3 — April 14
User: Actually, let's reconsider — the team is more familiar with MySQL.
Agent: [calls store_node() + store_edge(new_node → old_node, "contradicts")]
→ both positions are preserved, and the contradiction is explicit
Knowledge graph visual (example)
graph TD
D1["Decision: Use PostgreSQL"]
R1["Reason: MySQL replication pain"]
D2["Decision update: reconsider MySQL"]
P1["Preference: dark mode UI"]
N1["Note: add integration tests"]
D1 -- "depends_on" --> R1
D2 -- "contradicts" --> D1
N1 -- "relates_to" --> D2
P1 -- "part_of project context" --> D1
Key Features
- Automatic Extraction: `observe_conversation` ingests facts into the graph without manual schema work.
- Portable Context: `export_context_bundle` generates Markdown/JSON context packs for another AI.
- Vault Round-trip: `export_markdown_vault` / `import_markdown_vault` for Obsidian-style node editing.
- Conflict Resolution: `list_conflicts` / `resolve_conflict` to manage contradictions without losing history.
- Deterministic Fallback: stable SHA-256 hashing for reliable, reproducible offline operation when transformer models are unavailable.
Security & Privacy
By default, data stays local on your machine (sqlite backend, local database path such as ~/.waggle/memory.db).
Waggle does not require telemetry or cloud calls for core local operation.
Your conversation memory only leaves your machine if you explicitly configure a remote backend or remote infrastructure.
Graph Data Model
Node Types
fact, entity, concept, preference, decision, question, note
Edge Types
relates_to, contradicts, depends_on, part_of, updates, derived_from, similar_to
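The vocabulary above can be modeled as a small typed store. A minimal sketch of that idea — the class and method names are illustrative, not Waggle's actual internals:

```python
from dataclasses import dataclass, field

# Node and edge vocabularies from the data model above.
NODE_TYPES = {"fact", "entity", "concept", "preference", "decision", "question", "note"}
EDGE_TYPES = {"relates_to", "contradicts", "depends_on", "part_of",
              "updates", "derived_from", "similar_to"}

@dataclass
class TypedGraph:
    """Illustrative in-memory store that enforces the typed vocabulary."""
    nodes: dict = field(default_factory=dict)   # id -> (type, text)
    edges: list = field(default_factory=list)   # (src, edge_type, dst)

    def add_node(self, node_id: str, node_type: str, text: str) -> None:
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = (node_type, text)

    def add_edge(self, src: str, edge_type: str, dst: str) -> None:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges.append((src, edge_type, dst))

# The database-decision example from earlier, expressed against this sketch
# (the supporting reason is stored as a "note" node here).
g = TypedGraph()
g.add_node("d1", "decision", "Chose PostgreSQL over MySQL")
g.add_node("r1", "note", "MySQL replication painful")
g.add_edge("d1", "depends_on", "r1")
```

Typed edges are what make traversal queries like "why was this decided?" answerable without replaying logs.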
Model Support
Waggle currently uses a local sentence-transformers embedding model selected by WAGGLE_MODEL.
- Default: `all-MiniLM-L6-v2`
- Any locally available `sentence-transformers` model name can be used.
- If the selected model is unavailable locally, Waggle falls back to deterministic embeddings for portability.
Set model in env:
WAGGLE_MODEL=all-mpnet-base-v2 waggle-mcp serve
Set model in MCP client config (example):
{
"mcpServers": {
"waggle": {
"command": "python",
"args": ["-m", "waggle.server"],
"env": {
"WAGGLE_MODEL": "all-mpnet-base-v2"
}
}
}
}
Notes:
- Waggle does not currently route to hosted embedding providers directly; embedding inference is local to the runtime.
- Deterministic mode is useful for offline/testing portability, but semantic retrieval quality is lower than transformer mode.
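To make the trade-off concrete, here is one way a deterministic, hash-based fallback embedding can work. This is a sketch under stated assumptions, not Waggle's actual implementation: each token is hashed with SHA-256, the digest picks a bucket and a sign, and the result is L2-normalized so cosine similarity behaves sensibly.

```python
import hashlib
import math

def deterministic_embedding(text: str, dim: int = 32) -> list[float]:
    """Sketch of a hash-based fallback embedding: stable, offline, no model.

    Illustrative only -- Waggle's actual fallback scheme may differ.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim   # which dimension
        sign = 1.0 if digest[4] % 2 == 0 else -1.0          # which direction
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Identical inputs always yield identical vectors -- the point of determinism.
a = deterministic_embedding("postgresql decision")
b = deterministic_embedding("postgresql decision")
```

The scheme captures token overlap but no semantics, which is why retrieval quality is lower than transformer mode.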
Performance Snapshot
| Operation | Time | Notes |
|---|---|---|
| `observe_conversation` | ~1.54 ms (mean) | Single conversation turn ingestion, local sqlite + deterministic embeddings |
| `query_graph` | ~1.60 ms (mean) | Subgraph retrieval (max_nodes=12, max_depth=2) |
| `graph_diff` | ~0.80 ms (mean) | Temporal diff over local graph |
| Context tokens (comparative mean) | 56.3 vs 150.2 | Waggle vs naive RAG baseline (~2.7× fewer tokens) |
Sources: performance_snapshot.md, benchmark_current.md
Example: retrieving a database decision stored days ago uses about 56 tokens from a Waggle subgraph vs about 150 tokens from naive context replay (~2.7× fewer).
Benchmarks & Verification
External Benchmark — LongMemEval
LongMemEval session-retrieval results (500 questions):
| Method | R@5 | Exact@5 | Notes |
|---|---|---|---|
| `graph_raw` | 97.4% | 88.2% | Full split, no second-stage reranking (13/500 misses). Source: results_graph_raw.json |
| `graph_hybrid` | 96.4% | 85.6% | Full split with hybrid reranking (18/500 misses). Source: results_graph_hybrid.json |
Exact@5 is stricter than R@5 and is included here to show precision on support-session retrieval, not just any top-5 hit.
Important: on the current saved artifacts, raw retrieval outperforms hybrid reranking on both R@5 and Exact@5. We are treating this as a tuning target for v0.1.8 rather than changing defaults to a weaker mode.
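The reported R@5 figures follow directly from the miss counts in the table. A quick arithmetic check:

```python
# Sanity-check the reported R@5 rates against the stated miss counts
# over the 500-question LongMemEval split.
total = 500
raw_hits = total - 13        # graph_raw: 13/500 misses
hybrid_hits = total - 18     # graph_hybrid: 18/500 misses

raw_r_at_5 = 100 * raw_hits / total        # 97.4
hybrid_r_at_5 = 100 * hybrid_hits / total  # 96.4
print(raw_r_at_5, hybrid_r_at_5)
```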
Benchmark Policy
README benchmark claims in this repo are limited to Waggle runs with checked-in artifacts and reproducible commands.
Cross-project comparisons are intentionally excluded here unless they are strictly apples-to-apples on split, protocol, and scoring.
For exact setup details and verification snapshots, see tests/artifacts/README.md.
Internal Fixtures
| Area | Corpus | Result |
|---|---|---|
| Extraction | 25-case deterministic fixture | 100.0% (source: benchmark_current.md) |
| Retrieval | 18-query retrieval fixture | 83.3% Hit@k (source: benchmark_current.md) |
| Query stress | 40 adversarial retrieval-only cases | 97.5% Hit@k, 97.5% exact support (source: benchmark_current.md) |
| Deduplication | 22 cases (semi-semantic) | 0 false merges at threshold; 77.3% overall (source: benchmark_current.md) |
| Automated tests | Infrastructure & logic | 91 passed (source: pytest_test_benchmark_harness.txt) |
Deduplication note: Zero false-positive merges is the safety invariant. The 77.3% overall accuracy is intentionally conservative — the system prefers a missed merge over a wrong merge. Improving recall without introducing false positives is the active work for 0.1.8.
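The "prefer a missed merge over a wrong merge" policy amounts to merging only when similarity clears a high bar. A toy sketch of that design choice — the threshold value and the cosine-over-embeddings comparison are illustrative assumptions, not Waggle's actual settings:

```python
def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def should_merge(vec_a: list[float], vec_b: list[float],
                 threshold: float = 0.92) -> bool:
    """Conservative dedup policy: merge only above a high similarity bar.

    A high threshold trades recall (some true duplicates stay separate)
    for a zero-false-merge invariant. The 0.92 value is illustrative.
    """
    return cosine(vec_a, vec_b) >= threshold

# Near-identical vectors merge; merely related ones stay separate.
near_duplicate = should_merge([1.0, 0.0, 0.1], [1.0, 0.0, 0.12])
merely_related = should_merge([1.0, 0.0, 0.0], [0.6, 0.8, 0.0])
```

Raising the threshold pushes false merges toward zero at the cost of overall dedup accuracy, which is the trade-off the 77.3% figure reflects.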
Detailed artifacts and methodology: Benchmark Methodology · tests/artifacts/README.md
Known Limitations
- Best on structured recall, weaker on answer synthesis: Waggle is strongest at "retrieve the right facts and relationships" — not at emitting a single benchmark-formatted final answer from memory.
- Edges are load-bearing: `observe_conversation` and `decompose_and_store` create them automatically. Raw `store_node` calls without follow-up edges produce disconnected nodes with no traversal value.
- Graph retrieval trades tokens for reasoning context: factual lookups are often cheaper than chunked RAG; graph-expansion queries intentionally spend more tokens to carry update chains and contradictions.
- Deduplication recall is conservative (77.3%): zero false-positive merges is maintained, but recall will improve in 0.1.8.
For operational details, scaling considerations, tool-level behavior, and the full MCP feature surface, see docs/reference.md.
Contributing
PRs and issues are welcome. See CONTRIBUTING.md.
Reference & Docs
Detailed reference material lives in external documentation:
- docs/reference.md: Environment variables, admin commands, Docker setup, and full tool surface.
- deploy/kubernetes/README.md: Production deployment.
- docs/runbooks/: Operations and troubleshooting.
- tests/artifacts/README.md: Benchmark artifacts and traceability.
License
MIT — see LICENSE.