ALMA - Agent Learning Memory Architecture

PyPI | npm | Python 3.10+ | License: MIT | CI | Documentation | Buy Me a Coffee

Your AI forgets everything. ALMA fixes that.

One memory layer. Every AI. Never start from zero.

pip install alma-memory — 5 minutes to persistent memory. Free forever on SQLite.

Documentation | Setup Guide | PyPI | npm


What Is ALMA?

ALMA is a Python library that gives AI agents permanent, searchable, compounding memory.

Every time you start a new Claude session, a new ChatGPT conversation, or spin up any AI agent — it starts from zero. Your context, your preferences, the solutions it already found, the mistakes it already made — gone. You repeat yourself hundreds of times a year.

ALMA sits between your AI and a database you control. Before every task, it retrieves what the agent learned from past runs. After every task, it stores the outcome. Over time, the system builds a growing knowledge base that makes every future conversation smarter than the last.

BEFORE TASK                        DURING TASK                      AFTER TASK
+----------------------------+     +-------------------------+     +---------------------------+
| Retrieve relevant memories |     | Agent executes with     |     | Learn from outcome        |
| - Past strategies          | --> | injected knowledge      | --> | - Success? New heuristic  |
| - Anti-patterns            |     | from memory             |     | - Failure? Anti-pattern   |
| - Domain knowledge         |     |                         |     | - Always: Knowledge grows |
+----------------------------+     +-------------------------+     +---------------------------+
                         Every conversation makes the next one better.

It's not a service. It's a library. You install it, connect it to your own database (even a local SQLite file), and your AI agents start remembering.


Why Not Just Use Claude's Memory or ChatGPT Memory?

Claude Projects, ChatGPT Memory, and Gemini's context all have built-in memory. Here's what they can't do:

| What You Need | Built-in AI Memory | ALMA |
|---|---|---|
| Memory across different AIs | Locked to one platform. Claude doesn't know what you told ChatGPT. | One memory layer shared across every AI tool you use. |
| Memory that learns from outcomes | Stores conversations, not lessons. Doesn't know what worked vs. failed. | Tracks success/failure per strategy. Recommends what actually works. |
| Anti-pattern tracking | No concept of "what NOT to do." | Explicit anti-patterns with why_bad + better_alternative. |
| Scoped learning | All-or-nothing context window. | Agents only learn within defined domains (can_learn / cannot_learn). |
| Multi-agent knowledge sharing | Each assistant is isolated. | Agents inherit knowledge from senior agents. Teams share across roles. |
| Your data, your infrastructure | Stored on their servers. You can't export it. | Your database, your rules. SQLite, PostgreSQL, Qdrant, Pinecone — you choose. |
| Scoring beyond similarity | "Most recent" or basic relevance. | 4-factor scoring: similarity + recency + success rate + confidence. |
| Memory lifecycle | Grows until you manually prune. | Automatic decay, compression, consolidation, and archival. |
| Workflow context | Stateless between sessions. | Checkpoints, state merging, and scoped retrieval across workflow runs. |

The fundamental difference: Built-in AI memory stores conversations. ALMA stores intelligence — what worked, what failed, what to avoid, and why.

Week 1, basic retrieval works. Week 4, patterns emerge across sessions. Week 12, cross-domain connections surface automatically. Week 52, the system knows your work better than any single conversation ever could.
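To make the 4-factor scoring idea concrete, here is a minimal standalone sketch. It is not ALMA's actual implementation; the weights, the 30-day half-life, and the neutral prior for untried strategies are all illustrative assumptions.

```python
import math
from datetime import datetime, timedelta

def score_memory(similarity, last_used, successes, failures, confidence,
                 now=None, half_life_days=30.0, weights=(0.4, 0.2, 0.2, 0.2)):
    """Blend four signals into one retrieval score (all in 0..1).

    similarity:  cosine similarity from the vector index
    last_used:   datetime of the memory's last access
    successes/failures: outcome counts recorded after past runs
    confidence:  the memory's stored confidence
    """
    now = now or datetime.now()
    age_days = (now - last_used).total_seconds() / 86400
    # Exponential recency decay: halves every `half_life_days`.
    recency = math.exp(-math.log(2) * age_days / half_life_days)
    total = successes + failures
    success_rate = successes / total if total else 0.5  # neutral prior when untried
    w_sim, w_rec, w_suc, w_conf = weights
    return (w_sim * similarity + w_rec * recency
            + w_suc * success_rate + w_conf * confidence)

# A proven, recently used strategy outranks an equally similar stale failure.
fresh_winner = score_memory(0.8, datetime.now(), successes=9, failures=1, confidence=0.9)
stale_loser = score_memory(0.8, datetime.now() - timedelta(days=120),
                           successes=1, failures=9, confidence=0.3)
assert fresh_winner > stale_loser
```

The point of the blend: two memories with identical embedding similarity can still rank very differently once outcome history and freshness are factored in.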


What ALMA Can Do

| Capability | Description |
|---|---|
| 5 Memory Types | Heuristics (strategies), outcomes (results), preferences (user constraints), domain knowledge (facts), anti-patterns (what not to do) |
| 7 Storage Backends | SQLite+FAISS, PostgreSQL+pgvector, Qdrant, Pinecone, Chroma, Azure Cosmos DB, File |
| 4 Graph Backends | Neo4j, Memgraph, Kuzu, In-memory — entity relationship tracking |
| 22 MCP Tools | Native Claude Code / Claude Desktop integration via stdio or HTTP |
| RAG Bridge | Enhance any RAG system (LangChain, LlamaIndex) with memory signals and feedback loops |
| Multi-Factor Scoring | Similarity + recency + success rate + confidence — not just vector distance |
| Multi-Agent Sharing | Hierarchical knowledge sharing with inherit_from + share_with |
| Memory Lifecycle | Decay, compression, consolidation, archival, verified retrieval |
| Workflow Context | Checkpoints, state merging, artifacts, scoped retrieval |
| Event System | Webhooks + in-process callbacks for real-time memory reactions |
| Domain Factory | 6 pre-built schemas: coding, research, sales, support, content, general |
| TypeScript SDK | Full-featured JavaScript/TypeScript client library |
| Scoped Learning | Agents only learn within defined domains — prevents knowledge contamination |
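The scoped-learning gate can be illustrated with a few lines of plain Python. This is a sketch of the concept only, not ALMA's internal logic; the precedence rule (cannot_learn wins, anything unlisted is rejected) is an assumption for illustration.

```python
def allowed_to_learn(agent_config, topic):
    """Gate a learning event against an agent's declared scope.

    Assumed precedence: an explicit cannot_learn entry always blocks;
    otherwise the topic must appear in can_learn.
    """
    if topic in agent_config.get("cannot_learn", []):
        return False
    return topic in agent_config.get("can_learn", [])

# Scope taken from the Configuration example later in this README.
helena = {
    "can_learn": ["testing_strategies", "selector_patterns"],
    "cannot_learn": ["backend_logic"],
}
assert allowed_to_learn(helena, "testing_strategies")
assert not allowed_to_learn(helena, "backend_logic")   # explicitly forbidden
assert not allowed_to_learn(helena, "unlisted_topic")  # not in scope, rejected
```

Rejecting unlisted topics by default is what prevents knowledge contamination: an agent cannot quietly accumulate heuristics outside its declared domain.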

Installation

pip install alma-memory

ALMA needs a database to store memories. The Setup Guide (GUIDE.md) has step-by-step instructions for every backend, from local SQLite (zero infrastructure) to cloud PostgreSQL (free tier available), written for all experience levels.

The simplest option uses SQLite — it runs locally on your machine with zero setup:

pip install alma-memory[local]     # SQLite + FAISS + local embeddings — nothing else to install

For production or cloud-hosted memory:

pip install alma-memory[postgres]  # PostgreSQL + pgvector
pip install alma-memory[qdrant]    # Qdrant vector database
pip install alma-memory[pinecone]  # Pinecone vector database
pip install alma-memory[chroma]    # ChromaDB
pip install alma-memory[azure]     # Azure Cosmos DB
pip install alma-memory[rag]       # RAG integration (hybrid search, reranking)
pip install alma-memory[all]       # Everything

TypeScript/JavaScript:

npm install @rbkunnela/alma-memory



Quick Start

1. Create a config file

# .alma/config.yaml
alma:
  project_id: "my-project"
  storage: sqlite
  embedding_provider: local
  storage_dir: .alma
  db_name: alma.db
  embedding_dim: 384

2. Use ALMA in your code

from alma import ALMA

# Initialize from config
alma = ALMA.from_config(".alma/config.yaml")

# Before task: Get relevant memories
memories = alma.retrieve(
    task="Test the login form validation",
    agent="helena",
    top_k=5
)

# Inject into your AI prompt
prompt = f"""
## Your Task
Test the login form validation

## Knowledge from Past Runs
{memories.to_prompt()}
"""

# After task: Learn from the outcome
alma.learn(
    agent="helena",
    task="Test login form",
    outcome="success",
    strategy_used="Tested empty fields, invalid email, valid submission",
)

# Next time: Helena remembers what worked

3. That's it

Every time Helena runs, she retrieves what worked before and learns from new outcomes. No manual prompt engineering. No copy-pasting from past conversations. The memory compounds automatically.


Five Memory Types

ALMA doesn't just store text. It categorizes knowledge into five types that serve different purposes:

| Type | What It Stores | Example |
|---|---|---|
| Heuristic | Strategies that worked | "For forms with >5 fields, test validation incrementally" |
| Outcome | Task results (success/failure) | "Login test succeeded using JWT token strategy — 340ms" |
| Preference | User constraints | "User prefers verbose test output, dark theme, Python 3.12" |
| Domain Knowledge | Accumulated facts | "Login uses OAuth 2.0 with 24h token expiry" |
| Anti-Pattern | What NOT to do | "Don't use sleep() for async waits — causes flaky tests. Use explicit waits instead." |

Anti-patterns are the feature no other memory system has. When your AI makes a mistake, ALMA records what went wrong, why it's bad, and what to do instead. Next time, it knows to avoid that path before it starts.
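As a standalone sketch of what an anti-pattern record carries (this is not ALMA's API; it only borrows the why_bad and better_alternative field names mentioned above):

```python
from dataclasses import dataclass

@dataclass
class AntiPattern:
    """Illustrative shape of an anti-pattern memory record."""
    pattern: str              # what the agent did
    why_bad: str              # the observed consequence
    better_alternative: str   # what to do instead

    def to_prompt(self) -> str:
        """Render the record for injection into a future prompt."""
        return (f"AVOID: {self.pattern}\n"
                f"  Why: {self.why_bad}\n"
                f"  Instead: {self.better_alternative}")

flaky = AntiPattern(
    pattern="Using sleep() for async waits in tests",
    why_bad="Fixed delays produce flaky, slow test runs",
    better_alternative="Use explicit waits on the condition being tested",
)
print(flaky.to_prompt())
```

Because the record stores the consequence and the alternative, not just the mistake, the rendered prompt tells a future agent both what to avoid and what to reach for instead.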


Multi-Agent Memory Sharing

Agents don't have to learn everything from scratch. Junior agents can inherit knowledge from senior agents:

agents:
  senior_dev:
    can_learn: [architecture, best_practices]
    share_with: [junior_dev, qa_agent]

  junior_dev:
    can_learn: [coding_patterns]
    inherit_from: [senior_dev]

# Junior dev retrieves memories — including senior's shared knowledge
memories = alma.retrieve(
    task="Implement user authentication",
    agent="junior_dev",
    include_shared=True
)

Storage Backends

ALMA is a library, not a service. You choose where your data lives:

| Backend | Best For | Vector Search | Production Ready |
|---|---|---|---|
| SQLite + FAISS | Local development, offline | Yes | Yes |
| PostgreSQL + pgvector | Production, high availability | Yes (HNSW) | Yes |
| Qdrant | Managed vector DB | Yes (HNSW) | Yes |
| Pinecone | Serverless vector DB | Yes | Yes |
| Chroma | Lightweight local | Yes | Yes |
| Azure Cosmos DB | Enterprise, Azure-native | Yes (DiskANN) | Yes |
| File-based | Testing, CI/CD | No | No |

| Platform | Backend | Starting Cost |
|---|---|---|
| Your Laptop | SQLite+FAISS | $0.00 |
| Supabase | PostgreSQL+pgvector | $0.00 (free tier) |
| AWS / GCP / Azure | PostgreSQL, Qdrant, Pinecone, Cosmos DB | Varies |
| Self-hosted | PostgreSQL+pgvector | $5-10/mo |

Step-by-step database setup for every backend: See GUIDE.md


MCP Server Integration

Connect ALMA directly to Claude Code or Claude Desktop with 22 MCP tools:

python -m alma.mcp --config .alma/config.yaml

// .mcp.json (for Claude Code)
{
  "mcpServers": {
    "alma-memory": {
      "command": "python",
      "args": ["-m", "alma.mcp", "--config", ".alma/config.yaml"]
    }
  }
}

22 MCP Tools — retrieve memories, learn from outcomes, manage preferences, checkpoint workflows, consolidate memories, verified retrieval, compression, and more. Every ALMA feature accessible from Claude's tool system.


RAG Integration

Enhance any RAG framework with ALMA memory signals:

from alma import ALMA, RAGBridge, RAGChunk

alma = ALMA.from_config(".alma/config.yaml")
bridge = RAGBridge(alma=alma)

# Your RAG system retrieves chunks (LangChain, LlamaIndex, etc.)
chunks = [
    RAGChunk(id="1", text="Deploy with blue-green strategy", score=0.85),
    RAGChunk(id="2", text="Use rolling updates for zero downtime", score=0.78),
]

# ALMA enhances with memory signals — past success/failure data
result = bridge.enhance(
    chunks=chunks,
    query="how to deploy auth service safely",
    agent="backend-agent",
)

Includes hybrid search (vector + keyword with RRF fusion), feedback loops for auto-tuning retrieval weights, and IR metrics (MRR, NDCG, Recall, MAP).
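RRF (Reciprocal Rank Fusion) itself is a standard, self-contained algorithm, so it can be shown directly; this sketch is independent of ALMA's internals, and k=60 is the conventional damping constant from the RRF literature.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank).

    `rankings` is a list of ranked result lists (e.g. one from vector
    search, one from keyword search). Documents appearing near the top
    of multiple lists accumulate the highest fused score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk-1", "chunk-2", "chunk-3"]   # ranked by embedding similarity
keyword_hits = ["chunk-2", "chunk-4", "chunk-1"]  # ranked by keyword match
fused = rrf_fuse([vector_hits, keyword_hits])
# chunk-2 wins: rank 2 in the vector list plus rank 1 in the keyword list.
assert fused[0] == "chunk-2"
```

Because RRF only consumes ranks, not raw scores, it fuses lists whose scoring scales are incomparable (cosine similarity vs. BM25) without any normalization step.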


Graph Memory

Track entity relationships alongside vector memory:

from alma.graph import create_graph_backend, BackendGraphStore, EntityExtractor

backend = create_graph_backend("neo4j", uri="neo4j+s://...", username="neo4j", password="...")
graph = BackendGraphStore(backend)
extractor = EntityExtractor()

entities, relationships = extractor.extract(
    "Alice from Acme Corp reviewed the PR that Bob submitted."
)

for entity in entities:
    graph.add_entity(entity)
for rel in relationships:
    graph.add_relationship(rel)

Four backends: Neo4j (production), Memgraph (streaming), Kuzu (embedded), In-memory (testing).


Event System

React to memory changes in real-time:

from alma.events import get_emitter, MemoryEventType

def on_memory_created(event):
    print(f"Memory created: {event.memory_id} by {event.agent}")

emitter = get_emitter()
emitter.subscribe(MemoryEventType.CREATED, on_memory_created)

Supports webhooks with retry logic, HMAC signature verification, and 5 event types: CREATED, UPDATED, DELETED, ACCESSED, CONSOLIDATED.
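Verifying an HMAC-signed webhook on the receiving end follows a standard recipe with Python's stdlib. This is a generic sketch, not ALMA's receiver code; the hex encoding and the way the signature is delivered (e.g. which header carries it) are assumptions.

```python
import hashlib
import hmac

def verify_webhook(secret: str, payload: bytes, signature: str) -> bool:
    """Recompute HMAC-SHA256 over the raw request body and compare.

    hmac.compare_digest runs in constant time, which prevents attackers
    from guessing the signature byte-by-byte via response timing.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

secret = "webhook-secret"
body = b'{"event": "memory.created", "memory_id": "abc123"}'

good_sig = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
assert verify_webhook(secret, body, good_sig)        # genuine delivery
assert not verify_webhook(secret, body, "0" * 64)    # forged signature rejected
```

Always verify against the raw request bytes: re-serializing the parsed JSON can reorder keys or change whitespace and silently break the signature.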


Architecture

+-------------------------------------------------------------------------+
|                          ALMA v0.8.0                                    |
+-------------------------------------------------------------------------+
|  HARNESS LAYER                                                          |
|  +-----------+  +-----------+  +-----------+  +----------------+        |
|  | Setting   |  | Context   |  |  Agent    |  | MemorySchema   |        |
|  +-----------+  +-----------+  +-----------+  +----------------+        |
+-------------------------------------------------------------------------+
|  EXTENSION MODULES                                                      |
|  +-------------+  +---------------+  +------------------+               |
|  | Progress    |  | Session       |  | Domain Memory    |               |
|  | Tracking    |  | Handoff       |  | Factory          |               |
|  +-------------+  +---------------+  +------------------+               |
|  +-------------+  +---------------+  +------------------+               |
|  | Auto        |  | Confidence    |  | Memory           |               |
|  | Learner     |  | Engine        |  | Consolidation    |               |
|  +-------------+  +---------------+  +------------------+               |
|  +-------------+  +---------------+                                     |
|  | Event       |  | TypeScript    |                                     |
|  | System      |  | SDK           |                                     |
|  +-------------+  +---------------+                                     |
+-------------------------------------------------------------------------+
|  CORE LAYER                                                             |
|  +-------------+  +-------------+  +-------------+  +------------+      |
|  | Retrieval   |  |  Learning   |  |  Caching    |  | Forgetting |      |
|  |  Engine     |  |  Protocol   |  |   Layer     |  | Mechanism  |      |
|  +-------------+  +-------------+  +-------------+  +------------+      |
+-------------------------------------------------------------------------+
|  STORAGE LAYER                                                          |
|  +---------------+  +------------------+  +---------------+             |
|  | SQLite+FAISS  |  | PostgreSQL+pgvec |  | Azure Cosmos  |             |
|  +---------------+  +------------------+  +---------------+             |
|  +---------------+  +------------------+  +---------------+             |
|  |    Qdrant     |  |    Pinecone      |  |    Chroma     |             |
|  +---------------+  +------------------+  +---------------+             |
+-------------------------------------------------------------------------+
|  GRAPH LAYER                                                            |
|  +---------------+  +------------------+  +---------------+             |
|  |    Neo4j      |  |    Memgraph      |  |     Kuzu      |             |
|  +---------------+  +------------------+  +---------------+             |
|  +---------------+                                                      |
|  |   In-Memory   |                                                      |
|  +---------------+                                                      |
+-------------------------------------------------------------------------+
|  INTEGRATION LAYER                                                      |
|  +-------------------------------------------------------------------+  |
|  |                         MCP Server                                 |  |
|  +-------------------------------------------------------------------+  |
+-------------------------------------------------------------------------+

Configuration

# .alma/config.yaml
alma:
  project_id: "my-project"
  storage: sqlite  # sqlite | postgres | qdrant | pinecone | chroma | azure | file
  embedding_provider: local  # local | azure | mock
  storage_dir: .alma
  db_name: alma.db
  embedding_dim: 384

  agents:
    helena:
      domain: coding
      can_learn:
        - testing_strategies
        - selector_patterns
      cannot_learn:
        - backend_logic
      min_occurrences_for_heuristic: 3
      share_with: [qa_lead]

    victor:
      domain: coding
      can_learn:
        - api_patterns
        - database_queries
      inherit_from: [senior_architect]

For full backend configuration (PostgreSQL, Qdrant, Pinecone, Chroma, Azure, embedding providers): See GUIDE.md


Comparisons

ALMA vs Mem0, LangChain Memory, and Graphiti
| Feature | ALMA | Mem0 | LangChain | Graphiti |
|---|---|---|---|---|
| Memory Scoping | can_learn / cannot_learn per agent | Basic isolation | Session-based | None |
| Anti-Pattern Learning | why_bad + better_alternative | None | None | None |
| Multi-Agent Sharing | inherit_from + share_with | None | None | None |
| Multi-Factor Scoring | 4 factors (similarity + recency + success + confidence) | Similarity only | Similarity only | Similarity only |
| MCP Integration | 22 tools | None | None | None |
| Workflow Checkpoints | Full checkpoint/resume/merge | None | None | None |
| TypeScript SDK | Full-featured client | None | JavaScript wrappers | None |
| Graph + Vector Hybrid | 4 graph + 7 vector backends | Limited | Limited | Graph-focused |
| Memory Consolidation | LLM-powered deduplication | Basic | None | None |
| Event System | Webhooks + in-process callbacks | None | None | None |
| Domain Factory | 6 pre-built schemas | None | None | None |

The key difference: Most solutions treat memory as "store embeddings, retrieve similar." ALMA treats it as "teach agents to improve within safe boundaries."


Release History

v0.8.0 - RAG Integration Layer
  • RAG Bridge: Accept chunks from any RAG framework and enhance with memory signals
  • Hybrid Search: Vector + keyword with RRF fusion
  • Feedback Loop: Track and auto-tune retrieval weights
  • IR Metrics: MRR, NDCG, Recall, Precision, MAP
  • Cross-Encoder Reranking: Pluggable reranking pipeline
v0.7.x - Memory Wall + Intelligence Layer
  • Memory Decay: Time-based confidence decay
  • Memory Compression: LLM + rule-based summarization
  • Verified Retrieval: Two-stage verification pipeline
  • Retrieval Modes: 7 cognitive task modes
  • Trust-Integrated Scoring, Token Budget, Progressive Disclosure
  • 6 new MCP tools for Memory Wall
  • Archive System: Soft-delete with recovery
  • Embedding Performance Boost: 2.6x faster via batched processing + LRU cache
  • Storage Backend Factory, Consolidation Strategies, Standalone Dedup Engine
v0.6.0 - Workflow Context Layer
  • Checkpoint & Resume workflow state
  • State Reducers for parallel agent states
  • Artifact Linking to workflows
  • Scoped Retrieval by workflow/agent/project
  • 8 MCP Workflow Tools
  • TypeScript SDK v0.6.0 with full workflow API parity
v0.5.0 - Vector Database Backends
  • Qdrant, Pinecone, Chroma backends
  • Graph Database Abstraction (Neo4j, Memgraph, Kuzu, In-memory)
  • Testing Module (MockStorage, MockEmbedder, factories)
  • Memory Consolidation Engine
  • Event System (Webhooks + callbacks)
  • TypeScript SDK initial release
  • Multi-Agent Memory Sharing

See CHANGELOG.md for the complete history.


Roadmap

v0.9.0 — Personal Brain:

  • Thought capture pipeline (natural language to classify to store to confirm)
  • Personal Brain domain schema (7th pre-built schema)
  • alma init --open-brain interactive CLI setup
  • Memory migration from Claude, ChatGPT, Obsidian, Notion
  • Multi-client MCP protocol (concurrent access from any AI tool)

v1.0.0 — Open Brain:

  • Weekly review synthesis (pattern detection, connection finding)
  • Confidence-based routing with fix flow
  • Operating modes (always-on / scheduled / session-based)
  • Full documentation site with 45-minute tutorial
  • Temporal reasoning (time-aware retrieval)

Troubleshooting

See GUIDE.md for detailed troubleshooting. Quick fixes for common issues:

Common Issues

ImportError: sentence-transformers is required

pip install alma-memory[local]

pgvector extension not found

CREATE EXTENSION IF NOT EXISTS vector;

Embeddings dimension mismatch

  • Ensure embedding_dim in config matches your embedding provider
  • Local: 384, Azure text-embedding-3-small: 1536

Debug Logging:

import logging
logging.getLogger("alma").setLevel(logging.DEBUG)

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

For questions, support, or contribution guidelines, email [email protected].

What we need most:

  • Documentation improvements
  • Test coverage for edge cases
  • Additional LLM provider integrations (Ollama, Groq)
  • Frontend dashboard for memory visualization

License

MIT


Support the Project

If ALMA helps your AI agents get smarter, consider supporting it via the Buy Me a Coffee link above.

ALMA by the numbers:

| Metric | Value |
|---|---|
| Tests passing | 1,682 |
| Tests failing | 0 |
| Storage backends | 7 |
| Graph backends | 4 |
| MCP tools | 22 |
| Source files | 107 |
| Monthly cost (local) | $0.00 |
| Monthly cost (Supabase) | $0.00 (free tier) |
| Time to first memory | < 5 minutes |
| Vendor lock-in | None |

Your AI should not treat you like a stranger every morning. ALMA makes sure it never does again.

Every conversation makes the next one better.

Created by @RBKunnela
