ALMA-memory
Persistent memory for AI agents - Learn, remember, improve. Alternative to Mem0 with scoped learning, anti-patterns, multi-agent sharing, and MCP integration.
ALMA - Agent Learning Memory Architecture
Your AI forgets everything. ALMA fixes that.
One memory layer. Every AI. Never start from zero.
pip install alma-memory — 5 minutes to persistent memory. Free forever on SQLite.
Documentation | Setup Guide | PyPI | npm
What Is ALMA?
ALMA is a Python library that gives AI agents permanent, searchable, compounding memory.
Every time you start a new Claude session, a new ChatGPT conversation, or spin up any AI agent — it starts from zero. Your context, your preferences, the solutions it already found, the mistakes it already made — gone. You repeat yourself hundreds of times a year.
ALMA sits between your AI and a database you control. Before every task, it retrieves what the agent learned from past runs. After every task, it stores the outcome. Over time, the system builds a growing knowledge base that makes every future conversation smarter than the last.
         BEFORE TASK                       DURING TASK                      AFTER TASK
+----------------------------+     +-------------------------+     +---------------------------+
| Retrieve relevant memories |     | Agent executes with     |     | Learn from outcome        |
| - Past strategies          | --> | injected knowledge      | --> | - Success? New heuristic  |
| - Anti-patterns            |     | from memory             |     | - Failure? Anti-pattern   |
| - Domain knowledge         |     |                         |     | - Always: Knowledge grows |
+----------------------------+     +-------------------------+     +---------------------------+
Every conversation makes the next one better.
It's not a service. It's a library. You install it, connect it to your own database (even a local SQLite file), and your AI agents start remembering.
Why Not Just Use Claude's Memory or ChatGPT Memory?
Claude Projects, ChatGPT Memory, and Gemini's context all have built-in memory. Here's what they can't do:
| What You Need | Built-in AI Memory | ALMA |
|---|---|---|
| Memory across different AIs | Locked to one platform. Claude doesn't know what you told ChatGPT. | One memory layer shared across every AI tool you use. |
| Memory that learns from outcomes | Stores conversations, not lessons. Doesn't know what worked vs. failed. | Tracks success/failure per strategy. Recommends what actually works. |
| Anti-pattern tracking | No concept of "what NOT to do." | Explicit anti-patterns with why_bad + better_alternative. |
| Scoped learning | All-or-nothing context window. | Agents only learn within defined domains (can_learn / cannot_learn). |
| Multi-agent knowledge sharing | Each assistant is isolated. | Agents inherit knowledge from senior agents. Teams share across roles. |
| Your data, your infrastructure | Stored on their servers. You can't export it. | Your database, your rules. SQLite, PostgreSQL, Qdrant, Pinecone — you choose. |
| Scoring beyond similarity | "Most recent" or basic relevance. | 4-factor scoring: similarity + recency + success rate + confidence. |
| Memory lifecycle | Grows until you manually prune. | Automatic decay, compression, consolidation, and archival. |
| Workflow context | Stateless between sessions. | Checkpoints, state merging, and scoped retrieval across workflow runs. |
The fundamental difference: Built-in AI memory stores conversations. ALMA stores intelligence — what worked, what failed, what to avoid, and why.
Week 1, basic retrieval works. Week 4, patterns emerge across sessions. Week 12, cross-domain connections surface automatically. Week 52, the system knows your work better than any single conversation ever could.
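To make the 4-factor scoring row in the table above concrete, here is a minimal sketch of how similarity, recency, success rate, and confidence could be blended into one ranking score. The `Candidate` class, the weights, and the linear blend are illustrative assumptions, not ALMA's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical stand-in for one stored memory; all signals are in [0, 1]."""
    text: str
    similarity: float    # vector similarity to the current task
    recency: float       # 1.0 = used just now, 0.0 = very old
    success_rate: float  # fraction of past uses that succeeded
    confidence: float    # how much the memory is trusted


# Assumed weights, chosen only for illustration. They sum to 1.0,
# so the blended score also stays in [0, 1].
WEIGHTS: Dict[str, float] = {
    "similarity": 0.40,
    "recency": 0.20,
    "success_rate": 0.25,
    "confidence": 0.15,
}


def score(candidate: Candidate, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four signals."""
    return sum(w * getattr(candidate, name) for name, w in weights.items())


def rank(candidates: List[Candidate], top_k: int = 5) -> List[Candidate]:
    """Return the top_k candidates, best score first."""
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

With these weights, a memory that is a slightly weaker vector match but has a strong success history can outrank a closer match that has mostly failed, which is the behaviour the table contrasts with similarity-only retrieval.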
What ALMA Can Do
| Capability | Description |
|---|---|
| 5 Memory Types | Heuristics (strategies), outcomes (results), preferences (user constraints), domain knowledge (facts), anti-patterns (what not to do) |
| 7 Storage Backends | SQLite+FAISS, PostgreSQL+pgvector, Qdrant, Pinecone, Chroma, Azure Cosmos DB, File |
| 4 Graph Backends | Neo4j, Memgraph, Kuzu, In-memory — entity relationship tracking |
| 22 MCP Tools | Native Claude Code / Claude Desktop integration via stdio or HTTP |
| RAG Bridge | Enhance any RAG system (LangChain, LlamaIndex) with memory signals and feedback loops |
| Multi-Factor Scoring | Similarity + recency + success rate + confidence — not just vector distance |
| Multi-Agent Sharing | Hierarchical knowledge sharing with inherit_from + share_with |
| Memory Lifecycle | Decay, compression, consolidation, archival, verified retrieval |
| Workflow Context | Checkpoints, state merging, artifacts, scoped retrieval |
| Event System | Webhooks + in-process callbacks for real-time memory reactions |
| Domain Factory | 6 pre-built schemas: coding, research, sales, support, content, general |
| TypeScript SDK | Full-featured JavaScript/TypeScript client library |
| Scoped Learning | Agents only learn within defined domains — prevents knowledge contamination |
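The Memory Lifecycle row mentions decay. One common way to model time-based confidence decay is an exponential half-life, sketched below. The function names, the 30-day half-life, and the archive threshold are assumptions for illustration; the source does not state which decay curve or thresholds ALMA uses.

```python
def decayed_confidence(confidence: float, age_days: float,
                       half_life_days: float = 30.0) -> float:
    """Halve the confidence every `half_life_days` (assumed exponential decay)."""
    if age_days < 0 or half_life_days <= 0:
        raise ValueError("age_days must be >= 0 and half_life_days must be > 0")
    return confidence * 0.5 ** (age_days / half_life_days)


def should_archive(confidence: float, age_days: float,
                   threshold: float = 0.1) -> bool:
    """Flag a memory for archival once its decayed confidence drops below threshold."""
    return decayed_confidence(confidence, age_days) < threshold
```

Under these assumed defaults, a memory stored with confidence 0.8 falls to 0.4 after 30 days and drops below the 0.1 threshold after a little more than 90 days without reinforcement.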
Installation
pip install alma-memory
ALMA needs a database to store memories. Follow the Setup Guide (GUIDE.md) for complete database setup instructions — covers every backend, written for all experience levels.
The simplest option uses SQLite — it runs locally on your machine with zero setup:
pip install alma-memory[local] # SQLite + FAISS + local embeddings — nothing else to install
For production or cloud-hosted memory:
pip install alma-memory[postgres] # PostgreSQL + pgvector
pip install alma-memory[qdrant] # Qdrant vector database
pip install alma-memory[pinecone] # Pinecone vector database
pip install alma-memory[chroma] # ChromaDB
pip install alma-memory[azure] # Azure Cosmos DB
pip install alma-memory[rag] # RAG integration (hybrid search, reranking)
pip install alma-memory[all] # Everything
TypeScript/JavaScript:
npm install @rbkunnela/alma-memory
Need help setting up a database? See GUIDE.md for step-by-step instructions — from local SQLite (zero infrastructure) to cloud PostgreSQL (free tier available). Written for all experience levels.
Quick Start
1. Create a config file
# .alma/config.yaml
alma:
  project_id: "my-project"
  storage: sqlite
  embedding_provider: local
  storage_dir: .alma
  db_name: alma.db
  embedding_dim: 384
2. Use ALMA in your code
from alma import ALMA
# Initialize from config
alma = ALMA.from_config(".alma/config.yaml")
# Before task: Get relevant memories
memories = alma.retrieve(
    task="Test the login form validation",
    agent="helena",
    top_k=5,
)
# Inject into your AI prompt
prompt = f"""
## Your Task
Test the login form validation
## Knowledge from Past Runs
{memories.to_prompt()}
"""
# After task: Learn from the outcome
alma.learn(
    agent="helena",
    task="Test login form",
    outcome="success",
    strategy_used="Tested empty fields, invalid email, valid submission",
)
# Next time: Helena remembers what worked
3. That's it
Every time Helena runs, she retrieves what worked before and learns from new outcomes. No manual prompt engineering. No copy-pasting from past conversations. The memory compounds automatically.
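The retrieve, run, learn cycle from the Quick Start can be wrapped in one helper so every agent call follows the same pattern. This is a sketch, not part of ALMA: `run_with_memory` and the `run_agent` callback are hypothetical names, and `outcome="failure"` is assumed to be the counterpart of the `"success"` value shown above. It only uses the `retrieve`, `to_prompt`, and `learn` calls that appear in the Quick Start.

```python
from typing import Callable, Tuple


def run_with_memory(alma, agent: str, task: str,
                    run_agent: Callable[[str], Tuple[bool, str]],
                    top_k: int = 5) -> bool:
    """Retrieve memories, run the agent with them, then store the outcome.

    `alma` is any object exposing retrieve() and learn() as in the Quick Start.
    `run_agent` is a hypothetical callback that takes the prompt and returns
    (succeeded, strategy_description).
    """
    # Before the task: fetch what this agent learned on similar tasks.
    memories = alma.retrieve(task=task, agent=agent, top_k=top_k)

    prompt = (
        f"## Your Task\n{task}\n\n"
        f"## Knowledge from Past Runs\n{memories.to_prompt()}\n"
    )

    # During the task: the agent runs with the injected knowledge.
    succeeded, strategy = run_agent(prompt)

    # After the task: record the result so the next run can use it.
    alma.learn(
        agent=agent,
        task=task,
        outcome="success" if succeeded else "failure",  # "failure" is assumed
        strategy_used=strategy,
    )
    return succeeded
```

Passing the ALMA instance in as a parameter keeps the helper easy to test with a stub object before pointing it at a real database.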
Five Memory Types
ALMA doesn't just store text. It categorizes knowledge into five types that serve different purposes:
| Type | What It Stores | Example |
|---|---|---|
| Heuristic | Strategies that worked | "For forms with >5 fields, test validation incrementally" |
| Outcome | Task results (success/failure) | "Login test succeeded using JWT token strategy — 340ms" |
| Preference | User constraints | "User prefers verbose test output, dark theme, Python 3.12" |
| Domain Knowledge | Accumulated facts | "Login uses OAuth 2.0 with 24h token expiry" |
| Anti-Pattern | What NOT to do | "Don't use sleep() for async waits — causes flaky tests. Use explicit waits instead." |
Anti-patterns are the feature no other memory system has. When your AI makes a mistake, ALMA records what went wrong, why it's bad, and what to do instead. Next time, it knows to avoid that path before it starts.
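As a rough picture of what an anti-pattern record carries, the sketch below models one with the `why_bad` and `better_alternative` fields named earlier. The `AntiPatternNote` class, its `pattern` field, and the prompt format are hypothetical; ALMA's own type may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntiPatternNote:
    """Illustrative record: what to avoid, why, and what to do instead."""
    pattern: str
    why_bad: str
    better_alternative: str

    def to_prompt_line(self) -> str:
        """Render the note as one warning line for an agent prompt."""
        return (
            f"AVOID: {self.pattern}. "
            f"Why: {self.why_bad}. "
            f"Instead: {self.better_alternative}."
        )


note = AntiPatternNote(
    pattern="sleep() for async waits",
    why_bad="causes flaky tests",
    better_alternative="use explicit waits",
)
print(note.to_prompt_line())
```

Keeping the reason and the alternative next to the pattern is what lets an agent steer toward the better option rather than only being told what is forbidden.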
Multi-Agent Memory Sharing
Agents don't have to learn everything from scratch. Junior agents can inherit knowledge from senior agents:
agents:
  senior_dev:
    can_learn: [architecture, best_practices]
    share_with: [junior_dev, qa_agent]
  junior_dev:
    can_learn: [coding_patterns]
    inherit_from: [senior_dev]
# Junior dev retrieves memories — including senior's shared knowledge
memories = alma.retrieve(
    task="Implement user authentication",
    agent="junior_dev",
    include_shared=True,
)
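To show how `inherit_from` and `share_with` could combine, the sketch below works out whose memories an agent may read from a config like the one above. The function `readable_agents` is hypothetical, and the rule it applies (either side's declaration is enough to grant read access) is an assumption; the source does not spell out ALMA's exact resolution rules.

```python
from typing import Dict, List


def readable_agents(agents_cfg: Dict[str, dict], agent: str) -> List[str]:
    """List the agents whose memories `agent` may read, itself first.

    Assumed rule: access is granted if `agent` names the other agent in
    inherit_from, or the other agent names `agent` in share_with.
    """
    sources = [agent]

    # Agents this one explicitly inherits from.
    for name in agents_cfg.get(agent, {}).get("inherit_from", []):
        if name not in sources:
            sources.append(name)

    # Agents that explicitly share with this one.
    for name, cfg in agents_cfg.items():
        if agent in cfg.get("share_with", []) and name not in sources:
            sources.append(name)

    return sources


config = {
    "senior_dev": {
        "can_learn": ["architecture", "best_practices"],
        "share_with": ["junior_dev", "qa_agent"],
    },
    "junior_dev": {
        "can_learn": ["coding_patterns"],
        "inherit_from": ["senior_dev"],
    },
}
print(readable_agents(config, "junior_dev"))  # ['junior_dev', 'senior_dev']
```

Sharing is one-directional in this sketch: `senior_dev` reads only its own memories, so knowledge flows down the hierarchy but not back up.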
Storage Backends
ALMA is a library, not a service. You choose where your data lives:
| Backend | Best For | Vector Search | Production Ready |
|---|---|---|---|
| SQLite + FAISS | Local development, offline | Yes | Yes |
| PostgreSQL + pgvector | Production, high availability | Yes (HNSW) | Yes |
| Qdrant | Managed vector DB | Yes (HNSW) | Yes |
| Pinecone | Serverless vector DB | Yes | Yes |
| Chroma | Lightweight local | Yes | Yes |
| Azure Cosmos DB | Enterprise, Azure-native | Yes (DiskANN) | Yes |
| File-based | Testing, CI/CD | No | No |
| Platform | Backend | Starting Cost |
|---|---|---|
| Your Laptop | SQLite+FAISS | $0.00 |
| Supabase | PostgreSQL+pgvector | $0.00 (free tier) |
| AWS / GCP / Azure | PostgreSQL, Qdrant, Pinecone, Cosmos DB | Varies |
| Self-hosted | PostgreSQL+pgvector | $5-10/mo |
Step-by-step database setup for every backend: See GUIDE.md
MCP Server Integration
Connect ALMA directly to Claude Code or Claude Desktop with 22 MCP tools:
python -m alma.mcp --config .alma/config.yaml
// .mcp.json (for Claude Code)
{
  "mcpServers": {
    "alma-memory": {
      "command": "python",
      "args": ["-m", "alma.mcp", "--config", ".alma/config.yaml"]
    }
  }
}
22 MCP Tools — retrieve memories, learn from outcomes, manage preferences, checkpoint workflows, consolidate memories, verified retrieval, compression, and more. Every ALMA feature accessible from Claude's tool system.
RAG Integration
Enhance any RAG framework with ALMA memory signals:
from alma import ALMA, RAGBridge, RAGChunk
alma = ALMA.from_config(".alma/config.yaml")
bridge = RAGBridge(alma=alma)
# Your RAG system retrieves chunks (LangChain, LlamaIndex, etc.)
chunks = [
    RAGChunk(id="1", text="Deploy with blue-green strategy", score=0.85),
    RAGChunk(id="2", text="Use rolling updates for zero downtime", score=0.78),
]
# ALMA enhances with memory signals — past success/failure data
result = bridge.enhance(
    chunks=chunks,
    query="how to deploy auth service safely",
    agent="backend-agent",
)
Includes hybrid search (vector + keyword with RRF fusion), feedback loops for auto-tuning retrieval weights, and IR metrics (MRR, NDCG, Recall, MAP).
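For readers unfamiliar with RRF, the sketch below shows standard Reciprocal Rank Fusion applied to one vector ranking and one keyword ranking. It illustrates the general technique only; the function name `rrf_fuse` is hypothetical, and `k = 60` is the constant commonly used in the literature, not a value confirmed for ALMA.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id so the output
    is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # best-first ids from vector search
keyword_hits = ["b", "d", "a"]  # best-first ids from keyword search
print(rrf_fuse([vector_hits, keyword_hits]))  # ['b', 'a', 'd', 'c']
```

RRF works on ranks rather than raw scores, so it can combine a cosine similarity and a keyword score without putting them on a common scale first.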
Graph Memory
Track entity relationships alongside vector memory:
from alma.graph import create_graph_backend, BackendGraphStore, EntityExtractor
backend = create_graph_backend("neo4j", uri="neo4j+s://...", username="neo4j", password="...")
graph = BackendGraphStore(backend)
extractor = EntityExtractor()
entities, relationships = extractor.extract(
    "Alice from Acme Corp reviewed the PR that Bob submitted."
)
# Store every extracted entity and relationship in the graph
for entity in entities:
    graph.add_entity(entity)
for rel in relationships:
    graph.add_relationship(rel)
Four backends: Neo4j (production), Memgraph (streaming), Kuzu (embedded), In-memory (testing).
Event System
React to memory changes in real-time:
from alma.events import get_emitter, MemoryEventType
def on_memory_created(event):
    # Called in-process whenever a memory is created
    print(f"Memory created: {event.memory_id} by {event.agent}")
emitter = get_emitter()
emitter.subscribe(MemoryEventType.CREATED, on_memory_created)
Supports webhooks with retry logic, HMAC signature verification, and 5 event types: CREATED, UPDATED, DELETED, ACCESSED, CONSOLIDATED.
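On the receiving end of a webhook, HMAC signature verification usually looks like the sketch below. It uses only the Python standard library. The choice of SHA-256, the hex encoding, and the function names are assumptions; check the webhook documentation for the header name and signature format ALMA actually sends.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature against the body using a constant-time compare."""
    expected = sign(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


secret = b"shared-webhook-secret"            # hypothetical shared secret
body = b'{"type": "CREATED", "agent": "helena"}'
signature = sign(secret, body)               # what the sender would attach

print(verify(secret, body, signature))       # True
print(verify(secret, body + b" ", signature))  # False: body was altered
```

Verify against the raw bytes of the request body, before any JSON parsing or re-serialization, because even a whitespace change produces a different signature.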
Architecture
+-------------------------------------------------------------------------+
|                               ALMA v0.8.0                               |
+-------------------------------------------------------------------------+
|  HARNESS LAYER                                                          |
|  +-----------+  +-----------+  +-----------+  +----------------+        |
|  | Setting   |  | Context   |  | Agent     |  | MemorySchema   |        |
|  +-----------+  +-----------+  +-----------+  +----------------+        |
+-------------------------------------------------------------------------+
|  EXTENSION MODULES                                                      |
|  +-------------+  +---------------+  +------------------+               |
|  | Progress    |  | Session       |  | Domain Memory    |               |
|  | Tracking    |  | Handoff       |  | Factory          |               |
|  +-------------+  +---------------+  +------------------+               |
|  +-------------+  +---------------+  +------------------+               |
|  | Auto        |  | Confidence    |  | Memory           |               |
|  | Learner     |  | Engine        |  | Consolidation    |               |
|  +-------------+  +---------------+  +------------------+               |
|  +-------------+  +---------------+                                     |
|  | Event       |  | TypeScript    |                                     |
|  | System      |  | SDK           |                                     |
|  +-------------+  +---------------+                                     |
+-------------------------------------------------------------------------+
|  CORE LAYER                                                             |
|  +-------------+  +-------------+  +-------------+  +------------+      |
|  | Retrieval   |  | Learning    |  | Caching     |  | Forgetting |      |
|  | Engine      |  | Protocol    |  | Layer       |  | Mechanism  |      |
|  +-------------+  +-------------+  +-------------+  +------------+      |
+-------------------------------------------------------------------------+
|  STORAGE LAYER                                                          |
|  +---------------+  +------------------+  +---------------+             |
|  | SQLite+FAISS  |  | PostgreSQL+pgvec |  | Azure Cosmos  |             |
|  +---------------+  +------------------+  +---------------+             |
|  +---------------+  +------------------+  +---------------+             |
|  | Qdrant        |  | Pinecone         |  | Chroma        |             |
|  +---------------+  +------------------+  +---------------+             |
+-------------------------------------------------------------------------+
|  GRAPH LAYER                                                            |
|  +---------------+  +------------------+  +---------------+             |
|  | Neo4j         |  | Memgraph         |  | Kuzu          |             |
|  +---------------+  +------------------+  +---------------+             |
|  +---------------+                                                      |
|  | In-Memory     |                                                      |
|  +---------------+                                                      |
+-------------------------------------------------------------------------+
|  INTEGRATION LAYER                                                      |
|  +-------------------------------------------------------------------+  |
|  |                            MCP Server                             |  |
|  +-------------------------------------------------------------------+  |
+-------------------------------------------------------------------------+
Configuration
# .alma/config.yaml
alma:
  project_id: "my-project"
  storage: sqlite  # sqlite | postgres | qdrant | pinecone | chroma | azure | file
  embedding_provider: local  # local | azure | mock
  storage_dir: .alma
  db_name: alma.db
  embedding_dim: 384
  agents:
    helena:
      domain: coding
      can_learn:
        - testing_strategies
        - selector_patterns
      cannot_learn:
        - backend_logic
      min_occurrences_for_heuristic: 3
      share_with: [qa_lead]
    victor:
      domain: coding
      can_learn:
        - api_patterns
        - database_queries
      inherit_from: [senior_architect]
For full backend configuration (PostgreSQL, Qdrant, Pinecone, Chroma, Azure, embedding providers): See GUIDE.md
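The `can_learn`, `cannot_learn`, and `min_occurrences_for_heuristic` settings suggest a gate in front of learning. The sketch below is one possible reading, not ALMA's implementation. The `ScopedLearner` class is hypothetical; it assumes `can_learn` is an allow-list, that `cannot_learn` always wins, and that a strategy becomes a heuristic once it has succeeded `min_occurrences` times.

```python
from collections import Counter
from typing import Iterable


class ScopedLearner:
    """Illustrative gate: only in-scope categories are learned and counted."""

    def __init__(self, can_learn: Iterable[str],
                 cannot_learn: Iterable[str] = (),
                 min_occurrences: int = 3) -> None:
        self.can_learn = set(can_learn)
        self.cannot_learn = set(cannot_learn)
        self.min_occurrences = min_occurrences
        self._successes: Counter = Counter()

    def allows(self, category: str) -> bool:
        """A category is in scope if allow-listed and not explicitly denied."""
        return category in self.can_learn and category not in self.cannot_learn

    def record_success(self, category: str, strategy: str) -> bool:
        """Count one success; return True once the strategy qualifies as a heuristic."""
        if not self.allows(category):
            return False  # out-of-scope outcomes are ignored entirely
        key = (category, strategy)
        self._successes[key] += 1
        return self._successes[key] >= self.min_occurrences


# Mirrors the helena entry in the config above.
helena = ScopedLearner(
    can_learn=["testing_strategies", "selector_patterns"],
    cannot_learn=["backend_logic"],
    min_occurrences=3,
)
```

Requiring several occurrences before promoting a strategy keeps a single lucky run from becoming standing advice.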
Comparisons
ALMA vs Mem0, LangChain Memory, and Graphiti
| Feature | ALMA | Mem0 | LangChain | Graphiti |
|---|---|---|---|---|
| Memory Scoping | can_learn / cannot_learn per agent | Basic isolation | Session-based | None |
| Anti-Pattern Learning | why_bad + better_alternative | None | None | None |
| Multi-Agent Sharing | inherit_from + share_with | None | None | None |
| Multi-Factor Scoring | 4 factors (similarity + recency + success + confidence) | Similarity only | Similarity only | Similarity only |
| MCP Integration | 22 tools | None | None | None |
| Workflow Checkpoints | Full checkpoint/resume/merge | None | None | None |
| TypeScript SDK | Full-featured client | None | JavaScript wrappers | None |
| Graph + Vector Hybrid | 4 graph + 7 vector backends | Limited | Limited | Graph-focused |
| Memory Consolidation | LLM-powered deduplication | Basic | None | None |
| Event System | Webhooks + in-process callbacks | None | None | None |
| Domain Factory | 6 pre-built schemas | None | None | None |
The key difference: Most solutions treat memory as "store embeddings, retrieve similar." ALMA treats it as "teach agents to improve within safe boundaries."
Release History
v0.8.0 - RAG Integration Layer
- RAG Bridge: Accept chunks from any RAG framework and enhance with memory signals
- Hybrid Search: Vector + keyword with RRF fusion
- Feedback Loop: Track and auto-tune retrieval weights
- IR Metrics: MRR, NDCG, Recall, Precision, MAP
- Cross-Encoder Reranking: Pluggable reranking pipeline
- Memory Decay: Time-based confidence decay
- Memory Compression: LLM + rule-based summarization
- Verified Retrieval: Two-stage verification pipeline
- Retrieval Modes: 7 cognitive task modes
- Trust-Integrated Scoring, Token Budget, Progressive Disclosure
- 6 new MCP tools for Memory Wall
- Archive System: Soft-delete with recovery
- Embedding Performance Boost: 2.6x faster via batched processing + LRU cache
- Storage Backend Factory, Consolidation Strategies, Standalone Dedup Engine
- Checkpoint & Resume workflow state
- State Reducers for parallel agent states
- Artifact Linking to workflows
- Scoped Retrieval by workflow/agent/project
- 8 MCP Workflow Tools
- TypeScript SDK v0.6.0 with full workflow API parity
- Qdrant, Pinecone, Chroma backends
- Graph Database Abstraction (Neo4j, Memgraph, Kuzu, In-memory)
- Testing Module (MockStorage, MockEmbedder, factories)
- Memory Consolidation Engine
- Event System (Webhooks + callbacks)
- TypeScript SDK initial release
- Multi-Agent Memory Sharing
See CHANGELOG.md for the complete history.
Roadmap
v0.9.0 — Personal Brain:
- Thought capture pipeline (natural language to classify to store to confirm)
- Personal Brain domain schema (7th pre-built schema)
- alma init --open-brain interactive CLI setup
- Memory migration from Claude, ChatGPT, Obsidian, Notion
- Multi-client MCP protocol (concurrent access from any AI tool)
v1.0.0 — Open Brain:
- Weekly review synthesis (pattern detection, connection finding)
- Confidence-based routing with fix flow
- Operating modes (always-on / scheduled / session-based)
- Full documentation site with 45-minute tutorial
- Temporal reasoning (time-aware retrieval)
Troubleshooting
See GUIDE.md for detailed troubleshooting. Quick fixes for common issues:
Common Issues
ImportError: sentence-transformers is required
pip install alma-memory[local]
pgvector extension not found
CREATE EXTENSION IF NOT EXISTS vector;
Embeddings dimension mismatch
- Ensure embedding_dim in config matches your embedding provider
- Local: 384, Azure text-embedding-3-small: 1536
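A dimension mismatch can be caught before any memory is written with a small pre-flight check like the sketch below. The helper `check_embedding_dim` is hypothetical and operates on an already-parsed config dictionary. The expected sizes come from the note above, and the `azure` entry assumes the text-embedding-3-small model.

```python
# Expected vector sizes per provider, taken from the troubleshooting note.
# The "azure" value assumes text-embedding-3-small.
KNOWN_DIMS = {"local": 384, "azure": 1536}


def check_embedding_dim(config: dict) -> None:
    """Raise ValueError if embedding_dim disagrees with the provider's known size.

    Accepts either the whole parsed config (with a top-level "alma" key)
    or just the inner mapping. Unknown providers are skipped.
    """
    alma_cfg = config.get("alma", config)
    provider = alma_cfg.get("embedding_provider")
    configured = alma_cfg.get("embedding_dim")
    expected = KNOWN_DIMS.get(provider)

    if expected is not None and configured != expected:
        raise ValueError(
            f"embedding_dim is {configured}, but provider '{provider}' "
            f"produces {expected}-dimensional vectors"
        )


check_embedding_dim(
    {"alma": {"embedding_provider": "local", "embedding_dim": 384}}
)  # passes silently
```

Running a check like this at startup turns an opaque storage error into a message that names the setting to change.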
Debug Logging:
import logging
logging.getLogger("alma").setLevel(logging.DEBUG)
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
For questions, support, or contribution guidelines, email [email protected].
What we need most:
- Documentation improvements
- Test coverage for edge cases
- Additional LLM provider integrations (Ollama, Groq)
- Frontend dashboard for memory visualization
License
MIT
Support the Project
If ALMA helps your AI agents get smarter:
- Star this repo - Helps others discover ALMA
- Buy me a coffee - Support continued development
- Sponsor on GitHub - Become an official sponsor
- Contribute - PRs welcome! See CONTRIBUTING.md
- Get help - Email [email protected] for support and inquiries
| Metric | Value |
|---|---|
| Tests passing | 1,682 |
| Tests failing | 0 |
| Storage backends | 7 |
| Graph backends | 4 |
| MCP tools | 22 |
| Source files | 107 |
| Monthly cost (local) | $0.00 |
| Monthly cost (Supabase) | $0.00 (free tier) |
| Time to first memory | < 5 minutes |
| Vendor lock-in | None |
Your AI should not treat you like a stranger every morning. ALMA makes sure it never does again.
Every conversation makes the next one better.
Created by @RBKunnela