The agent-first, event-sourced knowledge runtime for AI coding sessions.

What is OpenEmpiric?

OpenEmpiric (OEM) is a local-first, agent-first learning runtime. It acts as long-term memory for your AI coding agents, capturing crucial architectural decisions, experiments, tradeoffs, failures, and outcomes directly from your developer-agent conversations. It stores this knowledge in an event-sourced ledger and structures it as a local knowledge graph, making it immediately available to guide future coding sessions.

Instead of writing and updating guidelines by hand, you simply work with your agent. OEM builds your project's memory map automatically, so your agent is far less likely to repeat a mistake it has already made.

How It Works

OpenEmpiric shifts the model of AI coding sessions from a simple wrapped process to an active knowledge infrastructure. It is best explained through two layers: the user mental model (why it exists) and the internal runtime flow (how it works).

1. User Mental Model (Why OEM Exists)

A productive AI coding agent needs more than static prompts. It relies on three complementary pillars:

Developer Intent: The immediate task, constraints, and goals you provide.
Project Workflows (e.g., AGENTS.md / CLAUDE.md): Guidelines detailing how the project is structured, run, and styled.
Project Memory (e.g., .oem/): The persistent knowledge layer managed by OpenEmpiric, recording what has been learned (prior decisions, resolved failures, validated designs).

graph TD
    Intent["Developer Intent<br>(What to do)"] --> Agent("Coding Agent")
    Workflows["Project Workflows<br>(How to work - e.g., AGENTS.md)"] --> Agent
    Memory["Project Memory (.oem/)<br>(What we've learned)"] --> Agent

2. Internal Runtime & Tooling Flow (How OEM Works)

The .oem/ directory at the project root acts as the center of gravity. During execution, the coding agent actively queries this memory infrastructure using MCP tools rather than just reading static startup files.

graph TD
    Developer(["Developer"]) -->|"Runs oem run agent"| OEM["OpenEmpiric Runtime"]
    OEM -->|"1. Restore State"| Folder[(".oem/ Project Memory")]
    Folder -.->|"Injected Context"| Agent("Coding Agent")
  
    Developer ---|"2. Normal Work Session"| Agent
  
    Agent ---|"3. Active MCP Queries<br>(knowledge_search, explain_concept, etc.)"| Folder
  
    Agent -->|"4. Exits"| OEM
    OEM -->|"5. Reflects Transcripts & Diffs"| Ledger[(".oem/ Event Ledger")]
    Ledger -->|"Updates"| Registry["concept_registry.json"]
    Ledger -->|"Generates"| Wiki[".oem/wiki/ Concept Wiki"]
    Registry -->|"Sync"| Folder
    Wiki -->|"Sync"| Folder
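
The ledger-to-registry step in the diagram follows the usual event-sourcing pattern: events are only appended, and derived views are rebuilt by replaying them. The sketch below is a minimal illustration of that pattern, not OEM's implementation. The `Ledger` class, the event fields (`kind`, `concept`, `summary`) and the registry shape are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Append-only event log; derived state always comes from replaying it."""
    lines: list[str] = field(default_factory=list)  # stands in for a JSONL file

    def append(self, kind: str, concept: str, summary: str) -> None:
        # Events are appended and never edited in place (hypothetical schema).
        event = {"kind": kind, "concept": concept, "summary": summary}
        self.lines.append(json.dumps(event))

    def replay(self) -> dict[str, dict]:
        """Fold every event into a registry keyed by concept name."""
        registry: dict[str, dict] = {}
        for line in self.lines:
            event = json.loads(line)
            entry = registry.setdefault(event["concept"], {"events": 0, "latest": ""})
            entry["events"] += 1
            entry["latest"] = f'{event["kind"]}: {event["summary"]}'
        return registry


ledger = Ledger()
ledger.append("failure", "pagination", "cursor not reset between retries")
ledger.append("decision", "pagination", "reset cursor at the start of each retry")
registry = ledger.replay()
print(registry["pagination"])
```

Because the registry is derived, it can be deleted and rebuilt from the ledger at any time, which is the property that makes a rebuildable index or wiki possible.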

Key Features

🧠 Zero-Config Agent Memory: No manual updates or prompt engineering required. OEM learns continuously in the background.
🔄 Automatic Reflection: Analyzes chat transcripts and file diffs to automatically extract new concepts, decisions, experiments, and root causes of failures.
🛡️ Secure File System Guards: The Secure File System (SFS) wrapper protects the host workspace with strict path-traversal limits and truncation protection, which prevents agents from accidentally erasing large files.
🔎 Local RAG Retrieval: Runs hybrid vector + BM25 search across your project's memory repository, entirely on your machine.
📚 Materialized Wiki: Auto-generates clean, human-readable markdown documentation in .oem/wiki/ as the knowledge graph evolves.
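
To make the two file-system guards concrete, here is a minimal sketch of a guarded write. It is an illustration under stated assumptions, not the SFS wrapper itself: `guarded_write`, `GuardError` and the `max_shrink` threshold of 0.5 are hypothetical names and values chosen for the example.

```python
import tempfile
from pathlib import Path


class GuardError(Exception):
    """Raised when the guard refuses a write."""


def guarded_write(root: Path, relative: str, new_text: str, max_shrink: float = 0.5) -> Path:
    """Write new_text under root, refusing traversal and drastic truncation."""
    root = root.resolve()
    target = (root / relative).resolve()
    # Path-traversal limit: the resolved target must stay inside the workspace.
    if not target.is_relative_to(root):
        raise GuardError(f"{relative!r} escapes the workspace")
    # Truncation protection: refuse writes that shrink an existing file too much.
    if target.exists():
        old_size = target.stat().st_size
        new_size = len(new_text.encode("utf-8"))
        if old_size > 0 and new_size < old_size * max_shrink:
            raise GuardError(
                f"write would shrink {relative!r} from {old_size} to {new_size} bytes"
            )
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(new_text, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    guarded_write(workspace, "notes.md", "x" * 1000)
    for path, text in [("../escape.txt", "hi"), ("notes.md", "x")]:
        try:
            guarded_write(workspace, path, text)
        except GuardError as err:
            print("refused:", err)
```

Resolving both paths before comparing them is what defeats `..` segments and symlinks; a plain string-prefix check would not. `Path.is_relative_to` requires Python 3.9 or newer.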

Quick Start

1. Install Globally

Install the unified oem CLI runtime globally using uv with semantic retrieval support:

uv tool install "git+https://github.com/xpajonx/openempiric.git#subdirectory=packages/oem-knowledge[semantic]"

For a lighter, BM25-only install, omit [semantic]. Note that the default user path assumes semantic retrieval is available wherever possible.
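
For intuition about what the semantic extra contributes: hybrid retrieval combines a keyword (BM25) ranking with a vector-similarity ranking. This README does not say how OEM fuses the two, so the sketch below uses reciprocal rank fusion purely as an assumed, commonly used technique. The function name, the constant `k = 60` and the document ids are all illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one fused ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks near the top.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties break alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
bm25_ranking = ["cursor-reset", "retry-policy", "db-index"]
vector_ranking = ["retry-policy", "cache-staleness", "cursor-reset"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

With a BM25-only install there is only one ranking, so nothing is fused and results depend on keyword overlap alone.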

2. Setup Agent Integration

For terminal-based workstation environments (like OpenCode):

oem setup opencode

For desktop-based non-terminal environments (like Codex App):

oem setup codex-app

3. Launch a Session

From any project directory, launch a managed session:

mkdir demo-project
cd demo-project
oem run opencode

oem run opencode bootstraps the project-local .oem/ memory folder automatically and starts a managed session.

(Note: For Codex App, the standard MCP tools are configured automatically. The agent calls the knowledge_session_end tool from within the conversation to commit and persist learnings at the end of a session.)

4. Diagnose Problems

Verify your workspace integration and bridge health:

oem doctor

Supported Agents & Custom Adapters

OpenEmpiric officially supports the following agent runtimes out of the box:

OpenCode: Workstation-level integration, native plugins, and session supervisor for terminal environments.
Grok: xAI Grok Build TUI/CLI support via oem run grok and oem setup grok (MCP + rules).
Antigravity: Terminal co-pilot and command-line companion integration.
Codex App: First-class MCP-based non-terminal desktop runtime, configured automatically via the WSL bridge architecture (oem setup codex-app).

Extensibility: Write Your Own Adapter!

You can easily extend OpenEmpiric to support other environments (such as Claude Code, Cursor, or your own proprietary CLI agent). All you need to do is subclass the base adapter:

from pathlib import Path

from oem_knowledge.adapters.base import BaseAdapter
from oem_knowledge.adapters.registry import register_adapter

@register_adapter("my-custom-agent")
class MyCustomAgentAdapter(BaseAdapter):
    def verify_mcp(self) -> bool:
        # Check if agent is installed and configured
        return True

    def parse_transcript(self, transcript_path: Path) -> str:
        # Extract dialogue text from agent logs
        return transcript_path.read_text(encoding="utf-8")

Refer to the Adapter Architecture Guide and Adapter Specification for details.
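
If the decorator-based registration above is unfamiliar, the following self-contained sketch shows how such a registry typically works. It is a stand-in written for illustration and makes no claim about how `oem_knowledge.adapters.registry` is implemented. The `_ADAPTERS` dict, the `get_adapter` helper and `EchoAdapter` are hypothetical.

```python
from typing import Callable

_ADAPTERS: dict[str, type] = {}  # hypothetical module-level registry


def register_adapter(name: str) -> Callable[[type], type]:
    """Return a class decorator that files the class under the given name."""
    def decorator(cls: type) -> type:
        if name in _ADAPTERS:
            raise ValueError(f"adapter {name!r} is already registered")
        _ADAPTERS[name] = cls
        return cls  # returned unchanged, so the class stays directly usable
    return decorator


def get_adapter(name: str) -> type:
    """Look up a registered adapter class by name (hypothetical helper)."""
    try:
        return _ADAPTERS[name]
    except KeyError:
        raise KeyError(f"no adapter registered for {name!r}") from None


@register_adapter("echo-agent")
class EchoAdapter:
    def parse_transcript(self, text: str) -> str:
        # Toy behavior: treat the raw text as the transcript.
        return text.strip()


adapter = get_adapter("echo-agent")()
print(adapter.parse_transcript("  hello from the agent \n"))
```

The benefit of this pattern is that a runtime can resolve an adapter from a plain string such as the argument to `oem run`, without importing every adapter class by name.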

The Agent Lifecycle Model

The core mental model of OpenEmpiric revolves around the 5-step agent lifecycle contract. Agents must understand and transition through these steps:

knowledge_session_start
 └─► knowledge_read (Orientation & Learning)
      └─► knowledge_search (Specific Retrieval)
           └─► knowledge_reflect (Capturing Decisions & Failures)
                └─► knowledge_session_end (Finalize & Commit)

knowledge_session_start — Begin or restore a managed workspace session.
knowledge_read — Learn and orient from broad project context, background, recent sessions, and conventions (use at start or when orientation is needed).
knowledge_search — Query the vector/keyword database for a specific project memory or concept.
knowledge_reflect — Record structured events (hypotheses, experiments, validations, decisions, failures) during development.
knowledge_session_end — Finalize, generate session report, update derived indexes, and close/commit the session.
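
One way to picture the contract is as a small state machine. The sketch below assumes the session must be opened first and closed last, while the three middle tools may be called in any order and any number of times in between. That reading is an assumption based on the descriptions above, and `LifecycleTracker` is a hypothetical helper, not part of OEM.

```python
MIDDLE_STEPS = {"knowledge_read", "knowledge_search", "knowledge_reflect"}


class LifecycleTracker:
    """Records tool calls and rejects any made outside an open session."""

    def __init__(self) -> None:
        self.state = "idle"  # idle -> active -> closed
        self.calls: list[str] = []

    def call(self, tool: str) -> None:
        if tool == "knowledge_session_start":
            allowed = self.state == "idle"
        elif tool in MIDDLE_STEPS or tool == "knowledge_session_end":
            allowed = self.state == "active"
        else:
            raise ValueError(f"unknown lifecycle tool: {tool}")
        if not allowed:
            raise RuntimeError(f"{tool} is not valid while the session is {self.state}")
        self.calls.append(tool)
        if tool == "knowledge_session_start":
            self.state = "active"
        elif tool == "knowledge_session_end":
            self.state = "closed"


tracker = LifecycleTracker()
for step in [
    "knowledge_session_start",
    "knowledge_read",
    "knowledge_search",
    "knowledge_reflect",
    "knowledge_session_end",
]:
    tracker.call(step)
print(tracker.state)
```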

Recommended `.gitignore` Default Policy

To keep project repositories clean while preserving valuable knowledge, follow this default .gitignore configuration:

# Commit-friendly project memory
# (Do NOT ignore these)
!.oem/wiki/**
!.oem/concept_registry.json
!.oem/skills/**

# Local runtime state and caches
# (Usually ignore these)
.oem/state/**
.oem/reports/**
.oem/backups/**
.oem/**/*.sqlite
.oem/**/*.db
.oem/.cache/**
.oem/.runtime/**

# Project-policy dependent (may contain private agent transcript traces)
.oem/runtime_events.jsonl
.oem/outcomes.jsonl

CLI Command Reference

Expected Public CLI Surface (v1.0 Frozen)

Command	Category	Description
`oem init`	User	Initialize the `.oem/` memory repository in the current workspace.
`oem setup opencode`	User	Configure and register OpenCode workstation-level integration.
`oem setup grok`	User	Configure and register Grok (MCP + .grok integration).
`oem setup codex-app`	User	Configure and register Codex App bridge integration.
`oem run opencode`	User	Launch a managed OpenCode agent session with dynamic config injection.
`oem run grok`	User	Launch a managed Grok agent session with dynamic context + transcript capture.
`oem doctor`	User	Verify workspace health, plugin links, and agent integration state.
`oem read`	User	Read the project memory baseline based on scope (`project`, `recent`, `skills`, `health`).
`oem search`	User	Search the project knowledge base with keyword or hybrid search.
`oem reflect`	User	Dry-run reflection and concept extraction from conversation transcript.
`oem session-start`	Internal	Restore pre-injection context and prepare workspace before agent run.
`oem session-end`	Internal	Finalize context, run extraction, and commit learnings.
`oem index`	Advanced	Rebuild derived search index for the project.
`oem clean`	Advanced	Analyze or apply safe `.oem` cleanup actions (dry-run supported).
`oem recover`	Advanced	Recover, commit, or abort crashed or unfinished agent sessions (dry-run supported).
`oem mcp`	Advanced	Start the background FastMCP server.
`oem skills`	Advanced	Review, list, approve, or reject skill candidates.

Note on other commands: Commands such as health, config, merge, rebuild, events, explain, identity, concept, contradictions, outcome, and todo are classified as internal, advanced, hidden, deprecated, or post-v1.0, and are suppressed in help output.

MCP Tool Reference

The OpenEmpiric MCP server exposes the following tools categorized by target role:

1. Core Lifecycle Tools (Must Remain Stable)

knowledge_session_start — Start or restore an OpenEmpiric session lifecycle.
knowledge_read — Read the project memory baseline for orientation.
knowledge_search — Fast lookup and search across concepts.
knowledge_reflect — Extract structured events and write a session report.
knowledge_session_end — End session, commit learnings, and update indexes.

2. Core Tools

knowledge_init — Bootstrap the .oem/ memory structure.
knowledge_index — Re-index markdown files in the project's concept tree.
knowledge_explain_concept — Explain concept details and recent evidence.
knowledge_stats — Show project memory stats (concepts count, DB size).

3. Advanced Tools

knowledge_materialize — Promote candidates to canonical status and write wiki markdown.
knowledge_health_check — Scan for duplicate concepts, staleness, or contradictions.
knowledge_get_events / knowledge_get_event — Retrieve events from the ledger.
knowledge_merge_concepts — Merge duplicate concepts.
knowledge_skill_candidates / knowledge_skill_candidate_show / knowledge_skill_candidate_approve / knowledge_skill_candidate_reject / knowledge_skill_candidate_defer — Review and promote skill candidates.

4. Optional / Tasks / Telemetry Tools

oem_todo_read / oem_todo_write / oem_todo_advance — Tasks/todos management.
knowledge_usage_report — Experimental telemetry report.

5. Deprecated Tools

knowledge_session_commit — Deprecated alias; forwards to knowledge_session_end and returns a warning.
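
A deprecated alias of this kind is usually a thin wrapper around the tool that replaces it. The sketch below shows the general pattern only. The `session_id` parameter and the shape of the returned payload are assumptions, and the stand-in `knowledge_session_end` here is not the real tool.

```python
import warnings


def knowledge_session_end(session_id: str) -> dict:
    """Stand-in for the real tool; returns a minimal, made-up result payload."""
    return {"session_id": session_id, "status": "committed"}


def knowledge_session_commit(session_id: str) -> dict:
    """Deprecated alias: forwards to knowledge_session_end and flags the call."""
    warnings.warn(
        "knowledge_session_commit is deprecated; use knowledge_session_end",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    result = knowledge_session_end(session_id)
    result["warning"] = "deprecated alias used"
    return result


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = knowledge_session_commit("demo-session")

print(result["status"], len(caught))
```

Returning the warning inside the payload as well as raising it matters for agents, since a tool caller only sees the returned result and not the server's Python warnings.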

How It Learns: High-Value Cues

OEM works automatically, but it extracts the highest-quality knowledge when developers state reasoning and results explicitly during a conversation.

Concept Signal	Better (OEM Captures Rationale)	Worse (Vague/Context-Free)
Decisions	"We decided to use TypeScript because Python startup latency caused MCP timeouts."	"Use TypeScript."
Failures	"The pagination job failed because the cursor was not reset between retries."	"Pagination is broken."
Tradeoffs	"We chose client-side caching to avoid Redis dependency, accepting up to 5 minutes of stale data."	"Use client-side cache."
Experiments	"We tested BM25 vs hybrid search. Hybrid scored 23% higher on recall@5, so we made it default."	"Hybrid search works better."
Outcomes	"Restructuring the DB index reduced retrieval latency from 5.5s to 450ms."	"Looks faster now."
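
The difference between the two columns can be approximated with a toy check: the "better" statements contain a causal connective or a measured value, and the "worse" ones contain neither. The heuristic below is only a way to self-review a message before sending it. It is not how OEM extracts knowledge, and the marker list is an arbitrary assumption.

```python
import re

# Hypothetical markers that tend to signal stated reasoning or results.
CAUSAL_MARKERS = ("because", "so we", "to avoid", "accepting", "caused", "reduced")


def has_rationale(statement: str) -> bool:
    """Return True if the statement states a reason or a measured result."""
    text = statement.lower()
    has_marker = any(marker in text for marker in CAUSAL_MARKERS)
    has_number = bool(re.search(r"\d", text))
    return has_marker or has_number


examples = [
    "We decided to use TypeScript because Python startup latency caused MCP timeouts.",
    "Use TypeScript.",
    "Restructuring the DB index reduced retrieval latency from 5.5s to 450ms.",
    "Looks faster now.",
]
for statement in examples:
    print(has_rationale(statement), "-", statement)
```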

Check out the Best Practices Guide for more details.

Repository Anatomy

packages/oem-knowledge — Core RAG logic, SQLite event database, extraction services, and CLI.
plugins — Native TypeScript plugins for IDE/workstation agent integrations.
docs — Complete specifications, architecture details, and lifecycle logs.