PAR — Persistent Agent Runtime

A self-hosted cognitive runtime for AI coding agents — memory, knowledge, coordination, and self-maintenance across any provider.

Your AI agent has memory now. But it's locked to one provider. Switch from Claude to Gemini and you start from zero. Use two agents on the same project and they can't share what they've learned. And nobody's cleaning up — memories pile up until you hit a wall.

PAR is a self-hosted MCP server that gives any agent — Claude, Gemini, GPT, or local models — a persistent brain that carries across providers, builds its own knowledge graph, coordinates multi-agent workflows, and maintains itself autonomously.

What You Get

┌───────────────────────────────────────────────────────┐
│  Any AI Agent (Claude · Gemini · GPT · Local)          │
└────────────────────────┬──────────────────────────────┘
                         │ MCP Protocol (standard)
┌────────────────────────▼──────────────────────────────┐
│  PAR Runtime (your hardware)                           │
│                                                        │
│  ┌───────────┐ ┌──────────┐ ┌───────────┐ ┌────────┐ │
│  │ Memory    │ │ Projects │ │ Knowledge │ │ Key-   │ │
│  │ (11 tools)│ │ + Tasks  │ │ Graph     │ │ Value  │ │
│  │ semantic  │ │ (6 tools)│ │ (5 tools) │ │ + Snip │ │
│  └───────────┘ └──────────┘ └───────────┘ └────────┘ │
│  ┌───────────┐ ┌──────────┐ ┌───────────┐ ┌────────┐ │
│  │ Agent     │ │ Events + │ │ Skills    │ │ Data-  │ │
│  │ Fleet     │ │ Workflows│ │ Library   │ │ sets   │ │
│  │ (4 tools) │ │ (7 tools)│ │ (4 tools) │ │(4 tool)│ │
│  └───────────┘ └──────────┘ └───────────┘ └────────┘ │
│  ┌───────────┐ ┌──────────┐ ┌───────────────────────┐ │
│  │ File      │ │ System   │ │ Autonomous Maintenance│ │
│  │ Index     │ │ Health   │ │ heartbeat · consolidn │ │
│  │ (3 tools) │ │ (4 tools)│ │ retention · triggers  │ │
│  └───────────┘ └──────────┘ └───────────────────────┘ │
│                                                        │
│  Local Embeddings optional · fallback mode by default │
│  56 tools · Self-hosted · Zero cloud dependency        │
└────────────────────────────────────────────────────────┘

Built-in Memory vs. PAR

	Built-in Memory	PAR
Switch providers	Start from zero — memory locked to one vendor	Everything carries over: Claude → Gemini → GPT → local
Data ownership	Provider's cloud, provider's rules	Your hardware, your data, forever
Search	Keyword or recent history	Semantic: "why did we pick SQLite over Postgres?"
Structure	Flat text blobs	Typed: decisions, insights, handoffs — filterable, taggable, pinnable
Knowledge	None	Auto-built entity graph with relationships across all memories
Multi-agent	Each agent is alone	Shared memory, events, coordinated workflows across agents
Self-maintaining	Memories pile up until you hit limits	Auto-consolidation, retention sweep, 15-min heartbeat
Extensible	Whatever the vendor ships	56 tools — add skills, datasets, workflows, event triggers

Quick Start

1. Clone and Configure

git clone https://github.com/latentfidelity/par.git
cd par
cp .env.example .env
# Edit .env — set your data directory and a strong MCP_AUTH_TOKEN

2. Deploy

docker compose up -d --build

3. Connect Your Agent

Add to your MCP client config (Cursor, Claude Desktop, etc.):

{
  "mcpServers": {
    "par": {
      "url": "http://localhost:3100/mcp"
    }
  }
}

For shared machines or anything reachable beyond your own loopback interface, keep
MCP_AUTH_TOKEN set and configure your MCP client to send
Authorization: Bearer <token>.
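
As a rough illustration of the header format described above, the sketch below builds the headers a client request would carry. The `buildAuthHeaders` helper is hypothetical and not part of PAR; only the `Authorization: Bearer <token>` format comes from this README.

```typescript
// Hypothetical helper (not part of PAR): builds HTTP headers for a PAR request.
// Only the "Authorization: Bearer <token>" format is taken from the README.
function buildAuthHeaders(token?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const trimmed = token?.trim();
  // With no token configured, no Authorization header is sent.
  if (trimmed) {
    headers["Authorization"] = `Bearer ${trimmed}`;
  }
  return headers;
}
```

How custom headers are supplied differs between MCP clients, so check your client's documentation for the exact config key.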

4. Verify

curl http://localhost:3100/health
# → {"status":"ok","server":"par-mcp","version":"7.0.0",...}

By default Docker publishes PAR on 127.0.0.1:${MCP_PORT:-3100} only. Set
PAR_BIND_ADDR deliberately if you need LAN or tailnet exposure.

Security Defaults

Non-health endpoints require bearer auth when MCP_AUTH_TOKEN is set.
Docker Compose binds the published MCP port to loopback by default.
Docker builds use npm ci against lockfiles for reproducible dependency installs.
Semantic embeddings are disabled by default because the previous ONNX/transformers
dependency chain carried vulnerable transitive packages. PAR uses deterministic
fallback embeddings until an audited local embedding runtime is enabled with
SEMANTIC_EMBEDDINGS=true.
Dependabot is configured for npm, Docker, and GitHub Actions update coverage.
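
To make "deterministic fallback embeddings" concrete, here is a minimal sketch of one common model-free approach: hashing tokens into a fixed-size vector. This is an assumption for illustration, not PAR's actual fallback; the function names and the choice of 384 dimensions (matching all-MiniLM-L6-v2) are hypothetical.

```typescript
// Illustrative only: one way to build deterministic embeddings without a model.
// PAR's real fallback may work differently; DIM = 384 is an assumption.
const DIM = 384;

// FNV-1a style 32-bit hash over UTF-16 code units, returned as an unsigned int.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Counts each lowercase alphanumeric token in a hashed bucket, then
// L2-normalizes so cosine similarity reduces to a dot product.
function fallbackEmbed(text: string, dim: number = DIM): number[] {
  const vec = new Array<number>(dim).fill(0);
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    const idx = fnv1a(token) % dim;
    vec[idx] = (vec[idx] ?? 0) + 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vec : vec.map((v) => v / norm);
}
```

Vectors built this way only reflect token overlap, so search quality in fallback mode is closer to keyword matching than to true semantic search.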

Tools (56)

Core (2)

Tool	Description
`server_status`	Full system status with counts
`context_load`	One-shot project init (handoff + project + tasks + memories + KG)

Semantic Memory (11)

Tool	Description
`memory_store`	Store with semantic embedding (decision, insight, task, handoff, observation)
`memory_search`	Search by meaning, not keywords
`memory_search_advanced`	Compound filters (AND/OR/NOT, date ranges, tags, pinned)
`memory_log`	Chronological retrieval
`memory_timeline`	Natural language time queries ("show decisions from last week")
`memory_tag`	Add/remove/set tags on memories
`memory_pin`	Protect memories from auto-archival
`memory_unpin`	Remove pin protection
`memory_consolidate`	Cluster and distill similar memories (sleep consolidation)
`memory_retain`	Archive old memories with type-based protection
`memory_stats`	Dashboard of memory health
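
For orientation, the sketch below shows the shape of the JSON-RPC `tools/call` request an MCP client sends to invoke a tool. The envelope follows the MCP specification; the argument names (`content`, `type`, `tags`, `query`) are assumptions, not PAR's confirmed input schema, which a client can discover via `tools/list`.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds the request envelope for a named tool.
function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names below are hypothetical; check the tool's published schema.
const storeRequest = toolCall(1, "memory_store", {
  content: "We picked SQLite over Postgres for this project.",
  type: "decision",
  tags: ["database"],
});

const searchRequest = toolCall(2, "memory_search", {
  query: "why did we pick SQLite over Postgres?",
});
```

In practice your MCP client handles session setup and sends these requests for you; the sketch only shows what travels over the wire.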

Knowledge Graph (5)

Tool	Description
`knowledge_extract`	Extract entity-relationship triples from text
`knowledge_query`	Query entities and traverse relationships
`knowledge_context`	Rich context for an entity with all connections
`knowledge_merge`	Merge duplicate entities
`knowledge_ingest`	Bulk-process memories to build the graph

Project Management (6)

Tool	Description
`project_register`	Register a project
`project_list`	List all projects
`project_get`	Get project details + open tasks
`task_add`	Add a work item
`task_list`	List tasks by project/status
`task_update`	Update task fields

Agent Fleet (4)

Tool	Description
`agent_register`	Register an agent with capabilities and preferences
`agent_list`	List all agents with status
`agent_get`	Get agent details
`agent_update`	Update status, increment stats

Events & Workflows (7)

Tool	Description
`event_trigger`	Emit events, match subscribers, auto-start workflows
`event_subscribe`	Subscribe agents to event patterns
`event_log`	Query event history
`workflow_register`	Define multi-step workflows with triggers
`workflow_run`	Start or advance workflow executions
`workflow_status`	Track step-by-step progress
`workflow_list`	List registered workflows
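
The "match subscribers" step of `event_trigger` can be pictured with the sketch below. The pattern syntax is an assumption: it treats event names as dot-separated segments (as in `deploy.complete`) and lets `*` match exactly one segment. PAR's real matching rules, the `Subscription` shape, and the agent IDs are not documented here and are hypothetical.

```typescript
// Hypothetical subscription record; PAR's stored shape may differ.
interface Subscription {
  agentId: string;
  pattern: string;
}

// Assumed syntax: dot-separated segments, "*" matches exactly one segment.
function matchesPattern(pattern: string, eventName: string): boolean {
  const patternParts = pattern.split(".");
  const eventParts = eventName.split(".");
  return (
    patternParts.length === eventParts.length &&
    patternParts.every((seg, i) => seg === "*" || seg === eventParts[i])
  );
}

// Returns the IDs of agents whose pattern matches the emitted event.
function matchSubscribers(subs: Subscription[], eventName: string): string[] {
  return subs.filter((s) => matchesPattern(s.pattern, eventName)).map((s) => s.agentId);
}
```

With subscriptions for `deploy.*` and `system.heartbeat`, emitting `deploy.complete` would notify only the first subscriber under these assumed rules.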

Knowledge Store (7)

Tool	Description
`meta_store` / `meta_retrieve` / `meta_list`	Key-value storage
`snippet_save` / `snippet_search` / `snippet_get` / `snippet_update`	Code snippet management

Skills (4)

Tool	Description
`skill_list`	List all registered skills
`skill_get`	Get skill details by ID
`skill_create`	Register a new skill
`skill_run`	Execute a skill script

Datasets (4)

Tool	Description
`dataset_register`	Register a dataset with metadata
`dataset_list`	List all datasets
`dataset_get`	Get dataset details
`dataset_search`	Search by tags

Epistemic Integrity (1)

Tool	Description
`memory_audit`	Detect contradictions, staleness, and orphaned knowledge — dry_run by default

Infrastructure (5)

Tool	Description
`system_health`	One-shot infrastructure audit
`system_changelog`	Human-readable activity summary
`file_index`	Project file tree with 24h cache
`file_store` / `file_read`	Remote file storage

Architecture

PAR is written in TypeScript with strict: true and compiles to ESM. The codebase is organized into 16 focused modules:

mcp/src/
├── server.ts             # Express + MCP transport
├── schedulers.ts         # Consolidation, retention, heartbeat timers
├── seed-defaults.ts      # Default skill templates
├── types.ts              # 14 domain interfaces
│
├── lib/
│   ├── storage.ts        # JSON I/O, path security, caching
│   ├── embedder.ts       # ONNX Runtime embedding + cosine similarity
│   ├── knowledge.ts      # KG entity extraction
│   └── epistemic.ts      # Contradiction, staleness, orphan detection
│
└── tools/
    ├── context.ts        # Shared types (MemoryEntry, loadMemoryIndex)
    ├── core.ts           # server_status, meta_*, file_*
    ├── registry.ts       # project/task/snippet/skill/dataset tools
    ├── memory.ts         # 11 memory tools + audit
    ├── knowledge.ts      # 5 KG tools
    ├── agents.ts         # agent fleet, system_health, memory_consolidate
    ├── events.ts         # event/workflow tools + changelog
    └── system.ts         # context_load, file_index, memory_retain, snippet_update

Tool modules use dependency injection for shared state (memory index getter/setter closures), avoiding global mutable variables. A Proxy wrapper in server.ts instruments all tool registrations with call-count telemetry.
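
The two patterns in the paragraph above can be sketched as follows. This is a simplified illustration, not PAR's code: `FakeServer` stands in for the real MCP server object (whose API differs), and `instrument`, `registerMemoryTools`, and `createRuntime` are hypothetical names.

```typescript
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Minimal stand-in for an MCP server; the real SDK's registration API differs.
class FakeServer {
  readonly tools = new Map<string, ToolHandler>();
  tool(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
  }
}

// Proxy wrapper: intercepts tool() so every registered handler bumps a counter
// each time it is invoked. All other properties pass through untouched.
function instrument(server: FakeServer, counts: Map<string, number>): FakeServer {
  return new Proxy(server, {
    get(target, prop, receiver) {
      if (prop !== "tool") return Reflect.get(target, prop, receiver);
      return (name: string, handler: ToolHandler): void => {
        target.tool(name, (args) => {
          counts.set(name, (counts.get(name) ?? 0) + 1);
          return handler(args);
        });
      };
    },
  });
}

// Dependency injection: a tool module receives accessors, not a global.
interface MemoryDeps {
  getIndex: () => string[];
  setIndex: (next: string[]) => void;
}

function registerMemoryTools(server: FakeServer, deps: MemoryDeps): void {
  server.tool("memory_store", (args) => {
    deps.setIndex([...deps.getIndex(), String(args["content"])]);
    return deps.getIndex().length;
  });
}

// Wiring: the index lives in this closure scope and is shared via closures.
function createRuntime() {
  let index: string[] = [];
  const counts = new Map<string, number>();
  const server = instrument(new FakeServer(), counts);
  registerMemoryTools(server, {
    getIndex: () => index,
    setIndex: (next) => { index = next; },
  });
  return { server, counts, getIndex: () => index };
}
```

Because the tool module only sees `getIndex` and `setIndex`, it can be tested with an in-memory array and never needs to import shared state.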

Autonomous Maintenance

PAR maintains itself:

💓 Heartbeat — emits system.heartbeat every 15 minutes with uptime, memory count, heap usage
🌙 Memory Consolidation — every 6 hours, clusters similar memories and distills them (threshold: 200+ active)
🗑️ Retention Sweep — every 6 hours, archives memories older than 90 days (protects decisions and handoffs)
🧠 Epistemic Audit — every 6 hours, scans for semantic contradictions, stale claims, and orphaned knowledge graph entities
⚡ Store-time Warnings — memory_store checks incoming memories against top-5 similar existing memories for version/numeric conflicts
🔔 Event-Driven Workflows — deploy.complete auto-triggers post-deploy QA; maintenance.requested triggers memory maintenance
POST /trigger — HTTP endpoint for cron-based automation (no MCP session needed)
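
The retention rule above can be sketched as a pure selection function. The 90-day cutoff and the protection of decisions and handoffs come from this section, and pin protection comes from the `memory_pin` description; the `MemoryRecord` field names and the `selectForArchival` function are hypothetical, not PAR's actual schema.

```typescript
type MemoryType = "decision" | "insight" | "task" | "handoff" | "observation";

// Hypothetical record shape; PAR's stored fields may be named differently.
interface MemoryRecord {
  id: string;
  type: MemoryType;
  createdAt: number; // epoch milliseconds
  pinned: boolean;
  archived: boolean;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const PROTECTED_TYPES: ReadonlySet<MemoryType> = new Set<MemoryType>(["decision", "handoff"]);

// Returns the IDs of memories a retention sweep would archive: older than the
// cutoff, not already archived, not pinned, and not of a protected type.
function selectForArchival(memories: MemoryRecord[], now: number, maxAgeDays = 90): string[] {
  const cutoff = now - maxAgeDays * DAY_MS;
  return memories
    .filter((m) => !m.archived && !m.pinned && !PROTECTED_TYPES.has(m.type) && m.createdAt < cutoff)
    .map((m) => m.id);
}
```

Keeping the selection separate from the file writes makes a dry run straightforward: compute the IDs, report them, and archive only when asked.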

Configuration

Environment Variables

Variable	Default	Description
`MCP_PORT`	`3100`	Gateway port
`META_DIR`	`/data/meta`	Persistent storage directory
`MCP_AUTH_TOKEN`	(none)	Optional bearer token for authentication
`MCP_CORS_ORIGINS`	`localhost:3100`	Comma-separated allowed CORS origins
`PAR_BIND_ADDR`	`127.0.0.1`	Address Docker Compose binds the published port to
`SEMANTIC_EMBEDDINGS`	(disabled)	Set to `true` to enable the local semantic embedding runtime

Data Storage

All state persists in META_DIR as flat JSON files:

/data/meta/
├── memory/        # Semantic memories with embeddings
├── knowledge/     # Entity-relationship graph
├── projects/      # Project registrations
├── tasks/         # Work items
├── kv/            # Key-value pairs
├── snippets/      # Code snippets
├── skills/        # Registered skills
├── datasets/      # Dataset registry
├── agents/        # Agent registrations
├── events/        # Event log
├── workflows/     # Workflow definitions
├── workflow_runs/ # Workflow execution tracking
├── experiments/   # ML experiment logs
├── artifacts/     # Versioned artifacts
└── files/         # General file storage

Backups

Use the included backup script:

# Manual backup
./backup.sh

# Automated daily (add to crontab)
0 3 * * * /path/to/par/backup.sh >> /path/to/par/backups/cron.log 2>&1

How Memory Works

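Store: When you store a memory, PAR generates an embedding for it. With SEMANTIC_EMBEDDINGS=true this is a 384-dimensional vector from all-MiniLM-L6-v2 running locally on CPU; otherwise PAR uses its deterministic fallback embeddings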
Search: Queries are embedded and compared via cosine similarity against all stored memories
Consolidate: Similar memories are automatically clustered and distilled into summaries (like sleep consolidation)
Knowledge Graph: Entities and relationships are auto-extracted from memories, building a queryable graph
Zero cost: The embedding model runs locally — no API calls, no cloud dependency
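
The search step can be sketched as a brute-force cosine-similarity scan, which matches the "compared against all stored memories" description above. The `EmbeddedMemory` shape and the function names are assumptions for illustration, not PAR's actual code.

```typescript
// Hypothetical shape: a memory ID plus its embedding vector.
interface EmbeddedMemory {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every memory against the query and returns the topK best matches.
function searchMemories(
  query: number[],
  memories: EmbeddedMemory[],
  topK = 5,
): { id: string; score: number }[] {
  return memories
    .map((m) => ({ id: m.id, score: cosineSimilarity(query, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A linear scan like this costs one dot product per stored memory, which is cheap at thousands of memories and is the reason a dedicated vector index only matters at much larger scales.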

Limitations

PAR treats memories as orientation, not gospel. The agent should always verify stored context against the live codebase before acting on it.

Current gaps:

No memory versioning — you can't see how a decision evolved over time
In-memory index — the embedding index rebuilds on restart; works fine at thousands of memories, unclear at millions
Single node — no clustering or replication; designed for a single self-hosted machine

Roadmap

Shipped

TypeScript migration — strict mode, 16 modules, zero compilation errors
Tool modularization — 55 tools extracted from 3,594-line monolith into 7 focused groups
Discord adapters — agent fleet connects to team chat via bot framework
Dataset registry — register, search, and manage training datasets (4 tools)
Experiment tracking — log ML experiments with built-in experiment-runner skill
Snippet versioning — store, search, and update reusable code/prompt templates (4 tools)
Epistemic integrity — 3-layer detection engine (contradictions, staleness, KG orphans) with store-time warnings
Staleness detection — flag memories with outdated version claims or numeric counts superseded by newer data
Conflict detection — surface contradictory memories before the agent acts on stale context

Next

Memory versioning — track how decisions evolve across conversations
Pluggable embeddings — swap all-MiniLM-L6-v2 for larger models when hardware allows

Later

Multi-node — replicate state across machines for teams
Persistent vector index — avoid full index rebuild on restart for large memory stores
Webhook integrations — push event notifications to external services (Slack, HTTP endpoints)

Non-goals

Cloud hosting — PAR is self-hosted by design
GUI dashboard — agents are the interface
Framework lock-in — PAR uses standard MCP, not a custom SDK

License

MIT
