🌍 This README is in English (the open-source lingua franca). Running an AI agent? Ask it to read this in your language — eating our own dog food. 🐙

Octorato

an Octorato (n.) — an organic, file-native AI agent: one brain, many sealed arms. The same wall that seals a client is the wall that bills them.

the Octopus Brain Framework

Octorato is an open-source AI agent operating system: one file-native "brain" — 210+ skills and 160+ specialist agents in plain markdown under git — that one operator runs across many isolated client "arms."

📄 White paper: Octorato — An Organic, File-Native Model of Artificial Agency · 🌐 Live: dataqbs.com/octorato · 📣 Launch article · 🛠️ Built with Octorato · 📘 dataqbs on Facebook

🧑‍💻 New here? Start Here → contributing guide. Grab a good first issue, see where we're headed in the ROADMAP, or shape the architecture in the RFCs. Newcomers welcome — we credit every contributor. 🐙

An Octorato is an organic, file-native AI agent OS — and because its arms are sealed cells, it has built-in FinOps.
It is the brain consultants and small agencies need to bill clients fairly, and to land on the right side of the Gartner prediction that over 40% of agentic AI projects will be canceled by the end of 2027, with escalating cost among the stated reasons.

Honest scope: per-client cost is an estimate from local session logs at list price, attributed by repo path (a small unattributed remainder is expected). The budget halt is real code — budget-check.py exits 2 and a PreToolUse hook refuses the tool — but it arms itself only once you configure budgets.yaml. The mechanism is real; the precision is opt-in, and we say which is which.
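A minimal sketch of the attribution idea in the scope note above, assuming a hypothetical mapping from arm ids to repo roots and a made-up flat list price. The shipped logic lives in trace-hook.py and _pricing.py and may differ; the point here is only that sessions outside every known arm root land in an explicit unattributed bucket.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical arm roots and list price; placeholders, not real values.
ARM_ROOTS = {"client-a": "/home/op/projects/client-a",
             "client-b": "/home/op/projects/client-b"}
USD_PER_MTOK = 3.00  # assumed flat list price per million tokens


def arm_for(cwd: str) -> str:
    """Attribute a session to an arm by repo path, else 'unattributed'."""
    path = PurePosixPath(cwd)
    for arm, root in ARM_ROOTS.items():
        root_path = PurePosixPath(root)
        if path == root_path or root_path in path.parents:
            return arm
    return "unattributed"


def estimate_usd(sessions):
    """Sum estimated list-price cost per arm from (cwd, tokens) records."""
    totals = defaultdict(float)
    for cwd, tokens in sessions:
        totals[arm_for(cwd)] += tokens / 1_000_000 * USD_PER_MTOK
    return dict(totals)


sessions = [("/home/op/projects/client-a/src", 2_000_000),
            ("/home/op/projects/client-b", 500_000),
            ("/tmp/scratch", 100_000)]
print(estimate_usd(sessions))
```

Keeping the remainder visible as its own bucket, rather than spreading it across clients, is what lets the estimate stay honest about its own precision.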

"One brain. Sealed arms. One ledger per client — because the arm IS the ledger."

🚀 Built with Octorato — live, in production

Octorato isn't a demo — it ships real software. A few of the products this brain built and maintains (full list in the showcase):

Product	Live
Trilingual Astro/Cloudflare site + RAG chatbot	dataqbs.com
Multi-Reach — compose once, publish across 6 social channels	/multi-reach
White-label real-estate catalog w/ daily FB auto-publish	/realestate
Open Garage — commission-free marketplace, direct WhatsApp	/open-garage
AI persona bot — answers as the operator (RAG + dynamic PDF)	/carloscarrillo
Daily AI-news blog + curated news surface	/blog · /news

→ Want to build things like these? Start Here — newcomers welcome, every contributor credited. 🐙

🚀 Built with Octorato — live
Why now: the token economy is here
What makes Octorato different
What it is
FinOps roadmap
Daily Self-Growth
Why an Octopus?
Migrating from dotclaude (May 2026)
Quick Start
Architecture — CLASS / OBJECT / ARM
The 4D Paradigm — The Nervous System
4D+S — Spec-Driven Development Integration
The Corporation
The Connectome — Neural Architecture
Client Arms — Total Isolation
Org Chart — 13 Divisions, 160+ Agents
Synapses — The Skill Layer
Memory — Hippocampus and the Working Set
Reflexes — The Spinal Cord Layer
Observability — The Sensory Cortex
Enforcement Scripts
MCP Servers — The Action Space
Multi-Tool Support
Multi-Machine Sync — The Glial Layer
Repository Structure
10x Roadmap
Contributing
License

Why now: the token economy is here

The AI industry is splitting into three billing primitives: tokens (Anthropic, OpenAI), steps (AWS Bedrock AgentCore), and outcomes (Salesforce Agentforce, ~$2/conversation). Anthropic announced a June 2026 enterprise pricing shift moving Claude / Claude Code / Cowork to per-token pass-through. a16z's State of AI reports that OpenRouter crossed 100T tokens/year in 2025, with agentic workloads burning 5–30× more tokens than chatbots. BCG estimates a $200B agentic TAM in tech services and recommends outcome-based pricing for B2B SaaS. The FinOps Foundation's 2026 State of FinOps lists AI FinOps as the #1 mandate.

Enterprises need governance. Solo consultants and small agencies need it more: they have to invoice clients for the burn rather than absorb it on a runway.

Octorato is the open-source FinOps brain for that segment. Larger teams use the same primitives at higher cardinality.

What makes Octorato different

Layer	Crowded by	Octorato's wedge
Agent frameworks	LangGraph, CrewAI, AutoGen, LlamaIndex	We don't compete here — Octorato is an OS, not a framework. Bring your own.
Agent observability	LangSmith, Langfuse, Arize, Datadog LLM Obs	Complementary. Octorato emits OpenInference-style spans; sits above your observability stack as the governance layer.
FinOps for AI Agents	greenfield (Vantage, Amnic, Finout fighting for category, no Gartner MQ yet)	The only one of these that ships per-client isolation + cost ledger + budget halt as open-source files — because the arm is both the security cell and the billing line item. We don't claim to lead a quadrant; we occupy an intersection no one else does.
Compute sandboxes	e2b.dev	Complementary (arms can run in e2b sandboxes).
Operator brains / `~/.claude` distributions	ECC (`affaan-m/ECC`), dotclaude variants, claude-flow, wshobson/agents collections	Most are one bag of skills. Octorato is an OS with multi-tenant isolation: per-client arms, per-client cost ledger, per-client budget caps. We learn from peer brains daily (`repo-watch` skill) without absorbing their multi-tenancy gap.

Three things competing observability tools don't have:

Per-arm cost attribution — every trace event tags the client (arm), and skill-cost-profiler.py produces a billable cost rollup per project, per month, per skill, with USD applied via the shared _pricing.py table. Privately. On your filesystem. No SaaS dependency.
Sealed multi-tenancy — clients live in sealed repos (software-level isolation — no shared state, no cross-arm reads). The brain sees their cost data read-only; the arms never see each other. Datadog can't enforce this; LangSmith Cloud isn't designed for it.
Budget caps that actually halt agents — budget-check.py reads budgets.yaml, computes month-to-date spend per arm, and exits with code 2 when the cap is burned through. A PreToolUse hook wires that into Agent / subagent / browser tools so the operator can't accidentally torch a client's budget. CFO buy signal, not telemetry buy signal.
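The budget halt in the third point can be sketched as follows. This is an illustration under stated assumptions, not the contents of budget-check.py: the event fields, the plain dict standing in for budgets.yaml, and the choice to halt when spend reaches the cap (rather than strictly exceeds it) are all hypothetical.

```python
from datetime import date

EXIT_OK, EXIT_HALT = 0, 2  # exit code 2 is the halt signal named above


def month_to_date_spend(events, arm, today):
    """Sum estimated USD for one arm over the current calendar month."""
    return sum(e["usd"] for e in events
               if e["arm"] == arm
               and e["day"].year == today.year
               and e["day"].month == today.month)


def budget_exit_code(budgets, events, arm, today):
    """Return 2 when the arm's monthly cap is reached, else 0.

    An arm with no configured cap is never halted, mirroring the
    opt-in behaviour: no budgets entry, no enforcement.
    """
    cap = budgets.get(arm)
    if cap is None:
        return EXIT_OK
    spent = month_to_date_spend(events, arm, today)
    return EXIT_HALT if spent >= cap else EXIT_OK


# Hypothetical data standing in for budgets.yaml and the trace log.
budgets = {"client-a": 50.0}
events = [
    {"arm": "client-a", "usd": 30.0, "day": date(2026, 5, 3)},
    {"arm": "client-a", "usd": 25.0, "day": date(2026, 5, 9)},
    {"arm": "client-a", "usd": 99.0, "day": date(2026, 4, 28)},  # last month
    {"arm": "client-b", "usd": 500.0, "day": date(2026, 5, 1)},  # no cap set
]
today = date(2026, 5, 10)
print(budget_exit_code(budgets, events, "client-a", today))
print(budget_exit_code(budgets, events, "client-b", today))
```

A hook wrapper would pass the returned value to `sys.exit`, so the harness sees the non-zero code and refuses the tool call.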

FinOps is the wedge. The architecture under it is biology — because the same problem the operator faces (one consciousness, many client workspaces, no cross-contamination) is the problem an octopus solves with eight semi-autonomous arms. The cost ledger and the neural map share the same substrate: per-arm isolation.

What it is

An open-source AI agent operating system where a single human operator directs a shared brain of specialist AI agents — across clients, projects, and machines — without ever mixing their data or their bills.

With nothing but natural language, you can direct a team of AI specialists to build and ship software, and bill the client honestly when it ships.

Live framework: 210+ skills, 160+ agent personas across 13 divisions, enforcement scripts, multi-machine sync, a neural connectome that learns over time, and a FinOps pipeline that tags every trace event with the client who incurred it — with per-arm USD rollup and a PreToolUse budget halt shipped, opt-in (configure budgets.yaml to arm caps; run the anthropic-enterprise-analytics pull to reconcile estimate against billed cost). See roadmap below.

Shipped with it: live products built and maintained agent-first on this brain — see Built with Octorato.

https://github.com/CarlosCaPe/octorato

Octorato = octopus + tesseract — eight-armed brain in a 4D activation space (Agent × Skill × Arm × 4D-phase).

FinOps roadmap

Trace capture per skill / agent / phase (scripts/trace-hook.py + 8 hook points)
Daily brain digest with cost section (scripts/brain-digest.py via cron)
Skill-level cost profiler 30-day window (scripts/skill-cost-profiler.py)
SLO + watchdog infrastructure (success_rate SLI)
Per-event arm tagging (trace-hook.py reads cwd → client id)
Per-arm cost rollup + USD conversion (scripts/_pricing.py + skill-cost-profiler.py aggregates by arm, digest renders the table)
Cost-spike watchdog (watchdog.py z-score over tokens/day per skill·arm against 30d baseline; floor at 100k tokens to avoid noise)
Budget caps + PreToolUse hard-stop hook (scripts/budget-check.py reads budgets.yaml, exit 2 = halt; see finops-budget-policy)
Anthropic Enterprise Analytics API ingest (scripts/anthropic-analytics-pull.py reconciles estimated vs billed; see anthropic-enterprise-analytics)
Claude Cowork integration shape — quarantined pseudo-arm cowork-shared, never mounts a client arm directory (design). Enforcement hook deferred until Anthropic publishes the Cowork session-event API surface; Cowork billed cost is already captured today via the Admin Analytics ingest.
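The cost-spike watchdog item above describes a z-score over daily tokens against a 30-day baseline with a 100k-token floor. A rough sketch of that check, assuming the floor applies to today's volume and using a hypothetical threshold of 3 standard deviations (neither detail is confirmed by the roadmap text):

```python
from statistics import mean, pstdev

TOKEN_FLOOR = 100_000  # below this daily volume, spikes are treated as noise


def is_cost_spike(baseline, today_tokens, z_threshold=3.0):
    """Flag today's token count if its z-score against the baseline is high.

    baseline: daily token totals for one (skill, arm) pair, e.g. 30 days.
    z_threshold is a hypothetical default, not taken from watchdog.py.
    """
    if today_tokens < TOKEN_FLOOR or len(baseline) < 2:
        return False
    mu, sigma = mean(baseline), pstdev(baseline)
    if sigma == 0:
        return today_tokens > mu  # flat baseline: any rise counts as a spike
    return (today_tokens - mu) / sigma >= z_threshold


baseline = [200_000, 220_000, 180_000, 210_000, 190_000] * 6  # 30 days
print(is_cost_spike(baseline, 900_000))
print(is_cost_spike(baseline, 205_000))
print(is_cost_spike([1_000] * 30, 50_000))
```

The floor matters because a skill that normally burns a few thousand tokens can show an enormous z-score from a trivial absolute change.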

See the biology section below for why the architecture takes this shape.

Daily Self-Growth

The brain grows itself. Every day a scheduled loop scans GitHub Trending, Hacker News, and Product Hunt for new tools, runs each candidate through a deterministic brain-fit classifier plus an LLM quality gate, and auto-promotes the survivors that clear the bar into real skills — then publishes what it learned.

Discover → github-trending-curation pulls multi-source trending, dedupes against the existing connectome (TF-IDF cosine), and tags each candidate with an integration action: ADD / MERGE-WITH / REPLACE / EXTEND / SKIP. The point is harmonization, not accretion — the brain is a connected graph, not a pile of skills.
Watch peers → repo-watch is the targeted sibling of trending: a curated 7-repo daily monitor (competitors, peer brains, upstream Claude Code projects) that classifies each day's diff as HIGH / LOW / EMPTY / BASELINE signal and drops a file-based trigger into knowledge/repo-watch/triggers/ for repo-deep-learn to pick up out-of-band. Detection state ≠ action state — the cron stays fast and the analysis stays deliberate.
Decide → an LLM QA gate drops low-value noise; only net-new ADD candidates auto-apply (structural MERGE/REPLACE/EXTEND are left for human review).
Grow & publish → survivors become skills/<name>/SKILL.md, a changelog article on the public /news feed (crediting the source repo — it's a community to grow with), and a social post. Every day's decisions — added, deferred, and ignored-with-reason — are appended to a single audit ledger (knowledge/github-trending/HISTORY.md) so the operator can scroll the whole history and challenge any call.
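The Decide step and the audit ledger described above can be sketched as a small pure function. The candidate fields, verdict names, and ledger line format below are hypothetical; only the rule itself (net-new ADD auto-applies, structural actions wait for a human, every decision is logged with a reason) comes from the text.

```python
AUTO_APPLY = {"ADD"}                          # only net-new skills auto-apply
HUMAN_REVIEW = {"MERGE-WITH", "REPLACE", "EXTEND"}


def decide(candidate, passes_qa):
    """Map one tagged candidate to a ledger verdict and a reason."""
    action = candidate["action"]
    if action == "SKIP":
        return "ignored", "tagged SKIP by the classifier"
    if not passes_qa:
        return "ignored", "dropped by the QA gate"
    if action in AUTO_APPLY:
        return "added", "net-new ADD candidate"
    if action in HUMAN_REVIEW:
        return "deferred", f"structural {action} left for human review"
    return "deferred", f"unknown action {action!r}"


def ledger_lines(day, candidates, qa_results):
    """Render one audit-ledger line per candidate, including the reason."""
    lines = []
    for cand in candidates:
        verdict, reason = decide(cand, qa_results.get(cand["name"], False))
        lines.append(f"{day} | {cand['name']} | {verdict} | {reason}")
    return lines


candidates = [{"name": "tool-a", "action": "ADD"},
              {"name": "tool-b", "action": "MERGE-WITH"},
              {"name": "tool-c", "action": "ADD"}]
qa_results = {"tool-a": True, "tool-b": True, "tool-c": False}
for line in ledger_lines("2026-05-10", candidates, qa_results):
    print(line)
```

Logging the ignored items with their reason is what makes the ledger auditable after the fact instead of a list of wins.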

No daily human validation required: the AI tooling landscape moves faster than any one person can review, so the operator audits the ledger on their own cadence instead of gatekeeping every item.

Why an Octopus?

This isn't a metaphor we forced onto the software. The software emerged from studying how Octopus vulgaris actually works — and discovering that its neural architecture solves the exact problem we face with AI agents.

The Biology

An octopus has approximately 500 million neurons. For context, a dog has roughly 530 million in its cerebral cortex alone (and about 2 billion total in its brain). But here's what makes the octopus extraordinary: two-thirds of its neurons live in the arms, not the central brain.

Each arm can:

Taste and smell independently (each sucker has chemotactile receptors — van Giesen, Kilian, Allard & Bellono, Cell 2020 — work performed in Octopus bimaculoides)
Execute local reflexes and stereotyped reaching motions without consulting the brain (Sumbre et al., Science 2001 — note: isolated arms perform programmed motor patterns, not contextual decision-making)
Coordinate with the central brain for complex tasks
Operate with high autonomy from other arms (peripheral nerve cords provide some inter-arm communication, but each arm has its own local control)

Beyond the arms, the octopus has:

Chromatophores — tens of thousands of individually innervated color cells that allow real-time pattern changes in under a second
A vertical lobe — the primary learning center, where ~25 million amacrine cells converge onto ~65,000 efferent neurons (a biological dimensionality reduction system)
Autotomy — the ability to voluntarily detach an arm under threat and fully regenerate it
Extensive mRNA recoding — A-to-I RNA editing that modifies over 13,000 protein-coding sites, reshaping neural protein function in response to environmental conditions

The central brain sets high-level intent. The arms execute with local intelligence. Information flows up (arm discoveries reach the brain) and down (brain strategies reach the arms). In biology, some peripheral inter-arm communication exists — but in our software, we enforce total sideways isolation as a deliberate design choice for client data security.

The full biology-to-software mapping table (with ML-accuracy notes on each analogy) lives in wiki Architecture §9.

The 8 and the Tesseract

Two symbols sit behind the name. Both are mathematical.

The 8 → ∞. An octopus has eight arms. Rotate the 8 ninety degrees and it becomes ∞ — the lemniscate. Octorato is built for an unbounded number of sealed arms because the brain distributes only generic knowledge downward and arms never see each other. Multi-tenancy without ceiling. The 8 is symbolic; the ∞ is the engineering claim.

The Tesseract → 4D. The 4D Paradigm — Describe → Delegate → Diligent → Disclose — is named 4D on purpose. A tesseract is the 4-dimensional analog of a cube. The four phases are not sequential steps but dimensions, active simultaneously in every action. Working inside the brain is working in 4-space, and from there shaping outcomes in 3-space: the codebase, the deliverable, the invoice. The 4D is not a workflow checklist; it is the control plane.

And the 4D doesn't run once — it runs in a WHILE. Each response ends with a one-line Provenance footer (Basis · Engine · Touched · Verified): the brain sensing its own action — proprioception. Reading it is the loop condition (anything open? did what I touched match what I meant?) and the trigger of the next beat. A human can't be in ten places at once; Octorato is the vehicle that lets one operator inhabit that dimension — many sealed arms acting in parallel under one brain. The tesseract you can't perceive, Octorato lets you live in.

The metaphor and the engineering are the same thing. Full reference: skills/octorato-symbolism/SKILL.md.

Migrating from dotclaude (May 2026)

The repo was renamed from dotclaude to octorato. If the origin remote of your laptop's ~/.claude/ still points to the deleted dotclaude repo, either of these options will fix it:

Option A — automatic (run once per laptop):

bash ~/.claude/scripts/migrate-octorato.sh

Option B — manual one-liner:

git -C ~/.claude remote set-url origin https://github.com/CarlosCaPe/octorato.git

After either, ai-pull / ai-push work normally. The Windows ai-pull.ps1 / ai-push.ps1 scripts self-heal on next run — no manual step needed there once they're updated.

Quick Start

# 1. Clone the brain
git clone https://github.com/CarlosCaPe/octorato.git ~/.claude

# 2. Create your private company brain
cp -r ~/.claude/templates/company/ ~/.claude/company/
mv ~/.claude/company/COMPANY.md.template ~/.claude/company/COMPANY.md
nano ~/.claude/company/COMPANY.md

# 3. Create your first arm (client project)
mkdir -p ~/projects/my-client/.claude
cp ~/.claude/templates/arm/CLAUDE.md.template ~/projects/my-client/.claude/CLAUDE.md

# 4. Sync across machines
ai-pull    # on every workstation

See templates/ for annotated setup guides with {{PLACEHOLDERS}}.

Branching & contribution model

The brain uses staged promotion. All pull requests — contributors, day-to-day work, and bot-authored skills — target test, the integration branch where ideas are iterated and reviewed. master is the curated, public canonical and is promotion-only: it advances solely through a weekly, operator-reviewed test → master promotion (the /promote-test ritual).

PRs ─▶ test ──weekly /promote-test (reviewed)──▶ master (protected, public canonical)

Fork → branch off test → PR against test. Full guide: CONTRIBUTING.md. (The daily dataqbs.com content feed is the exception — it ships to its own repo's master daily; staging is for the brain.)

Architecture — CLASS / OBJECT / ARM

The framework uses an object-oriented inheritance model: BRAIN = CLASS (this public repo, the DNA) → COMPANY BRAIN = OBJECT (your private ~/.claude/company/, gitignored) → ARMS = PROPERTIES (isolated per-client repos that never see each other).

The brain ships the engine — 4D Paradigm, connectome, agents, skills, enforcement scripts, templates. The company brain holds your identity and arm definitions. Each arm holds one client's context, sealed.

The reactive control architecture (ECA atoms · Behavior-Tree priority · Statechart 4D · Bandit routing) that governs hook composition is in docs/architecture/hook-orchestration.md. Full CLASS/OBJECT/ARM anatomy, biology mapping, and information-flow rules: wiki Architecture.

The 4D Paradigm — The Nervous System

The 4D is not a checklist. It is the nervous system protocol — every signal in the octopus follows four phases. No exceptions.

1D Describe → state what and why before acting.
2D Delegate → search the connectome (Q1: who knows?), check for an API (Q2: MCP first), run delegate-check (Q3: who does it?).
Change Gate → present a full manifest of every file to create/modify/delete, then wait for explicit confirmation.
3D Diligent → validate with evidence (build/lint/test); FAIL means fix, not ship.
4D Disclose → scan the Impact Radius; every change radiates.

Think of it like terraform plan before terraform apply. The agent presents a Change Manifest and stops and waits. No fire-and-forget. This is a gate, not a suggestion.

The 4D runs in a WHILE — while (open work / Touched ≠ intent): 4D(). Each response ends with a Provenance footer (Basis · Engine · Touched · Verified): the brain's proprioception, and the loop condition for the next beat.
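One way to picture the WHILE loop is below. It is a sketch, not the protocol's implementation: the `Provenance` class, the `step` callback that stands in for one full Describe-to-Disclose pass, and the `max_beats` safety bound are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Provenance:
    """One-line footer: Basis, Engine, Touched, Verified."""
    basis: str
    engine: str
    touched: set = field(default_factory=set)
    verified: bool = False

    def footer(self) -> str:
        touched = ", ".join(sorted(self.touched)) or "none"
        return (f"Basis: {self.basis} · Engine: {self.engine} · "
                f"Touched: {touched} · Verified: {self.verified}")


def run_4d(intent, step, max_beats=10):
    """Repeat 4D passes until Touched matches intent and the pass verified."""
    history, touched = [], set()
    for _ in range(max_beats):
        prov = step(intent - touched)      # one pass over the open work
        touched |= prov.touched
        history.append(prov.footer())
        if touched == intent and prov.verified:
            break                          # loop condition no longer holds
    return history


def one_file_step(remaining):
    """Toy pass that touches a single remaining file and verifies it."""
    target = sorted(remaining)[:1]
    return Provenance("task spec", "sketch", set(target), verified=True)


for beat in run_4d({"a.py", "b.py"}, one_file_step):
    print(beat)
```

Reading the footer is the loop condition: the loop ends only when what was touched equals what was intended and the last pass verified.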

Full protocol, gate formats, validation matrix, enforcement scripts, and the WHILE loop: wiki The-4D-Paradigm · skills/4d-paradigm-protocol/SKILL.md.

4D+S — Spec-Driven Development Integration

For tasks above trivial complexity, the 4D integrates with a spec-driven workflow:

Score	Level	What Activates
0-2	TRIVIAL	4D only (no spec artifacts)
3-5	MEDIUM	4D + `plan.md` (task checklist feeds the Gate)
6+	LARGE	4D + full SDD: `feature.md` → `plan.md` → implement → `review.md` → archive

Complexity signals: +2 touches 4-10 files, +4 touches 10+, +2 new feature, +3 architecture decision, +5 user requests spec, +1 schema change, +1 new API.
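The scoring rule above is simple enough to sketch directly. The function and parameter names are hypothetical, and because the text lists both "4-10 files" and "10+", this sketch assumes exactly 10 files falls in the lower band.

```python
def complexity_score(files_touched=0, new_feature=False,
                     architecture_decision=False, spec_requested=False,
                     schema_change=False, new_api=False):
    """Sum the complexity signals listed above into one score."""
    score = 0
    if files_touched > 10:
        score += 4
    elif files_touched >= 4:
        score += 2          # assumption: exactly 10 files is the 4-10 band
    score += 2 if new_feature else 0
    score += 3 if architecture_decision else 0
    score += 5 if spec_requested else 0
    score += 1 if schema_change else 0
    score += 1 if new_api else 0
    return score


def level(score):
    """Map a score to the level that decides which artifacts activate."""
    if score <= 2:
        return "TRIVIAL"
    if score <= 5:
        return "MEDIUM"
    return "LARGE"


task = complexity_score(files_touched=6, new_feature=True)
print(task, level(task))
```

For example, a new feature touching six files scores 2 + 2 = 4, which lands in MEDIUM and activates plan.md alongside the 4D.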

The archived specs become institutional memory — future tasks reference past decisions.

The Corporation

                        ┌─────────────────┐
                        │   HUMAN         │
                        │   (Operator)    │
                        │   Human Gateway │
                        └────────┬────────┘
                                 │
         ┌───────────────────────▼───────────────────────┐
         │   BRAIN                                       │
         │  ~/.claude/                                   │
         │  210+ Skills                                  │
         │  160+ Agents                                  │
         │  N Client Arms                                │
         │  HOOKS: enforcement reflexes                  │
         │  (delegate · qa-merge · dimension-awareness)  │
         └───────────────────────┬───────────────────────┘
                                 │
            ┌──────┬──────┬──────┼──────┬──────┬──────┐
            ▼      ▼      ▼      ▼      ▼      ▼      ▼
         ARM 1  ARM 2  ARM 3  ARM 4  ARM 5  ARM 6  ARM N

The 3-Layer Activation Stack

Every task activates three layers simultaneously:

1. AGENT  = WHO       (persona, expertise, voice)
2. SKILL  = HOW       (technique, workflow, best practices)
3. ARM    = FOR WHOM  (client context, data, config)

Example: A client needs a database audit:

Brain activates Database Optimizer agent (WHO)
Loads explain-analyze-validation + index-creation-concurrently skills (HOW)
Operates within the client's arm context (FOR WHOM)
Result: specialist persona crafting idempotent DDL, scoped to this client only

Activation Modes

Mode	Trigger	Example
Auto	Brain detects task matches agent domain	Database query activates Database Optimizer
Manual	User says "activate [Agent Name]"	"Use Proposal Strategist for this RFP"
Combined	Agent + skills + arm context	Security Engineer + threat-model skill + client arm

The Connectome — Neural Architecture

The brain maintains a deep connectome — a real weighted graph auto-generated by reading the FULL content of every agent and skill file, vectorizing with TF-IDF, and computing cosine similarity across all pairs.

Inspired by octopus neurobiology: 500M neurons, 2/3 distributed in arms, extensive mRNA recoding that reshapes neural protein function.

  D1 (WHO)     D2 (HOW)     D3 (WHERE)           D4 (WHEN)
  ────────     ────────     ─────────             ─────────
  160+         210+         N                     4
  Neurons      Synapses     Regions               Phases
  (Agents)     (Skills)     (Arms + parallel      (4D Paradigm)
                             git-worktree
                             session dims
                             v3.1.0)

Architecture	Neuroscience	Function
Agents	Neurons	Processing units — WHO does the work
Skills	Synapses	Functional connections — HOW work gets done
Agent↔Agent	Neural Pathways	Collaboration channels — WHO works with WHO
Skill↔Skill	Skill Clusters	Capability families — related skills group
Arms	Brain Regions	Specialized areas — WHERE work happens
4D Phases	Action Potentials	Temporal signals — WHEN signals fire

Querying and generating

python3 ~/.claude/scripts/query_connectome.py query "deploy Svelte app to Cloudflare Workers"
python3 ~/.claude/scripts/generate_neural_map.py   # auto-runs on every ai-push

query_connectome.py builds a TF-IDF vector from the task, ranks every agent and skill by cosine similarity, and returns the best matches with scores — not just name/trigger matching, but full-content semantic similarity. generate_neural_map.py produces Agent↔Skill, Agent↔Agent, and Skill↔Skill weighted connections with Hebbian learning, hub detection, and gap detection. Rebuilds from scratch on every ai-push.
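To make the ranking step concrete, here is a self-contained TF-IDF plus cosine similarity sketch in pure Python. It is not the code of query_connectome.py: the tokenizer, the smoothed IDF formula (one common variant), and the three toy documents are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def weigh(toks, idf):
    """TF-IDF weights for one token list; terms unseen in the corpus are ignored."""
    tf = Counter(t for t in toks if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def build_index(docs):
    """Return (idf, vectors) for a {name: text} corpus."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return idf, {name: weigh(toks, idf) for name, toks in tokens.items()}


def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def query(task, idf, vectors, top=3):
    """Rank every document by cosine similarity to the task text."""
    q = weigh(tokenize(task), idf)
    ranked = sorted(((cosine(q, v), name) for name, v in vectors.items()),
                    reverse=True)
    return [(name, round(score, 3)) for score, name in ranked[:top]]


docs = {  # hypothetical stand-ins for agent and skill files
    "cloudflare-deploy": "deploy workers to cloudflare with wrangler",
    "svelte-components": "build svelte components and stores",
    "index-creation": "create postgres index concurrently",
}
idf, vectors = build_index(docs)
print(query("deploy Svelte app to Cloudflare Workers", idf, vectors))
```

Because the vectors come from full file content rather than names or trigger keywords, a task can match a skill whose title shares no words with the query.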

Client Arms — Total Isolation

Each arm is an isolated client project. Arms never see each other's data. Only the human operator can explicitly bridge knowledge between arms.

Information flows only vertically: generic lessons rise from arm to brain (anonymized), and brain rules and skills descend to all arms. Arms never communicate sideways; only the human operator bridges knowledge between them. Full flow rules: wiki Architecture §5.
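The flow rules above reduce to a small predicate. This sketch uses a hypothetical node naming scheme ('brain' and 'arm:<id>') and hypothetical flag names; it illustrates the policy rather than any shipped enforcement code.

```python
def flow_allowed(src, dst, anonymized=False, operator_bridge=False):
    """Return True if knowledge may move from src to dst.

    Nodes are 'brain' or 'arm:<id>' (a hypothetical naming scheme).
    """
    src_arm, dst_arm = src.startswith("arm:"), dst.startswith("arm:")
    if src_arm and dst == "brain":
        return anonymized            # lessons rise only once anonymized
    if src == "brain" and dst_arm:
        return True                  # rules and skills descend to every arm
    if src_arm and dst_arm and src != dst:
        return operator_bridge       # sideways only via the human operator
    return src == dst                # a node may always read itself


print(flow_allowed("arm:a", "brain", anonymized=True))
print(flow_allowed("arm:a", "arm:b"))
```

Making the sideways case depend on an explicit operator flag keeps the default closed: nothing crosses between arms unless a human asked for it.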

The Learning Cycle

1. ARM discovers pattern      → "This query fix reduced seq scans 10x"
2. HUMAN approves capture     → "Yes, make it a skill"
3. BRAIN stores as skill      → ~/.claude/skills/explain-analyze-validation/SKILL.md
                                 (anonymized: no client name, no table names, no data)
4. BRAIN distributes to ALL   → ai-push / sync-ai-docs
5. OTHER ARMS benefit         → Next project loads the skill automatically

Org Chart — 13 Divisions, 160+ Agents

graph TB
    classDef ceo fill:#0D1117,stroke:#58A6FF,stroke-width:3px,color:#C9D1D9
    classDef brain fill:#161B22,stroke:#8B949E,stroke-width:2px,color:#C9D1D9
    classDef div fill:#21262D,stroke:#30363D,stroke-width:1px,color:#C9D1D9,font-size:12px

    CEO["Human Operator"]:::ceo
    BRAIN["BRAIN — 210+ Skills · 160+ specialist agents · N Arms"]:::brain
    CEO --> BRAIN

    BRAIN --> ENG["Engineering — 28"]:::div
    BRAIN --> DES["Design — 8"]:::div
    BRAIN --> MKT["Marketing — 30"]:::div
    BRAIN --> SAL["Sales — 8"]:::div
    BRAIN --> PRD["Product — 5"]:::div
    BRAIN --> PM["Project Mgmt — 6"]:::div
    BRAIN --> TST["Testing — 8"]:::div
    BRAIN --> SUP["Support — 7"]:::div
    BRAIN --> SPC["Specialized — 29"]:::div
    BRAIN --> XR["Spatial — 6"]:::div
    BRAIN --> GMD["Game Dev — 20"]:::div
    BRAIN --> ACD["Academic — 5"]:::div
    BRAIN --> PMA["Paid Media — 7"]:::div

Full roster with per-agent triggers and specialties: GitHub wiki — Agents · agents/REGISTRY.md.

Synapses — The Skill Layer

If agents are neurons, skills are synapses: the connection that makes a neuron useful for a specific task. A neuron in isolation does nothing. A neuron whose synapses know index-creation-concurrently and explain-analyze-validation becomes a database optimization specialist.

Skills are loaded on demand via Q1 (TF-IDF cosine similarity) or Q3 (keyword trigger match), injected into the agent's context for the duration of the task, and then either reinforce or decay the agent↔skill edge in the connectome based on 3D Diligent outcome. Lifecycle: ADD (pattern appears 3+ times) → MERGE (two skills converge) → REPLACE (better technique found) → EXTEND (new engine/variant). The full cycle — birth, Hebbian reinforcement, decay (~69-day half-life), failure penalty, pruning, rebirth — is in wiki Skills-System. The full catalog: wiki Skills.
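The reinforce-or-decay behaviour of an agent↔skill edge can be sketched with exponential decay. Only the ~69-day half-life comes from the text above; the update rule, the step sizes, and the clamping to [0, 1] are assumptions made for illustration. A half-life of 69 days corresponds to a decay rate of ln 2 / 69, roughly 0.01 per day.

```python
import math

HALF_LIFE_DAYS = 69.0                       # the ~69-day half-life cited above
DECAY_RATE = math.log(2) / HALF_LIFE_DAYS   # about 0.01 per day


def decay(weight, idle_days):
    """Exponentially decay an edge weight that has not fired for idle_days."""
    return weight * math.exp(-DECAY_RATE * idle_days)


def update(weight, passed, reward=0.1, penalty=0.2):
    """Reinforce on a 3D Diligent PASS, penalize on FAIL, clamp to [0, 1].

    reward and penalty are hypothetical step sizes.
    """
    delta = reward * (1.0 - weight) if passed else -penalty * weight
    return min(1.0, max(0.0, weight + delta))


w = update(0.5, passed=True)   # reinforced toward 1.0
w = decay(w, idle_days=69)     # one half-life of silence halves it
print(round(w, 3))
```

With these rules an unused edge fades gradually, while a failed validation costs more than a single success earns back.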

python3 ~/.claude/scripts/query_connectome.py query "<task>"   # which skills fire
python3 ~/.claude/scripts/query_connectome.py gods 15          # hub skills
python3 ~/.claude/scripts/query_connectome.py communities      # skill clusters

Memory — Hippocampus and the Working Set

Three layers: Constitutional (CLAUDE.md — always loaded, 4D rules + reflexes) · Episodic (~/.claude/projects/<cwd>/memory/ — persists across sessions, gitignored, per-machine) · Working (the current context window, cleared on /clear).

Memory is two-tier by scope: generic cross-arm lessons and operator preferences go into a private standalone brain-memory repo (1 + N octopus brains — one central + one per arm). Arm memory is sealed in that arm's own repo. The public framework ships the engine (scripts/memory_sync.py, templates/memory/), never anyone's actual data. Full model: docs/architecture/memory-model.md.

Reflexes — The Spinal Cord Layer

v3.1.0 "Reflexes" — the brain moved from sensing itself (3.0 Proprioception) to enforcing itself. Principles that were advisory prose became involuntary hooks wired at the harness level: they fire whether the model chooses to or not.

Not every behavior needs to go through the cortex. Some are too universal, too fast, too necessary to delegate. The spinal cord handles them: hand pulled from a hot surface, knee jerk, breathing rhythm. No conscious decision, no committee.

The framework has two reflex sub-layers: Tier A cognitive reflexes (constitutional, loaded at session start) and enforcement hooks (harness-level, fire on specific tool events regardless of the model's intent).

Tier A — Cognitive reflexes (6, constitutional)

These fire automatically on every non-trivial task without the agent having to decide:

Reflex	Stimulus	Response
`workspace-skill-discovery`	Session starts in an arm	Load arm-local `.claude/skills/` alongside global skills
`session-memory-search`	About to re-solve a problem	Check git log + grep + Lessons Learned — did we do this before?
`progressive-code-exploration`	About to read a file >100 lines	Default to index-first, fetch-on-demand — 4–8x token savings
`token-efficient-prompting`	Drafting any response	Compact tables, no preamble, no filler
`post-check-verification`	About to declare "done"	Declare done only after a verify step (build/lint/test/grep), never straight after a write
`dry-run-gate-pattern`	About to do something destructive	Preview/dry-run first; live execution requires explicit opt-in

Enforcement hooks — v3.1.0 additions (3, harness-wired)

These are hooks.json entries that the harness evaluates on every matching tool event. The model cannot skip them.

Hook	Event	Coupling	What it enforces
`delegate-gate` (`scripts/delegate-gate.py`)	PreToolUse	Fail-open	Nudges substantive/batchable work toward the cheapest sufficient model tier (Haiku/Sonnet/Opus); never blocks a turn on failure
`qa-merge-gate` (`scripts/qa-merge-gate.py`)	PreToolUse	Fail-closed	Blocks publish-to-main unless an operator approval the agent provably cannot self-grant is present (`OCTO_MERGE_APPROVE=<pr>` env or `octo-dim approve-merge <pr>`); detection is command-boundary-anchored so it gates real invocations, not quoted mentions
`dimension-awareness-hook` (`scripts/dimension-awareness-hook.py`)	PreToolUse	Fail-open	Warns when other live sessions share the working tree; surfaces the collision risk before a write, never after
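The Coupling column is the key design choice in this table, so here is a sketch of what fail-open versus fail-closed means when the gate itself crashes. The exit-code convention (2 refuses the tool) follows the budget hook described earlier in this README; the function names and the exact approval comparison are hypothetical.

```python
import os

ALLOW, BLOCK = 0, 2  # assumed convention: exit code 2 refuses the tool


def run_gate(check, fail_closed):
    """Run a gate check and decide what happens if the check itself crashes."""
    try:
        return ALLOW if check() else BLOCK
    except Exception:
        # Fail-closed gates block on internal errors; fail-open gates let
        # the turn continue so a broken hook never stalls the operator.
        return BLOCK if fail_closed else ALLOW


def merge_approved(pr, env=None):
    """True only if the operator exported approval for exactly this PR."""
    env = os.environ if env is None else env
    return env.get("OCTO_MERGE_APPROVE") == str(pr)


def broken_check():
    raise RuntimeError("hook crashed")


print(run_gate(lambda: merge_approved(42, {"OCTO_MERGE_APPROVE": "42"}), True))
print(run_gate(broken_check, fail_closed=True))
print(run_gate(broken_check, fail_closed=False))
```

An advisory nudge can afford to fail open, whereas a publish gate cannot: if the approval check is unable to run, the safe answer is to refuse.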

Connector verdict enforcement. The 2D Delegate verdict is now inverted by default: SELF is the rare exception, CONNECT is the default. The delegate-gate hook reinforces this at the harness level — answering "from my own knowledge" on a task that has a skill or agent match is caught before the tool fires.

4D Session dimensions — v3.1.0

The brain can run as one session-id across N parallel isolated git worktrees, each reconciled into one .git. This is the octopus superpower applied to time: many arms acting in parallel under one brain, without collision.

scripts/octo-dim.py manages the blackboard registry (connectome/sessions.json, gitignored):

octo-dim worktree-init          # fork a new isolated dimension (new worktree + session id)
octo-dim list                   # show all live dimensions on this machine
octo-dim heartbeat              # signal this dimension is still alive
octo-dim approve-merge <pr>     # grant the qa-merge-gate approval (operator-only)
octo-dim prune                  # remove stale dimension entries

The architecture spec for all three enforcement hooks, the ECA atom formalism, Behavior Tree priority, and the Statechart 4D phase machine lives in docs/architecture/hook-orchestration.md.
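As a rough picture of the blackboard registry behind the heartbeat, prune, and collision warning described in this section, consider the sketch below. The entry fields, the 10-minute staleness window, and the function names are hypothetical; the real layout of connectome/sessions.json is not shown in this README.

```python
import time

STALE_AFTER_S = 600  # hypothetical: a dimension is stale after 10 silent minutes


def heartbeat(registry, session_id, worktree, now=None):
    """Record that a dimension is alive; creates the entry on first beat."""
    registry[session_id] = {"worktree": worktree,
                            "last_seen": time.time() if now is None else now}


def prune(registry, now=None):
    """Drop entries whose last heartbeat is older than STALE_AFTER_S."""
    now = time.time() if now is None else now
    stale = [sid for sid, entry in registry.items()
             if now - entry["last_seen"] > STALE_AFTER_S]
    for sid in stale:
        del registry[sid]
    return stale


def collisions(registry, worktree, me):
    """Other live sessions sharing this working tree (the warning case)."""
    return [sid for sid, entry in registry.items()
            if entry["worktree"] == worktree and sid != me]


registry = {}
heartbeat(registry, "dim-1", "/repo/wt-a", now=1000)
heartbeat(registry, "dim-2", "/repo/wt-a", now=1500)
print(collisions(registry, "/repo/wt-a", me="dim-2"))
print(prune(registry, now=1700))
```

Pruning before checking for collisions avoids warning about sessions that died without cleaning up after themselves.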

Reflexes live in CLAUDE.md (constitutional, loaded before any task), not in skills/ (opt-in). Six constitutional rules + three harness hooks govern thousands of decisions downstream.

Observability — The Sensory Cortex

The brain observes itself acting. Every skill activation, subagent spawn, and 4D phase boundary is captured as a structured JSONL event (schemas/trace-event.schema.json) in ~/.claude/traces/YYYY-MM-DD.jsonl (gitignored, 30-day retention). The trace feeds back into the Hebbian connectome via update_neural_activity.py.
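The storage pattern described above (one JSONL file per day, 30-day retention) can be sketched as follows. The event fields are illustrative only; the authoritative shape is schemas/trace-event.schema.json, which this sketch does not reproduce. The demo writes to a temporary directory rather than ~/.claude/traces.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

RETENTION_DAYS = 30


def append_event(trace_dir, event, now=None):
    """Append one JSON object as a line to the day's YYYY-MM-DD.jsonl file."""
    now = now or datetime.now(timezone.utc)
    trace_dir.mkdir(parents=True, exist_ok=True)
    path = trace_dir / f"{now:%Y-%m-%d}.jsonl"
    record = {"ts": now.isoformat(), **event}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


def prune_old(trace_dir, now=None):
    """Delete daily files older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=RETENTION_DAYS)).date()
    removed = []
    for path in trace_dir.glob("*.jsonl"):
        if datetime.strptime(path.stem, "%Y-%m-%d").date() < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


demo_dir = Path(tempfile.mkdtemp()) / "traces"
day = datetime(2026, 5, 10, 12, 0, tzinfo=timezone.utc)
append_event(demo_dir, {"event": "skill_activation", "name": "demo",
                        "arm": "client-a"}, day)
append_event(demo_dir, {"event": "phase_boundary", "name": "3D"},
             day - timedelta(days=45))
print(prune_old(demo_dir, day))
```

Append-only daily files keep writes cheap inside a hook and make retention a matter of deleting whole files.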

Eight shipped surfaces:

#	Surface	Script
1	Agent Trace (APM-style)	`trace-hook.py` · `brain-trace.py` · `update_neural_activity.py`
2	Skill Cost Profiler	`skill-cost-profiler.py`
3	Brain SLOs + Error Budget	`slos.py`
4	Watchdog (cliff + quality-drop)	`watchdog.py`
5	Brain Digest (daily dashboard)	`brain-digest.py`
6	Incident Capture (post-mortems)	`incident-capture.py`
7	Brain Synthetics (per-arm health)	`arm-synthetics-runner.py`
8	Brain Charts on Demand	`brain-chart.py`

brain-trace.py grep --event phase_boundary --since 1h   # filter traces
brain-trace.py top  --by name --window 7d               # top skills/agents
brain-trace.py tail -n 20 -f                            # live tail

Full schema, storage layout, and cron setup: docs/architecture/trace-storage.md.

Enforcement Scripts

These are not optional helpers. They are the nervous system's enforcement layer — scripts that the agent runs at specific gates to ensure the 4D protocol is followed.

Script	When It Runs	What It Does
`delegate-check`	Start of every task	Parses REGISTRY.md + skills, finds matching agents and skills, outputs ACTIVATE/LOAD/SELF
`query_connectome.py`	Start of every task	TF-IDF cosine similarity against stored document vectors, ranks by semantic match
`gate-check`	Before any file write	Validates that Describe + Delegate phases completed before allowing writes
`generate_neural_map.py`	On every `ai-push`	Rebuilds the full connectome from all agent/skill content
`merge-hooks.py`	On every `ai-pull`	Syncs shared hooks into local settings, validates script targets exist
`eye-check.py`	On every user prompt	Detects web-related tasks, injects browser automation context
`check-generic.py`	Every git commit (pre-commit + commit-msg hooks)	Scans staged files + commit message against `company/brain-blocklist.txt`; hard-blocks commits that leak arm codes, client names, or internal tokens
`check-readme-sync.sh`	Every git commit (pre-commit hook)	Soft-blocks (prompts y/N) when `skills/`, `agents/`, or `scripts/` change but `README.md` is not also staged. Won't break automation (passes through when no TTY)
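
To make the TF-IDF cosine step in the table above concrete, here is a minimal sketch of that technique in plain Python. It is not the implementation in query_connectome.py: the tokenizer, the smoothed IDF formula, and every function name are assumptions chosen for the example, and stopword handling is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (idf, vectors) for a {name: text} mapping."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(docs)
    df = Counter(term for toks in tokens.values() for term in set(toks))
    # Smoothed IDF so that no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {name: tfidf_vector(toks, idf) for name, toks in tokens.items()}
    return idf, vectors


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the index are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, idf, vectors):
    """Rank document names by similarity to the query, best first."""
    q = tfidf_vector(tokenize(query), idf)
    scored = [(name, cosine(q, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Because the match is on shared terms, a query only scores against documents that use the same words, which is the limitation the roadmap's reranking items aim to address.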

MCP Servers — The Action Space

The brain talks to the outside world through Model Context Protocol servers. MCP is not a fallback when there's no API — it is the action space of the agent. Agents are the policies that decide what to do; skills are the manuals that teach how; MCP servers are the typed, schema-validated tools the agent actually calls.

Query → Connectome (routing) → Agent persona → Skill (manual) → MCP tool call → Tool response
                                                                                       │
                                       ┌──────────────── Reflection ←──────────────────┘
                                       ▼
                              Reward (3D Diligent PASS/FAIL) → Hebbian log → next routing

Where MCP fits in the 4D paradigm

4D phase	MCP role
1D Describe	If the task names a system (Gmail, Linear, Cloudflare), declare which MCP servers will be used.
2D Delegate (Q2)	"Does it have an API?" is MCP-first: prefer an MCP tool over scraping. The capability order is MCP > REST > SDK > scrape; REST is cheaper in tokens, so the agent chooses based on whether typed schema, auth, or persistence matter for this call.
3D Diligent	Validate via the same MCP server: the response shape is the test.
4D Disclose	Impact Radius includes external state — what got written to Linear / sent through Gmail / deployed to Cloudflare.

Server registry & secret management

Layer	File	Synced?	Contains
Global config	`~/.claude/mcp/servers.json` (P2 roadmap — not yet in repo)	Planned	Server `{id, transport, command
Per-arm override	`<arm>/.claude/mcp/servers.local.json`	No (arm-local)	Client-specific MCP endpoints that must not leak across arms
Secrets	`~/.config/octorato/secrets.env` (chmod 600) or system keychain	No — never synced	Tokens, API keys, OAuth refresh — resolved at startup by `env_refs[]`
Capability cache	`~/.claude/mcp/capabilities/<server_id>.json`	Planned	Tool manifest fetched at connect time, dated

Secret resolution order: env var → user keychain (security/secret-tool/wincred) → company vault → fail closed (never prompt mid-task).
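
The resolution order above can be sketched as a simple fall-through chain. This is an illustration only: `resolve_secret`, `SecretNotFound`, and the idea of passing the keychain and vault as lookup callables are assumptions made for the example, not the framework's API.

```python
from typing import Callable, Mapping, Optional, Tuple

# A lookup takes a secret name and returns its value, or None if absent.
Lookup = Callable[[str], Optional[str]]


class SecretNotFound(LookupError):
    """Raised instead of prompting when no source holds the secret."""


def resolve_secret(
    name: str,
    environ: Mapping[str, str],
    keychain: Lookup,
    vault: Lookup,
) -> Tuple[str, str]:
    """Return (value, source) from the first source that has the secret."""
    sources = (("env", environ.get), ("keychain", keychain), ("vault", vault))
    for label, lookup in sources:
        value = lookup(name)
        if value:
            return value, label
    # Fail closed: never fall back to an interactive prompt mid-task.
    raise SecretNotFound(f"{name}: not found in env, keychain, or vault")
```

In real use `environ` would be `os.environ` and the two callables would wrap the platform keychain and the company vault; raising here is what keeps a missing token from turning into a prompt halfway through a task.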

Per-arm isolation parity: MCP follows the same arm-isolation rule as everything else. An arm's MCP config never leaks into the global registry. Arm-to-arm MCP sharing requires explicit human action — same as code-level arm isolation.

Currently common MCP servers in this brain

Server	Used for	Skill that loads it
Cloudflare Developer Platform	Workers, D1, R2, KV, Hyperdrive ops	`cloudflare-deploy`
Gmail	Drafts, threads, labels (operator's mailbox)	`notion-research-documentation` (when source is mail)
Google Calendar	Event read/write, scheduling	`notion-meeting-intelligence`
Google Drive	File search + content read	`notion-research-documentation`
Microsoft Learn	Official Azure/.NET docs lookup	`aspnet-core`, `winui-app`
Notion	Doc create/update, knowledge capture	`notion-knowledge-capture`, `notion-spec-to-implementation`
Linear	Issue read/update, project tracking	`linear`
Sentry	Production error inspection	`sentry`
Figma	Design context, node-to-code	`figma`, `figma-implement-design`

MCP as a routing signal (roadmap)

Today MCP servers are not first-class neurons in the connectome — Q2 is a mental check, not a graph query. The roadmap (P2) treats every MCP tool as a mcp_tool node alongside agents and skills:

query_connectome.py query "send slack message" → returns mcp_tool: slack-send (score 0.94)
Operator's situated state (active Linear issue, next Calendar event, recent Drive files) fuses with the query vector, so retrieval becomes context-aware without the operator typing the context

This is the path the framework is on — see "10x Roadmap" below.

Adding a new MCP server

mcp/servers.json is P2 roadmap — the mcp/ directory does not yet exist in the repo. Today, MCP servers are configured directly in Claude Code's settings. Steps below describe the planned workflow once P2 lands.

Add the server to ~/.claude/mcp/servers.json with env_refs pointing to your secret names (no values).
Put the actual secrets in ~/.config/octorato/secrets.env (chmod 600, gitignored).
Run ai-push — the server config syncs; the secrets do not.
On other machines, ai-pull brings the config; add the matching secrets locally.
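
Since this workflow is still planned, the following is only a sketch of the split it describes: the synced config carries secret names, and the local secrets file carries the values. The entry shape, the `linear-mcp` command, and both helper functions are hypothetical; only the `id`, `transport`, `command`, and `env_refs` field names come from the tables above.

```python
def parse_secrets_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    secrets = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        secrets[key.strip()] = value.strip().strip('"')
    return secrets


def missing_env_refs(server, secrets):
    """Return the env_refs names that have no value in the local secrets."""
    return [ref for ref in server.get("env_refs", []) if not secrets.get(ref)]


# Hypothetical server entry: names only, never values, so it is safe to sync.
EXAMPLE_SERVER = {
    "id": "linear",
    "transport": "stdio",
    "command": "linear-mcp",
    "env_refs": ["LINEAR_API_KEY", "LINEAR_TEAM_ID"],
}
```

On a freshly pulled machine, a non-empty result from `missing_env_refs` would be the cue to add the matching secrets locally before the server is started.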

Multi-Tool Support

The brain works simultaneously with three AI coding assistants:

Tool	Config File	Synced By
Claude Code	`.claude/CLAUDE.md`	Source of truth (edit here)
GitHub Copilot	`.github/copilot-instructions.md`	Auto-copied by `sync-ai-docs`
Cursor	`.cursorrules`	Auto-copied by `sync-ai-docs`

One file to maintain. Three tools stay in sync.

sync-ai-docs          # Sync all arms
sync-ai-docs my-client  # Sync one arm
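
Conceptually, the sync is a one-way copy from the source of truth to the other two tool configs. The sketch below shows that idea for a single arm directory; it is not the sync-ai-docs script, and the `sync_arm` function and its error handling are assumptions for illustration.

```python
import shutil
from pathlib import Path

# Targets taken from the table above; Claude Code's file is the source.
SOURCE = Path(".claude") / "CLAUDE.md"
TARGETS = (Path(".github") / "copilot-instructions.md", Path(".cursorrules"))


def sync_arm(arm_dir: Path) -> list:
    """Copy the arm's CLAUDE.md over the Copilot and Cursor config files."""
    source = arm_dir / SOURCE
    if not source.is_file():
        raise FileNotFoundError(f"no source of truth at {source}")
    written = []
    for rel in TARGETS:
        target = arm_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)  # overwrite: edits belong in CLAUDE.md
        written.append(target)
    return written
```

Overwriting the targets unconditionally is what makes "edit here" true: any change made directly in the copied files is lost on the next sync.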

Multi-Machine Sync — The Glial Layer

In real brains, glial cells are roughly as numerous as neurons (about 1:1) and do the unsexy work: shuttling nutrients, insulating axons, cleaning up waste, keeping the neurons alive. They don't fire signals themselves; they make signal-firing possible.

The framework's glial layer is the sync + hooks infrastructure: ai-push, ai-pull, sync-ai-docs, install-git-hooks.sh, merge-hooks.py, check-generic.py, check-readme-sync.sh. None of these are agents. None are skills. They don't show up in the connectome. But every agent and skill depends on them being alive: distributing the brain to all workstations, enforcing the generic-leak guard, keeping arm CLAUDE.mds in sync.

The brain is a git repo. The glia are what make it portable.

# Push brain changes (primary machine)
ai-push "added skill: playwright"

# Pull latest brain (any other machine)
ai-pull

# Check if updates available
ai-pull --status

Repository Structure

~/.claude/
├── CLAUDE.md                ← Global rules (The Octopus Constitution)
├── README.md                ← You are here
├── LICENSE                  ← MIT
├── CONTRIBUTING.md          ← How to add agents, skills, contribute
├── HEBBIAN_LEARNING.md      ← How the connectome learns over time
├── hooks.json               ← Shared hooks (source of truth, synced to all machines)
├── neural_map.json          ← The Deep Connectome (auto-generated, never edit)
├── agents/                  ← 160+ specialist agents
│   ├── REGISTRY.md          ← Auto-activation triggers & cross-references
│   ├── engineering/         ← 28 agents
│   ├── design/              ← 8 agents
│   ├── marketing/           ← 30 agents
│   ├── sales/               ← 8 agents
│   ├── product/             ← 5 agents
│   ├── project-management/  ← 6 agents
│   ├── testing/             ← 8 agents
│   ├── support/             ← 7 agents
│   ├── specialized/         ← 29 agents
│   ├── spatial-computing/   ← 6 agents
│   ├── game-development/    ← 20 agents
│   ├── academic/            ← 5 agents
│   ├── paid-media/          ← 7 agents
│   ├── strategy/            ← NEXUS orchestration playbooks and runbooks
│   └── examples/            ← Multi-agent workflow examples
├── skills/                  ← 210+ reusable techniques
├── scripts/
│   ├── ai_sync.py                 ← Multi-machine sync engine (pull/push/sync/status verbs)
│   ├── generate_neural_map.py     ← Connectome generator (TF-IDF + cosine + Hebbian)
│   ├── query_connectome.py        ← Suction cups — graph search for agent/skill matching
│   ├── delegate-check             ← 2D pre-research gate
│   ├── gate-check                 ← 4D change gate enforcement
│   ├── merge-hooks.py             ← Hook sync with script-exists validation
│   ├── eye-check.py               ← Browser automation detector
│   ├── trace-hook.py              ← Observability capture hook (trace events)
│   ├── brain-trace.py             ← Observability query CLI (grep / top / tail)
│   ├── brain-chart.py             ← Observability charts on demand (ASCII / SVG)
│   ├── brain-digest.py            ← Daily aggregator report
│   ├── watchdog.py                ← Anomaly detector (cliff + quality drops)
│   ├── slos.py                    ← SLO evaluator + error-budget burn rate
│   ├── skill-cost-profiler.py     ← Per-skill token cost ranking
│   ├── incident-capture.py        ← Structured post-mortem writer
│   ├── arm-synthetics-runner.py   ← Per-arm health-check probe runner
│   ├── _brain_obs.py              ← Shared library for the 10 obs scripts (private)
│   ├── update_neural_activity.py  ← Hebbian update from trace co-activations
│   ├── scan-external-refs         ← Scan for external URL references
│   ├── delegate-gate.py           ← v3.1 hook: model-tier routing nudge (fail-open)
│   ├── qa-merge-gate.py           ← v3.1 hook: operator-approval gate for merges (fail-closed)
│   ├── dimension-awareness-hook.py← v3.1 hook: shared-worktree collision warning (fail-open)
│   ├── octo-dim.py                ← v3.1: session-dimension registry (worktree-init / list / approve-merge)
│   ├── ai-push.ps1                ← PowerShell variant for Windows
│   ├── ai-pull.ps1                ← PowerShell variant for Windows
│   └── sync-ai-docs.ps1           ← PowerShell variant for Windows
├── schemas/                  ← JSON schemas for structured artifacts
│   ├── trace-event.schema.json ← Trace event contract (v1.0, strict)
│   └── tests/trace-samples/    ← 4 validating sample records
├── docs/                     ← Architecture + design docs
│   ├── architecture/
│   │   └── hook-orchestration.md ← v3.1 Reactive Control Architecture spec (ECA · BT · Statechart · Bandit)
│   └── trace-storage.md        ← Trace storage layout + retention + backup
├── traces/                   ← (gitignored) Per-UTC-day JSONL trace files
├── commands/                ← Slash command definitions
├── templates/
│   ├── company/             ← Template for your private company brain
│   ├── arm/                 ← Template for new client projects
│   └── skill/               ← Template for new skills
└── company/                 ← YOUR private brain (gitignored, never committed)
    ├── COMPANY.md           ← Your identity, arms, connections
    ├── skills/              ← Your company-specific skills
    ├── assets/              ← Your signatures, logos, etc.
    └── config/              ← Your arm definitions, connection registry

10x Roadmap

The framework is structurally sound but its retrieval and learning loop are 2012-era. A 6-discipline independent review (Data Architecture, Python, Cephalopod Neuroscience, Applied Mathematics, Data Science, Neural Networks) converged on three families of upgrades. Numbers are estimated lifts — they become measurements once the eval framework is in place.

Now shipping — P0 (correctness + clarity)

Hebbian noise sink fix — query_connectome.py was logging ~99 nodes / ~4,800 co-activation pairs per query, collapsing the learning signal to "everything connects to everything". Capped at top-5 by score.
Atomic writes + flock on neural_activity.json. No more silent loss under concurrent queries.
Reward loop closed — gate-check --phase diligent PASS|FAIL writes back to neural_activity.json, so the negative-weight infrastructure in generate_neural_map.py finally receives signal. Until now, 100% of sessions were logged success=true — dead branch.
Fail-closed fuzzy match — ambiguous node lookups no longer silently pick candidates[0]. They return None after surfacing similarity-ranked options.
Stopword consolidation (EN + ES) — index-time and query-time tokenization now share the same STOP_WORDS (including Spanish), so the same prompt produces the same vector.
Description-extractor regex — no more "## Quick Reference" being captured as a skill description.
UTF-8 encoding everywhere — Windows-safe; emojis/Spanish no longer crash merge-hooks.py.
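
The fail-closed fuzzy match in the list above can be illustrated with the standard library's `difflib`. This is a sketch of the behaviour described, not the code in query_connectome.py: the `resolve_node` name, the 0.6 threshold, and the choice to accept a single unambiguous candidate are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Sequence, Tuple


def resolve_node(
    query: str, nodes: Sequence[str], threshold: float = 0.6
) -> Tuple[Optional[str], List[Tuple[str, float]]]:
    """Return (match, ranked_options); match is None when ambiguous."""
    if query in nodes:
        return query, [(query, 1.0)]
    scored = [
        (node, SequenceMatcher(None, query.lower(), node.lower()).ratio())
        for node in nodes
    ]
    options = sorted(
        (pair for pair in scored if pair[1] >= threshold),
        key=lambda pair: -pair[1],
    )
    if len(options) == 1:
        return options[0][0], options
    # Zero or several plausible candidates: refuse to guess, but surface
    # the similarity-ranked options so the caller can ask or retry.
    return None, options
```

Returning the ranked options alongside `None` keeps the failure useful: the caller sees what was close without the lookup silently committing to the first candidate.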

Next — P1 (measurement + retrieval quality)

Move	Expected lift	Source
Build a labeled eval set from `REGISTRY.md` triggers (silver labels — no manual annotation needed)	Converts every later change from belief to delta	DS review
TF-IDF → BM25 (k1=1.2, b=0.75, smoothed IDF)	MRR +0.10–0.15	Math review
Cross-encoder rerank (`bge-reranker-base`, CPU, top-20)	MRR +0.20, P@1 +0.25	DS review (single biggest win)
Reciprocal Rank Fusion between cosine and `delegate-check`	MRR +0.05–0.10, Q1↔Q3 agreement 70% → 90%	Math review
Bayesian Beta-Bernoulli Hebbian — replaces the two divergent boost formulas	Stability + principled cold-start	Math review
`pyproject.toml` + pytest + CI — `pipx install octopus-brain` becomes one line	Distribution, regression nets	Python review
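
For the TF-IDF to BM25 row, a minimal sketch of Okapi BM25 scoring with the quoted parameters (k1=1.2, b=0.75) and a smoothed IDF of the form ln(1 + (N - n + 0.5) / (n + 0.5)) might look like the following. The class name, tokenizer, and tie-breaking are assumptions for the example, and nothing here reflects how the project will actually implement the change.

```python
import math
import re
from collections import Counter

K1, B = 1.2, 0.75  # parameters quoted in the roadmap row


def tokenize(text):
    """Lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Tiny in-memory BM25 index over a {name: text} mapping."""

    def __init__(self, docs):
        self.tf = {name: Counter(tokenize(text)) for name, text in docs.items()}
        self.length = {name: sum(tf.values()) for name, tf in self.tf.items()}
        self.avg_len = sum(self.length.values()) / max(len(docs), 1)
        n = len(docs)
        df = Counter(term for tf in self.tf.values() for term in tf)
        # Smoothed IDF: always positive, even for terms in every document.
        self.idf = {
            t: math.log(1.0 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()
        }

    def score(self, query, name):
        """BM25 score of one document for the query."""
        tf = self.tf[name]
        # Length normalisation: longer-than-average documents are damped.
        norm = K1 * (1.0 - B + B * self.length[name] / (self.avg_len or 1.0))
        return sum(
            self.idf[t] * tf[t] * (K1 + 1.0) / (tf[t] + norm)
            for t in tokenize(query)
            if t in tf
        )

    def rank(self, query):
        """Document names ordered by descending score, ties by name."""
        scored = [(name, self.score(query, name)) for name in self.tf]
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Compared with raw TF-IDF, the term-frequency component saturates and document length is normalised, so a long skill file that repeats a keyword does not automatically outrank a short, focused one.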

Then — P2 (architecture)

Lakehouse storage — Bronze (append-only NDJSON sessions) / Silver (TF-IDF postings parquet, co-activation rollup) / Gold (UUID-keyed nodes + edges parquet). Replaces the monolithic 4-MB neural_map.json. Enables scaling from 309 docs → 30,000.
MCP as first-class neurons in the connectome. Q2 stops being prose and becomes a graph query: query_connectome.py query "send slack message" → mcp_tool: slack-send (score 0.94).
MCP as situational signal — fuse operator state (active Linear issue, next calendar event, recent Drive files) into the query vector. Routing becomes context-aware without the operator typing the context. Est. +0.15 MRR.
Learned router head — small MLP query_embedding → agent_logits trained on (task, agent, success) tuples. Converts the static gate into a learned policy.
Top-K ensemble routing — let top-2 agents fire on MEDIUM/LARGE tasks (e.g., Security Engineer + Database Optimizer for "threat-model and refactor this stored proc").
Episodic memory — index docs/specs-archive/ as retrievable exemplars; few-shot the next similar task with past successful plans, respecting arm isolation.
Sleep / consolidation cron: an offline pass replays neural_activity.json, prunes weak edges, proposes skill merges, and synthesizes skill candidates. Today the brain only rebuilds (recompilation); it does not consolidate (memory consolidation).

The shape of the upgrade is consistent across all six reviewers: keep the octopus as the metaphor for operator + arm isolation (which is genuinely novel), and rebuild the retrieval/learning core using standard ML primitives (dense + sparse retrieval, cross-encoder rerank, reward signal, episodic memory).

Contributing

See CONTRIBUTING.md for how to:

Add a new agent (which division, file format, REGISTRY update)
Add a new skill (directory structure, SKILL.md format)
Report issues and submit PRs
All contributions must be anonymized — no client data, no personal information

License

MIT

Octorato powers the AI Agent OS at dataqbs.com — built & operated there.

Created by dataqbs — Data Quality & Business Solutions