Cairn

More Than Just AI Penetration Testing — Towards General State-Space Search

Tencent TCH

Cairn is a general-purpose problem-solving engine.
It defines no roles, no workflows. Given an origin and a goal, it searches for a path through an unknown state space.
AI Penetration Testing is one such problem — and a proven one.

What is Cairn?

Penetration testing is fundamentally a directed search through a near-infinite state space:

Origin: known (target IP, target system)
Goal: defined (get a shell, capture the flag)
Path: unknown

This structure is not unique to penetration testing. Vulnerability research, mathematical proof, CTF challenges — any problem with a clear starting point, a clear success condition, and an unknown path in between shares the same shape.

Cairn is built for this class of problems. Penetration testing is the first domain it has been validated on.

The engine is built on a Blackboard Architecture with an explicit fact-intent graph. Three primitives are all it needs:

Concept	Meaning
Fact	A confirmed, objective finding written to the board
Intent	A declared direction of exploration, not yet executed
Hint	Human judgment injected at any time; absorbed by agents on the next read

The graph grows from origin toward goal. Every new Fact is a stepping stone; every Intent is a step into the unknown.

Agent Workers run an OODA loop — Observe the full graph, Orient to the current state, Decide on next intents, Act to explore — and write their findings back as new Facts. Workers have no fixed roles. Tasks are generated at runtime from the graph's current state, not from predefined job descriptions.

Agents coordinate exclusively through the shared board (Stigmergy). No direct communication. No information silos.

Cairn in Action

https://github.com/user-attachments/assets/e557b1ac-dda4-41cb-87dd-9d56dbf05133

How It Works

Three task types, all executed by the same Worker:

Task	What it does	Output
Bootstrap	At project start, attempts to solve the problem directly	Fact + possible Complete
Reason	Reads the full graph: is the goal met? What should be explored next?	Complete / new Intents / no-op
Explore	Claims one Intent, executes the exploration, reports findings	One Fact

System architecture:

          ┌──────────────────────────────────┐
          │           Cairn Server           │
          │    Facts + Intents + Hints       │
          └─────────────────┬────────────────┘
                            │
                     Read / Write API
                            │
          ┌─────────────────┴────────────────┐
          │             Dispatcher           │
          │   Schedules tasks, manages       │
          │   containers, writes protocol    │
          └──────────┬───────────────┬───────┘
                     │               │
     ┌───────────────┴──┐     ┌──────┴──────────────┐
     │  Worker Container│     │  Worker Container   │
     │   (Project A)    │     │   (Project B)       │
     │  ┌────┐  ┌────┐  │     │  ┌────┐  ┌────┐     │
     │  │ W. │  │ W. │  │     │  │ W. │  │ W. │     │
     │  └────┘  └────┘  │     │  └────┘  └────┘     │
     └──────────────────┘     └─────────────────────┘

Cairn Server maintains graph consistency only.

Cairn Dispatcher reads the graph, schedules tasks, spins up and tears down worker containers, and is the sole writer to the protocol. Each project gets its own Worker Container; multiple Agent Workers run concurrently inside it. Agent Workers only receive a prompt and return structured output.

Results

Tencent Cloud Hackathon · AI Penetration Testing Challenge · 2nd Edition

610 teams · 1,345 participants · top universities and security firms across China

Metric	Value
Problems solved	54 / 54 — only team to AK
Final ranking	3rd

The system had never been tested before the competition. The full pipeline came online for the first time at 4 AM on race day. No training, no tuning, no domain-specific tooling. Zero MCP tools, zero RAG, zero predefined agent roles.

Getting Started

Prerequisites

macOS or Linux
Python ≥ 3.12
Docker

Pull required images

Both setup methods require the worker container image:

docker pull --platform=linux/amd64 ghcr.io/oritera/cairn-worker-container:latest

Docker Compose (recommended)

Pull the base image used to build Cairn:

docker pull ghcr.io/astral-sh/uv:python3.13-trixie

Edit dispatch.yaml and fill in your LLM endpoints and API keys, then start both services:

docker compose up --build

This starts cairn-server on port 8000 and cairn-dispatcher once the server passes its health check. The dispatcher mounts dispatch.yaml from the project root and connects to Docker via the host socket. Data is persisted to ./datas/cairn/.

Manual

Edit dispatch.yaml and fill in your LLM endpoints and API keys, then:

# Start the server
uv run --project cairn cairn serve
 
# Run the dispatcher
uv run --project cairn cairn dispatch --config dispatch.yaml
 
# Run startup health checks only
uv run --project cairn cairn dispatch --config dispatch.yaml --startup-healthcheck-only

⚖️ License

This project is licensed under GNU AGPLv3 for personal and educational use.

Commercial Use: If you wish to use this project in a commercial or proprietary environment without the AGPL-3.0 open-source obligations, please contact me to obtain a commercial license.

Contributions: By submitting a Pull Request, you agree that your contributions may be used under both the AGPL-3.0 and the project's commercial license.