buildaharness
Health Warn
- License — License: NOASSERTION
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code Fail
- eval() — Dynamic code execution via eval() in .github/workflows/deploy.yml
- eval() — Dynamic code execution via eval() in adapter/crewai_adapter.py
- exec() — Shell command execution in adapter/crewai_adapter.py
Permissions Pass
- Permissions — No dangerous permissions requested
A visual harness builder for AI agents
Build A Harness
Build complete AI agent harnesses on canvas. Compile to any orchestrator. Observe with Langfuse.
A workflow routes prompts from node to node. A harness governs what the agent believes, what it is allowed to do, how it catches its own mistakes, and what it learns. Build A Harness delivers the complete 11-layer architecture — draw it on a canvas, compile to any framework, trace every decision.
Canvas → flow.json → LangGraph · CrewAI · Mastra · MS Agent Framework → Langfuse
The spec is the contract. The canvas is the editor. The adapters are the compilers.
Why a harness, not just a workflow
| Simple Agent Loop | Full Harness — Implemented |
|---|---|
| Input / Caller | Caller State — constraints · clarification |
| ↓ | World Model — beliefs · contradictions · generation_id |
| LLM Call | Reasoning — evidence · hypotheses (4 sources) · VOI |
| ↓ | Control ← key — 5-tier resolver · NORMAL / CAUTIOUS / BLOCKED |
| Tool Call ↺ loop | Planning — task graph (6-state) · parallel concurrency |
| ↓ | Execution + Verification — VOI gate · 9 layers |
| Output | Recovery + Memory — 6 strategies · compression |
| | Learning — experience store · warm start (optional) |
| | Output & Reviewer Pass — contract · 3-lens review |
| prompt in → answer out | 22 nodes · 11 layers · 379 tests passing |
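The Control row in the table above names three states: NORMAL, CAUTIOUS, and BLOCKED. As a rough illustration only, a gate driven by those states might look like the sketch below. The `ControlState` enum, the `may_execute` helper, and the rule that CAUTIOUS permits only read-only actions are assumptions made for this example; the project's 5-tier resolver is not described in this README and is not modeled here.

```python
from enum import Enum


class ControlState(Enum):
    # State names come from the comparison table; their semantics here are assumed.
    NORMAL = "NORMAL"
    CAUTIOUS = "CAUTIOUS"
    BLOCKED = "BLOCKED"


def may_execute(state: ControlState, read_only: bool) -> bool:
    """Hypothetical gate: decide whether an action may run in the given state."""
    if state is ControlState.BLOCKED:
        return False      # nothing runs while blocked
    if state is ControlState.CAUTIOUS:
        return read_only  # assumed rule: only side-effect-free actions run
    return True           # NORMAL: no restriction
```

Under these assumed rules, a tool call that writes data would run in NORMAL, be refused in CAUTIOUS, and be refused in BLOCKED.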
What's implemented
| Canvas & execution layer | Reasoning & control layer |
|---|---|
Node palette
Harnesses are built from 14 core nodes and 13 harness-layer nodes; every node compiles to all four runtimes.
| Core nodes | | | |
|---|---|---|---|
| ⤵ input | ⤴ output | ✨ llm_call | 🔧 tool_invoke |
| ⎇ condition | ⑂ parallel_fork | ⊖ parallel_join | ⏸ hitl_breakpoint |
| 📖 memory_read | 🔖 memory_write | 📦 subgraph | ⇌ transform |
| 🤖 agent_role | 👥 agent_debate | | |

| Harness nodes (implement the 11-layer control architecture) | | | |
|---|---|---|---|
| 🧠 world_model | 💡 hypothesis_set | 🗄️ gather_evidence | ⚙️ apply_tool_rel |
| 🔄 update_wm | 🛡️ control_state | 🕸️ task_graph | ✅ verify_gate |
| ♻️ recovery | 📋 evidence_store | 📊 exp_store | 👁️ reviewer_pass |
| 🧭 process_concept | | | |
Full architecture, pseudo-code, and state model: plan/harness_architecture.html
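To make the palette concrete, here is a minimal sketch that checks the node types used in a spec against the names listed above. The spec layout (`nodes` and `edges` keys, with `id` and `type` on each node) is an assumption for illustration, not the FlowSpec schema; docs/flowspec.md is the authoritative reference. The palette labels may also be abbreviated display names rather than the exact type identiers used in flow.json.

```python
# Node names copied from the palette tables; treating them as type identifiers is an assumption.
CORE_NODES = {
    "input", "output", "llm_call", "tool_invoke", "condition",
    "parallel_fork", "parallel_join", "hitl_breakpoint", "memory_read",
    "memory_write", "subgraph", "transform", "agent_role", "agent_debate",
}
HARNESS_NODES = {
    "world_model", "hypothesis_set", "gather_evidence", "apply_tool_rel",
    "update_wm", "control_state", "task_graph", "verify_gate", "recovery",
    "evidence_store", "exp_store", "reviewer_pass", "process_concept",
}


def unknown_node_types(spec: dict) -> list[str]:
    """Return the sorted node types in the spec that are not in the palette."""
    known = CORE_NODES | HARNESS_NODES
    used = {node["type"] for node in spec.get("nodes", [])}
    return sorted(used - known)


# Hypothetical spec shape, for illustration only.
example_spec = {
    "nodes": [
        {"id": "in", "type": "input"},
        {"id": "wm", "type": "world_model"},
        {"id": "ask", "type": "llm_call"},
        {"id": "out", "type": "output"},
    ],
    "edges": [["in", "wm"], ["wm", "ask"], ["ask", "out"]],
}
```

Calling `unknown_node_types(example_spec)` returns an empty list because every type in the example appears in the palette.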
Frameworks
All four runtimes compile from the same flow.json — no rewriting.
| Runtime | Language | HITL | Key integration |
|---|---|---|---|
| LangGraph | Python | interrupt() | @observe · harness child spans |
| CrewAI | Python | — | context_from → Task.context · tier-aware memory |
| Mastra | TypeScript | suspend()/resume() | Node.js sidecar |
| MS Agent Framework | Python | _HitlPause | AgentGroupChat native · OTel → Langfuse |
Compile: POST /compile?runtime=langgraph — same spec, any runtime.
Deploy as a REST endpoint, MCP tool, or A2A agent in one step.
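The compile endpoint above can be called from any HTTP client. The sketch below builds (but does not send) such a request with the Python standard library. It assumes the adapter listens on localhost:8000, as in the quick-start service table, and that the spec is sent as the JSON request body; the body format and the `build_compile_request` helper are assumptions, so check docs/api.md for the actual contract. Only the `langgraph` runtime value appears in this README; identifiers for the other runtimes are not shown here.

```python
import json
import urllib.request

ADAPTER_URL = "http://localhost:8000"  # Adapter API address from the quick-start table


def build_compile_request(spec: dict, runtime: str) -> urllib.request.Request:
    """Build a POST /compile request for the given runtime without sending it."""
    body = json.dumps(spec).encode("utf-8")  # assumption: spec travels as the JSON body
    return urllib.request.Request(
        url=f"{ADAPTER_URL}/compile?runtime={runtime}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_compile_request({"nodes": [], "edges": []}, "langgraph")
print(request.get_method(), request.full_url)

# To send it against a running adapter:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```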
Observability
Self-hosted Langfuse starts with docker compose up — no extra configuration needed.
- Per-node child spans across all four runtimes (world model, control state, verification, recovery)
- Token counts, latency, and cost per node via LiteLLM
- Live View trace → link in the canvas after each run
- Managed prompts via Langfuse prompt API (`prompt_ref` on any `llm_call` node)
Quick start
./scripts/setup-env.sh # generate secrets, write .env
docker compose up # start all 9 services
| Service | URL |
|---|---|
| Canvas | http://localhost:3000 |
| Adapter API | http://localhost:8000/health |
| Langfuse | http://localhost:3001 |
./scripts/setup-env.sh && source adapter/.venv/bin/activate
npm install && npm run dev # canvas → localhost:3000
cd adapter && python main.py # adapter → localhost:8000
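After startup, it can help to wait for the adapter before opening the canvas. The following sketch polls the health URL from the service table. It assumes the endpoint answers HTTP 200 once the adapter is ready, which this README does not state; `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/health"  # Adapter API row of the service table


def http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed to mean 'ready')."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or non-2xx status


def wait_until_healthy(check: Callable[[], bool], attempts: int = 30,
                       delay: float = 1.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False


# Usage against a running stack:
# ready = wait_until_healthy(lambda: http_ok(HEALTH_URL))
# print("adapter ready" if ready else "adapter did not respond")
```

Passing the check as a callable keeps the retry loop independent of the network, so it can be exercised with a stub.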
Running tests
npm test # Vitest — validates 5 reference flows
pytest adapter/tests/ -v # adapter unit + integration
pytest adapter/tests/test_maf_adapter.py -v # MAF suite (742 tests)
New here? Start with docs/getting-started.md · Startup errors? docs/troubleshooting.md · Real-time collaboration: docs/collab.md · On-prem / Kubernetes: docs/deployment.md
LLM providers
All calls route through LiteLLM — add the key to .env.
| Provider | Env var | Example models |
|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-4o, gpt-4o-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet, claude-opus |
| Ollama (local) | — | mistral, qwen3, qwen2.5-coder |
No API key? Install Ollama, run `ollama pull mistral`, then `./scripts/setup-ollama.sh` to test all four frameworks with no paid account.
Full setup: docs/llm-setup.md
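A quick way to see which hosted providers are configured is to check for the environment variables in the table above. This is a minimal sketch: the variable names come from the table, while the `configured_providers` helper is hypothetical. It reads whatever mapping it is given, so it only reflects `.env` values if they have been loaded into the process environment.

```python
import os
from typing import Mapping

# Env var names are taken from the provider table; Ollama (local) needs no key.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env: Mapping[str, str]) -> list[str]:
    """Return the providers whose API key variable is set and non-blank."""
    return [
        name
        for name, variable in PROVIDER_ENV_VARS.items()
        if env.get(variable, "").strip()
    ]


found = configured_providers(os.environ)
print(found if found else "No hosted provider key set; Ollama (local) needs none.")
```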
Embed the canvas
npm install @buildaharness/canvas
import { BuildAHarnessCanvas } from '@buildaharness/canvas'
import '@buildaharness/canvas/styles.css'
<BuildAHarnessCanvas
initialSpec={mySpec}
onSpecChange={(updated) => save(updated)}
execStats={runState.nodeStats}
theme="dark"
/>
Full props reference: packages/canvas/README.md
Documentation
| Document | Contents |
|---|---|
| docs/getting-started.md | Step-by-step: clone → secrets → LLM → first run |
| docs/flowspec.md | FlowSpec v1.0.0 — all 26 node types, edges, fields |
| docs/architecture.md | System design, service interactions, data flows |
| docs/api.md | REST API reference — compile, execute, deploy, HITL resume |
| docs/llm-setup.md | LLM provider setup — OpenAI, Anthropic, Ollama, custom |
| docs/qdrant.md | Qdrant vector store — seeding, collections, production |
| docs/env-vars.md | All environment variables across all services |
| docs/collab.md | Real-time collaboration — Yjs setup and internals |
| docs/deployment.md | Docker, Helm, SSO/OIDC |
| docs/troubleshooting.md | Common startup errors |
| CONTRIBUTING.md | How to contribute |
Apache 2.0 — see LICENSE.