buildaharness

mcp
Security Audit
Failed
Health: Warning
  • License — License: NOASSERTION
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code: Failed
  • eval() — Dynamic code execution via eval() in .github/workflows/deploy.yml
  • eval() — Dynamic code execution via eval() in adapter/crewai_adapter.py
  • exec() — Shell command execution in adapter/crewai_adapter.py
Permissions: Passed
  • Permissions — No dangerous permissions requested

No AI report is available for this listing yet.

SUMMARY

A visual harness builder for AI agents

README.md

Build A Harness

Build complete AI agent harnesses on canvas. Compile to any orchestrator. Observe with Langfuse.

English · 中文


A workflow routes prompts from node to node. A harness governs what the agent believes, what it is allowed to do, how it catches its own mistakes, and what it learns. Build A Harness delivers the complete 11-layer architecture — draw it on a canvas, compile to any framework, trace every decision.

Canvas  →  flow.json  →  LangGraph · CrewAI · Mastra · MS Agent Framework  →  Langfuse

The spec is the contract. The canvas is the editor. The adapters are the compilers.
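
One way to picture "the adapters are the compilers" is a dispatch from a single spec to one compiler per runtime. The sketch below only illustrates that idea and is not the project's adapter code: `compile_flow`, `COMPILERS`, the spec fields (`nodes`, `id`, `type`), and the runtime keys other than `langgraph` are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict

# Hypothetical spec shape; the real FlowSpec fields are described in docs/flowspec.md.
Spec = Dict[str, Any]


def _describe(runtime: str) -> Callable[[Spec], str]:
    """Build a stand-in compiler that only summarizes the spec for one runtime."""
    def compiler(spec: Spec) -> str:
        node_ids = [node["id"] for node in spec["nodes"]]
        return f"{runtime}: " + " -> ".join(node_ids)
    return compiler


# One placeholder compiler per runtime named in the README (keys are assumed).
COMPILERS: Dict[str, Callable[[Spec], str]] = {
    name: _describe(name) for name in ("langgraph", "crewai", "mastra", "maf")
}


def compile_flow(spec: Spec, runtime: str) -> str:
    """Hand the same, unmodified spec to whichever compiler the caller selects."""
    if runtime not in COMPILERS:
        raise ValueError(f"unknown runtime: {runtime!r}")
    return COMPILERS[runtime](spec)


# A tiny illustrative spec: input -> LLM call -> output.
example_spec: Spec = {
    "nodes": [
        {"id": "in", "type": "input"},
        {"id": "answer", "type": "llm_call"},
        {"id": "out", "type": "output"},
    ]
}
```

Calling `compile_flow(example_spec, "langgraph")` and `compile_flow(example_spec, "mastra")` passes the same dictionary to two different compilers, which is the property the spec-as-contract design relies on.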


Why a harness, not just a workflow

Simple Agent Loop | Full Harness (implemented)
Input / Caller | Caller State: constraints · clarification
 | World Model: beliefs · contradictions · generation_id
LLM Call | Reasoning: evidence · hypotheses (4 sources) · VOI
 | Control: 5-tier resolver · NORMAL / CAUTIOUS / BLOCKED
Tool Call ↺ loop | Planning: task graph (6-state) · parallel concurrency
 | Execution + Verification: VOI gate · 9 layers
Output | Recovery + Memory: 6 strategies · compression
 | Learning: experience store · warm start (optional)
 | Output & Reviewer Pass: contract · 3-lens review
prompt in → answer out | 22 nodes · 11 layers · 379 tests passing

What's implemented

Canvas & execution layer

  • ✅ Canvas with 27 node types (14 execution + 13 harness)
  • ✅ 4 framework adapters — LangGraph, CrewAI, Mastra, MAF
  • ✅ Langfuse observability — harness traces across all runtimes
  • ✅ HITL pause/resume · REST / MCP / A2A deploy
  • ✅ FlowSpec v0.2.0 — open, portable JSON format
  • ✅ Process concepts — pre-seeded task graph scaffolds

Reasoning & control layer

  • ✅ World model · typed beliefs · contradiction detection
  • ✅ 5-tier control state resolver · deadlock detection
  • ✅ Pre-execution review gate · 9-layer verification
  • ✅ 6 named recovery strategies · typed failure library
  • ✅ Experience store — cross-run structural reuse
  • ✅ Adversarial reviewer pass · output contract validation
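
To make "typed beliefs" and "contradiction detection" concrete, here is a minimal sketch of one possible approach: group beliefs by the thing they describe and flag any group whose values disagree. The `Belief` fields and `find_contradictions` are assumptions made for illustration; the README does not describe how the project's world model represents or compares beliefs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Belief:
    """Hypothetical belief record; field names are illustrative only."""
    key: str      # what the belief is about, e.g. "api_status"
    value: Any    # the claimed value
    source: str   # where the claim came from, e.g. "tool" or "caller"


def find_contradictions(beliefs: List[Belief]) -> Dict[str, List[Belief]]:
    """Return every key whose beliefs do not all agree on one value."""
    by_key: Dict[str, List[Belief]] = defaultdict(list)
    for belief in beliefs:
        by_key[belief.key].append(belief)
    # repr() lets unhashable values (lists, dicts) be compared in a set.
    return {
        key: group
        for key, group in by_key.items()
        if len({repr(b.value) for b in group}) > 1
    }
```

With two beliefs for `api_status` ("up" from a tool, "down" from the caller) and one for `region`, only `api_status` is returned, together with both conflicting records so a later step can decide which source to trust.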

Node palette

Harnesses are built from 14 core nodes and 13 harness-layer nodes; every node compiles to all four runtimes.

Core nodes

input · output · llm_call · 🔧 tool_invoke · condition · parallel_fork · parallel_join · hitl_breakpoint · 📖 memory_read · 🔖 memory_write · 📦 subgraph · transform · 🤖 agent_role · 👥 agent_debate

Harness nodes (implement the 11-layer control architecture)

🧠 world_model · 💡 hypothesis_set · 🗄️ gather_evidence · ⚙️ apply_tool_rel · 🔄 update_wm · 🛡️ control_state · 🕸️ task_graph · verify_gate · ♻️ recovery · 📋 evidence_store · 📊 exp_store · 👁️ reviewer_pass · 🧭 process_concept

Full architecture, pseudo-code, and state model: plan/harness_architecture.html
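
The palette above can double as a quick sanity check on a spec before it is compiled. The sketch below collects the 27 names exactly as they appear in the palette and reports any node type that is not among them. Two assumptions apply: the palette labels may be abbreviated display names rather than the exact FlowSpec identifiers, and the spec layout (`nodes` entries with a `type` field) is hypothetical.

```python
from typing import Any, Dict, List

# Names copied from the node palette; they may be shortened display labels.
CORE_NODES = {
    "input", "output", "llm_call", "tool_invoke", "condition",
    "parallel_fork", "parallel_join", "hitl_breakpoint", "memory_read",
    "memory_write", "subgraph", "transform", "agent_role", "agent_debate",
}
HARNESS_NODES = {
    "world_model", "hypothesis_set", "gather_evidence", "apply_tool_rel",
    "update_wm", "control_state", "task_graph", "verify_gate", "recovery",
    "evidence_store", "exp_store", "reviewer_pass", "process_concept",
}
KNOWN_NODE_TYPES = CORE_NODES | HARNESS_NODES


def unknown_node_types(spec: Dict[str, Any]) -> List[str]:
    """List node types used in the spec that the palette does not contain."""
    used = {node["type"] for node in spec.get("nodes", [])}
    return sorted(used - KNOWN_NODE_TYPES)
```

A spec containing a misspelled type such as `llm_cal` would be reported here, before any adapter is asked to compile it.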


Frameworks

All four runtimes compile from the same flow.json — no rewriting.

Runtime | Language | HITL | Key integration
LangGraph | Python | interrupt() | @observe · harness child spans
CrewAI | Python |  | context_from → Task.context · tier-aware memory
Mastra | TypeScript | suspend()/resume() | Node.js sidecar
MS Agent Framework | Python | _HitlPause | AgentGroupChat native · OTel → Langfuse

Compile: POST /compile?runtime=langgraph — same spec, any runtime.
Deploy as a REST endpoint, MCP tool, or A2A agent in one step.
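
As a rough illustration of the compile call, the sketch below builds the `POST /compile?runtime=...` request with the Python standard library. The endpoint path and the `runtime` query parameter come from the README, and the base URL is the Adapter API address from Quick start. Sending the spec as a JSON request body, and the shape of the response, are assumptions; docs/api.md is the authoritative reference.

```python
import json
import urllib.request
from urllib.parse import urlencode

ADAPTER_URL = "http://localhost:8000"  # Adapter API address from Quick start


def build_compile_request(spec: dict, runtime: str) -> urllib.request.Request:
    """Prepare (but do not send) a compile request for the given runtime."""
    query = urlencode({"runtime": runtime})
    return urllib.request.Request(
        url=f"{ADAPTER_URL}/compile?{query}",
        data=json.dumps(spec).encode("utf-8"),  # assumed: spec travels as JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compile_spec(spec: dict, runtime: str, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body (format not assumed)."""
    request = build_compile_request(spec, runtime)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from sending makes it easy to switch runtimes by changing one argument and to inspect the request without a running adapter.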


Observability

Self-hosted Langfuse starts with docker compose up — no extra configuration needed.

  • Per-node child spans across all four runtimes (world model, control state, verification, recovery)
  • Token counts, latency, and cost per node via LiteLLM
  • Live View trace → link in the canvas after each run
  • Managed prompts via Langfuse prompt API (prompt_ref on any llm_call node)

Quick start

./scripts/setup-env.sh   # generate secrets, write .env
docker compose up        # start all 9 services

Service | URL
Canvas | http://localhost:3000
Adapter API | http://localhost:8000/health
Langfuse | http://localhost:3001

Without Docker

./scripts/setup-env.sh && source adapter/.venv/bin/activate
npm install && npm run dev        # canvas → localhost:3000
cd adapter && python main.py      # adapter → localhost:8000

Running tests

npm test                                         # Vitest — validates 5 reference flows
pytest adapter/tests/ -v                         # adapter unit + integration
pytest adapter/tests/test_maf_adapter.py -v     # MAF suite (742 tests)

New here? Start with docs/getting-started.md · Startup errors? docs/troubleshooting.md · Real-time collaboration: docs/collab.md · On-prem / Kubernetes: docs/deployment.md
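
After `docker compose up`, a small probe can confirm that the three URLs listed under Quick start respond. This is a minimal sketch, not a script shipped with the project: `check_services` and `http_ok` are hypothetical helpers, and treating any 2xx response as healthy is an assumption, since the README does not describe what `/health` returns.

```python
import urllib.request
from typing import Callable, Dict

# Service URLs as listed in the Quick start table.
SERVICES: Dict[str, str] = {
    "Canvas": "http://localhost:3000",
    "Adapter API": "http://localhost:8000/health",
    "Langfuse": "http://localhost:3001",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def check_services(probe: Callable[[str], bool] = http_ok) -> Dict[str, bool]:
    """Probe every service; the probe is injectable so it can be faked in tests."""
    return {name: probe(url) for name, url in SERVICES.items()}
```

Running `print(check_services())` after startup gives a name-to-status mapping; a `False` entry is a reasonable cue to consult docs/troubleshooting.md.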


LLM providers

All calls route through LiteLLM — add the key to .env.

Provider | Env var | Example models
OpenAI | OPENAI_API_KEY | gpt-4o, gpt-4o-mini
Anthropic | ANTHROPIC_API_KEY | claude-sonnet, claude-opus
Ollama (local) |  | mistral, qwen3, qwen2.5-coder

No API key? Install Ollama, run ollama pull mistral, then ./scripts/setup-ollama.sh — tests all four frameworks with no paid account.

Full setup: docs/llm-setup.md
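
The provider table suggests a simple fallback order: use a hosted model when its key is present, otherwise fall back to local Ollama. The sketch below expresses that as a pure function. `pick_model` and the preference order are assumptions made for illustration, and the model names are the README's examples; real LiteLLM model identifiers may need a version suffix. The `ollama/` prefix follows LiteLLM's provider-prefixed naming.

```python
from typing import Mapping


def pick_model(env: Mapping[str, str]) -> str:
    """Choose a model name based on which API keys are set (order is assumed)."""
    if env.get("OPENAI_API_KEY"):
        return "gpt-4o-mini"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude-sonnet"  # README example name, may need a versioned id
    # No paid key configured: fall back to a locally pulled Ollama model.
    return "ollama/mistral"
```

In practice the function would be called as `pick_model(os.environ)` after loading `.env`; taking the mapping as a parameter keeps it easy to test without touching the real environment.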


Embed the canvas

npm install @buildaharness/canvas

import { BuildAHarnessCanvas } from '@buildaharness/canvas'
import '@buildaharness/canvas/styles.css'

<BuildAHarnessCanvas
  initialSpec={mySpec}
  onSpecChange={(updated) => save(updated)}
  execStats={runState.nodeStats}
  theme="dark"
/>

Full props reference: packages/canvas/README.md


Documentation

docs/getting-started.md | Step-by-step: clone → secrets → LLM → first run
docs/flowspec.md | FlowSpec v1.0.0: all 26 node types, edges, fields
docs/architecture.md | System design, service interactions, data flows
docs/api.md | REST API reference: compile, execute, deploy, HITL resume
docs/llm-setup.md | LLM provider setup: OpenAI, Anthropic, Ollama, custom
docs/qdrant.md | Qdrant vector store: seeding, collections, production
docs/env-vars.md | All environment variables across all services
docs/collab.md | Real-time collaboration: Yjs setup and internals
docs/deployment.md | Docker, Helm, SSO/OIDC
docs/troubleshooting.md | Common startup errors
CONTRIBUTING.md | How to contribute

Apache 2.0 — see LICENSE.
