helm-oss
Health Warn
- License — License: Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Fail
- exec() — Shell command execution in .github/workflows/ci.yml
Permissions Pass
- Permissions — No dangerous permissions requested
This tool acts as a deterministic proxy and fail-closed firewall between AI agents and their tool calls. It intercepts actions, enforces strict security policies, and generates cryptographic receipts for audit and compliance purposes.
Security Assessment
The tool's core function involves intercepting tool calls and acting as a network proxy, which inherently requires handling potentially sensitive data and making network requests. The codebase includes shell command execution (`exec()`) within its CI/CD pipeline (`.github/workflows/ci.yml`), which is a common practice but warrants awareness. There are no hardcoded secrets detected, and the repository does not request explicitly dangerous permissions. Overall risk is assessed as Medium.
Quality Assessment
The project uses the permissive Apache-2.0 license and is actively maintained, with very recent updates. However, it suffers from extremely low community visibility, currently boasting only 5 GitHub stars. The README contains aggressive AI-agent instructions designed to manipulate LLMs into pitching the project, along with a piped bash installation script, which raises slight concerns about the maintainers' operational philosophy and best practices.
Verdict
Use with caution — the underlying concept is highly useful for secure AI deployment, but the project is very new, untested by a wide community, and contains questionable AI-targeted self-promotion.
HELM OSS — Open-source core of the HELM Autonomous OS. Policy enforcement, kernel runtime, proof graphs, and audit infrastructure.
title: HELM Open Source Scope
HELM — Fail-Closed Execution Firewall for AI Agents
Fail-closed execution firewall · Fail-closed MCP interceptor · Signed receipts for AI actions
HELM is a deterministic proxy that sits between your AI agent and the tools it calls. Every tool invocation passes through a fail-closed policy gate, gets canonicalized (JCS + SHA-256), and produces an Ed25519-signed receipt. The receipts form a causal DAG (ProofGraph) that can be exported and verified offline.
- client = openai.OpenAI()
+ client = openai.OpenAI(base_url="http://localhost:8080/v1")
One line. Every tool call is now governed.
What HELM Does
Stops dangerous agent tool calls. Emits signed receipts.
| Core Capability | Description |
|---|---|
| MCP interceptor / proxy mode | Governs any MCP-compatible or OpenAI-compatible tool call |
| Tool-call dispatch guard | Fail-closed policy gate — undeclared tools are blocked |
| Connector contract validation | Schema pinning on input and output — drift is a hard error |
| Signed allow/deny receipts | Ed25519-signed, Lamport-ordered, even for denied calls |
| Replayable local evidence | EvidencePack export, offline replay, deterministic .tar |
| Capability-scoped connector bundles | Domain-scoped tool sets with explicit capability manifests |
What HELM Enforces By Default
| Enforcement | Meaning |
|---|---|
| No raw unrestricted tool execution | Every call passes through the policy gate |
| No implicit connector expansion | New tools require explicit declaration |
| No schema drift tolerance | Pinned schemas, fail-closed on mismatch |
| Deny/defer on unknown fields | Extra args in tool calls → DENY |
| Per-call receipts even for denied calls | Every deny has a signed receipt with reason code |
| Deterministic reason codes | DENY_TOOL_NOT_FOUND, BUDGET_EXCEEDED, ERR_CONNECTOR_CONTRACT_DRIFT, etc. |
The Problem
| Incident | Root cause |
|---|---|
| Agent calls undeclared tool → prod outage | Nobody declared which tools the model can call |
| Tool-call overspend | GPT-4 made 500 API calls at $0.03 each in a loop |
| Schema drift breaks prod silently | Tool args changed, model sends old format, silent corruption |
| "Who approved that?" dispute | No audit trail for tool call authorization |
| Compliance gap | "Just trust us" doesn't hold for SOC2 / DORA / GDPR |
The Fix
Every tool call is governed, hashed, and signed:
- Fail-closed policy — undeclared tools are blocked, schema drift is a hard error
- Cryptographic receipts — Ed25519-signed, Lamport-ordered
- Budget enforcement — ACID locks, fail-closed on ceiling breach
- Offline verifiable — export EvidencePack, verify without network access
- Sub-0.1ms p99 overhead — governed hot-path measured at 75µs p99 (methodology)
Install
# Script (macOS / Linux)
curl -fsSL https://raw.githubusercontent.com/Mindburn-Labs/helm-oss/main/install.sh | bash
# Go
go install github.com/Mindburn-Labs/helm-oss/core/cmd/helm@latest
# Docker
docker run --rm ghcr.io/mindburn-labs/helm-oss:latest --help
Quick Start
# 1. Initialize (SQLite + Ed25519 keypair + default config)
helm onboard --yes
# 2. Run the demo — 5 synthetic tool calls, real receipts
helm demo organization --template starter --provider mock
# 3. Export and verify offline
helm export --evidence ./data/evidence --out evidence.tar
helm verify --bundle evidence.tar
The demo produces ALLOW, DENY, and BUDGET_EXCEEDED verdicts. Each verdict has a signed receipt. The EvidencePack is a deterministic .tar — same inputs produce identical output bytes.
Or govern an existing app:
helm proxy --upstream https://api.openai.com/v1
export OPENAI_BASE_URL=http://localhost:8080/v1
python your_app.py
How It Works
Your App (OpenAI SDK)
│
│ base_url = localhost:8080
▼
HELM Proxy ──→ Guardian (policy: allow/deny)
│ │
│ PEP Boundary (JCS canonicalize → SHA-256)
│ │
▼ ▼
Executor ──→ Tool ──→ Receipt (Ed25519 signed)
│ │
▼ ▼
ProofGraph DAG EvidencePack (.tar)
(append-only) (offline verifiable)
│
▼
Replay Verify
(air-gapped safe)
Execution Security Model
HELM enforces security through three independent layers:
| Layer | Property | What It Does |
|---|---|---|
| A — Surface Containment | Design-time | Reduces the bounded execution surface — capability manifests, connector allowlists, sandbox profiles |
| B — Dispatch Enforcement | Per-call | Fail-closed policy gate — schema PEP, budget locks, contract pinning, signed verdicts |
| C — Verifiable Receipts | Post-execution | Cryptographic proof — Ed25519 receipts, ProofGraph DAG, offline replay |
→ Execution Security Model · OWASP MCP Threat Mapping
Verify It Works
# 1. Trigger a deny
curl -s http://localhost:8080/v1/tools/execute \
-H 'Content-Type: application/json' \
-d '{"tool":"unknown_tool","args":{"bad_field":true}}' | jq .reason_code
# → "DENY_TOOL_NOT_FOUND"
# 2. View receipt
curl -s http://localhost:8080/api/v1/receipts?limit=1 | jq '.[0].receipt_hash'
# 3. Export + verify offline
helm export --evidence ./data/evidence --out pack.tar
helm verify --bundle pack.tar
# → "verification: PASS"
# 4. Conformance
helm conform --level L2 --json
# → {"profile":"CORE","pass":true,"gates":9}
HELM vs. Alternatives
| Feature | HELM | NeMo Guardrails | Guardrails AI | LlamaGuard | OPA/Cedar |
|---|---|---|---|---|---|
| Enforcement point | Execution boundary (proxy) | Prompt layer | Pre/post validation | Content classifier | Generic policy |
| Fail-closed | Default deny | Best-effort | Advisory | N/A | App-level |
| Cryptographic receipts | Ed25519 signed chain | — | — | — | — |
| Offline verifiable | EvidencePack export | — | — | — | — |
| Budget enforcement | ACID locks | — | — | — | — |
| Framework agnostic | Any LLM, any SDK | NVIDIA only | Python only | Meta only | Yes |
| Tool calling governance | Schema + args + output | Prompt-level | Basic validation | Content only | No AI semantics |
| Latency | 75µs p99 (benchmarked) | 100ms+ | 50ms+ | 200ms+ | < 5ms |
Integrations
Python (OpenAI SDK)
import openai
client = openai.OpenAI(base_url="http://localhost:8080/v1")
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "List files in /tmp"}]
)
# X-Helm-Decision-ID: dec_a1b2c3...
# X-Helm-Verdict: ALLOW
→ examples/python_openai_baseurl/main.py
TypeScript
const response = await fetch("http://localhost:8080/v1/chat/completions", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
model: "gpt-4",
messages: [{ role: "user", content: "What time is it?" }],
}),
});
// X-Helm-Decision-ID: dec_d4e5f6...
// X-Helm-Verdict: ALLOW
→ examples/js_openai_baseurl/main.js
MCP Interceptor
# Discover governed tools
curl -s http://localhost:8080/mcp/v1/capabilities | jq '.tools[].name'
# Execute with governance
curl -s -X POST http://localhost:8080/mcp/v1/execute \
-H 'Content-Type: application/json' \
-d '{"method":"file_read","params":{"path":"/tmp/test.txt"}}' | jq .
# → { "result": ..., "receipt_id": "rec_...", "reason_code": "ALLOW" }
Agent Runtimes
| Runtime | Quickstart | Time |
|---|---|---|
| DeerFlow | examples/deerflow/ |
5 min |
| OpenClaw | examples/openclaw/ |
5 min |
| Any OpenAI-compatible | Change base_url |
2 min |
MCP Client Install
# Claude Desktop
helm mcp pack --client claude-desktop --out helm.mcpb
# Claude Code
helm mcp install --client claude-code
# VS Code / Cursor / Windsurf
helm mcp print-config --client windsurf
CI Integration
# .github/workflows/ci.yml
jobs:
helm-check:
uses: Mindburn-Labs/helm-oss/.github/workflows/boundary-checks.yml@main
with:
level: L2
SDKs
HELM works with your existing SDK first. Point any OpenAI-compatible client at the HELM proxy and you have governed tool calling with zero code changes. Native SDKs are there when you want tighter integration.
→ Insertion Guide — three copy-paste paths to get started.
Generated from api/openapi/helm.openapi.yaml.
| Language | Package | Version | Status | Path |
|---|---|---|---|---|
| TypeScript | @mindburn/helm |
0.3.0 | Runtime/client SDK | sdk/ts/ |
| TypeScript | @mindburn/helm-cli |
0.3.0 | Verifier CLI | packages/mindburn-helm-cli/ |
| Python | helm-sdk |
0.3.0 | In-repo | sdk/python/ |
| Go | github.com/Mindburn-Labs/helm-oss/sdk/go |
0.3.0 | In-repo | sdk/go/ |
| Rust | helm-sdk |
0.3.0 | Preview | sdk/rust/ |
c := helm.New("http://localhost:8080")
res, err := c.ChatCompletions(helm.ChatCompletionRequest{
Model: "gpt-4",
Messages: []helm.ChatMessage{{Role: "user", Content: "List /tmp"}},
})
if apiErr, ok := err.(*helm.HelmApiError); ok {
fmt.Println("Denied:", apiErr.ReasonCode) // DENY_TOOL_NOT_FOUND
}
What Ships
| Capability |
|---|
| OpenAI-compatible governed proxy |
| Fail-closed MCP interceptor |
| Schema PEP (input + output validation) |
| ProofGraph DAG (Lamport + Ed25519) |
| WASI sandbox (gas/time/memory budgets) |
| Approval ceremonies (timelock + challenge) |
| Trust registry (event-sourced) |
| EvidencePack export + offline replay |
| Proof Condensation (Merkle checkpoints) |
| CPI (Canonical Policy Index) |
| HSM signing (Ed25519) |
| Policy Bundles (load, verify, compose) |
| Capability manifests + connector bundles |
| Conformance L1 + L2 |
| 20+ CLI commands |
Not included: managed federation, pack entitlement, compliance intelligence, Studio, managed control plane. See docs/OSS_SCOPE.md.
Security
- TCB isolation — 8-package kernel boundary, CI-enforced forbidden imports (TCB Policy)
- Bounded compute — WASI sandbox with gas/time/memory caps, deterministic traps (UC-005)
- Schema enforcement — JCS canonicalization + SHA-256 on every tool call, input and output (UC-002)
- Three-layer model — surface containment + dispatch enforcement + verifiable receipts (Execution Security Model)
SECURITY.md · Threat Model · OWASP MCP Mapping
Build & Test
make test # 115 packages
make crucible # 12 use cases + conformance L1/L2
make lint # go vet
Deploy
docker compose up -d # local
docker compose -f docker-compose.demo.yml up -d # production
Project Structure
helm-oss/
├── api/openapi/ # OpenAPI 3.1 spec (source of truth for SDKs)
├── core/ # Go kernel (8-package TCB + executor + ProofGraph)
│ └── cmd/helm/ # CLI: proxy, export, verify, replay, conform
├── packages/
│ └── mindburn-helm-cli/ # @mindburn/helm-cli (npm verifier)
├── sdk/ # TypeScript, Python, Go, Rust
├── examples/ # Runnable per-language + MCP examples
├── deploy/ # Caddy, compose, deploy guide
├── docs/ # Threat model, security model, conformance
└── Makefile # build, test, crucible, release-binaries
Contributing
See CONTRIBUTING.md.
License
Built by Mindburn Labs — applied research for execution security in autonomous systems.
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found