hecate (mcp)

Security Audit — Fail

Health — Pass
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 10 GitHub stars
Code — Fail
  • rm -rf — Recursive force deletion command in .claude/settings.json
  • spawnSync — Synchronous process spawning in scripts/capture-screenshots.ts
  • process.env — Environment variable access in scripts/capture-screenshots.ts
  • network request — Outbound network request in scripts/capture-screenshots.ts
Permissions — Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is an open-source AI gateway and agent-task runtime that acts as a central control plane. It sits between AI clients and model providers to manage routing, spend controls, and task execution, while offering observability and multi-tenant management.

Security Assessment
The application inherently manages sensitive data like API keys and environment variables to route AI requests, and it makes outbound network requests to external model providers. However, several specific code patterns raised flags during the audit. A recursive force deletion command (`rm -rf`) was found in a configuration file, and a script uses synchronous process spawning. Additionally, that same script accesses environment variables and makes network requests. There are no hardcoded secrets, and the tool does not request broadly dangerous permissions. Overall risk is rated as Medium.

Quality Assessment
The project is actively maintained, with its most recent push occurring today. It uses the permissive MIT license and has a solid README with clear documentation, a test badge, and Go Report Card integration. However, community trust and adoption are currently low, with only 10 GitHub stars.

Verdict
Use with caution: the tool is active and properly licensed, but its low community adoption and the presence of potentially risky commands in its scripts warrant a careful security review before deploying in production.
SUMMARY

Open-source AI gateway and agent-task runtime that gives teams one operational control plane across cloud and local models, with built-in policy, spend controls, and first-class OpenTelemetry.

README.md

Hecate

Badges: Test · Go Report Card · Go version · License · OpenTelemetry

Hecate is an open-source AI gateway and agent-task runtime for teams that want one control plane for model access, cost governance, routing, caching, observability, and controlled agent execution.

It sits between AI clients and model providers. Existing OpenAI-compatible and Anthropic-compatible clients can point at Hecate, while operators get a place to manage providers, costs, traces, cache behavior, and queued agent work. Multi-tenant management is opt-in — the default deployment is a single-user gateway with one admin bearer.


Why Hecate

AI workloads are moving from simple API calls to long-running agents, tool use, local/cloud routing, and budget-sensitive automation. Hecate is built for that messier runtime layer.

  • One gateway for many clients — OpenAI Chat Completions and Anthropic Messages shapes.
  • Local and cloud providers together — OpenAI, Anthropic, Ollama, LM Studio, LocalAI, llama.cpp-compatible servers, and other shipped presets.
  • Operator-controlled spend — balances, pricebook management, rate limits, audit history, and (opt-in) per-tenant API keys with model/provider scoping.
  • Runtime visibility — request ledger, route reports, failover details, cost, cache path, trace IDs, and OpenTelemetry export.
  • Agent-task runtime — queued tasks, approvals, controlled shell/file/git execution, resumable runs, and MCP integration.
  • Single binary deploy — Go gateway with the React operator UI embedded via go:embed. One process, one port, one volume; no separate frontend service to run.
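Because Hecate speaks the OpenAI wire shape, many existing clients can be repointed without code changes. A minimal sketch, assuming a client that honors the common OPENAI_BASE_URL / OPENAI_API_KEY convention (those variable names are an assumption, not something this README specifies; check your client's docs for its base-URL override):

```shell
# Hypothetical client-side wiring: route an OpenAI-compatible client through
# Hecate instead of the provider directly. OPENAI_* names are an assumption.
export OPENAI_BASE_URL="http://127.0.0.1:8765/v1"
export OPENAI_API_KEY="paste-your-admin-bearer-here"

# Any client honoring these variables now sends requests via the gateway:
echo "requests go to ${OPENAI_BASE_URL}/chat/completions"
```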

Modes

Hecate runs in one of two modes, selected by a flag at startup; you can switch between runs without losing state.

| | Single-user (default) | Multi-tenant (opt-in) |
| --- | --- | --- |
| Flag | GATEWAY_MULTI_TENANT=false | GATEWAY_MULTI_TENANT=true |
| Auth | One admin bearer; loopback handshake auto-fills it for same-host browsers. | Admin bearer plus per-tenant API keys, each scoped to allowed providers and models. |
| Operator UI | Chats, Providers, Tasks, Observability, Costs, Settings (Pricing / Policy / Retention). | Same plus the Tenants and Keys tabs in Settings. |
| Observability | Admin sees everything; tenants see nothing because there are no tenants. | Tenants see their own traces / requests / runtime stats via /v1/* mirrors of the /admin/* endpoints. |
| Use when | One operator on one host; local dev; a personal gateway behind a single key. | Multiple consumers, per-key audit, scoped credentials. |

The published Docker image ships single-user. Full breakdown in docs/tenants.md.
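Opting the published image into multi-tenant mode only requires setting the flag at startup. A sketch reusing the Quick Start command, with everything except the added environment variable unchanged:

```shell
# Same as the Quick Start run, plus the multi-tenant flag:
docker run --rm -p 8765:8765 -v hecate-data:/data \
  -e GATEWAY_MULTI_TENANT=true \
  ghcr.io/chicoxyzzy/hecate:0.1.0-alpha.7
```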

Quick Start

Single-user path; for multi-tenant see docs/tenants.md.

1. Run the image

docker run --rm -p 8765:8765 -v hecate-data:/data \
  ghcr.io/chicoxyzzy/hecate:0.1.0-alpha.7

2. Open the UI

Open http://127.0.0.1:8765. On a localhost browser the console picks up the generated admin bearer through a same-origin loopback handshake — no token paste needed.

3. Add a provider

The Providers tab starts empty. Click Add provider, pick a preset (or Custom for any OpenAI-compatible endpoint), and paste an API key (cloud) or endpoint URL (local).

Empty Providers tab on first boot — Add provider CTA

Add provider modal on the Cloud tab — preset catalog

Providers table populated with three providers — Health, Endpoint, Credentials, Models

Cloud presets need an API key; local presets just need the runtime listening on its default port. Full catalog, custom-endpoint walk-through, and credential rotation in docs/providers.md.

4. Talk to it

Chats workspace talking to a local Ollama llama3.1:8b model with sessions sidebar and inline runtime metadata
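The same conversation works from the command line. A hedged sketch of a Chat Completions call through the gateway: the endpoint path comes from the Architecture section, the model name llama3.1:8b from the screenshot above, and ADMIN_TOKEN stands in for your bearer. The payload is validated locally first; the curl itself is shown commented out so the shape is clear even without a running gateway.

```shell
# Example Chat Completions payload; "llama3.1:8b" matches the local Ollama
# model shown above. ADMIN_TOKEN is a placeholder for your bearer token.
BODY='{"model":"llama3.1:8b","messages":[{"role":"user","content":"hello"}]}'

# Sanity-check the JSON locally:
echo "$BODY" | python3 -m json.tool

# Then send it through the gateway:
# curl -s http://127.0.0.1:8765/v1/chat/completions \
#   -H "Authorization: Bearer $ADMIN_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
```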

Remote browsers, reverse proxies, and cross-origin setups

The loopback handshake only fires for same-host browsers. Anywhere else (Tailscale, port-forward over SSH, reverse proxy with a different hostname) the UI shows a token paste prompt:

Token paste prompt — remote / cross-origin browsers

The bootstrap token is printed once to the container logs:

============================================================
  Hecate first-run setup — admin bearer token generated.

    7f2a91b... (truncated)

  Saved to /data/hecate.bootstrap.json (mode 0600).
============================================================

It also lives in hecate.bootstrap.json on the hecate-data volume — recovery instructions in docs/deployment.md.

Other install paths (clone, Postgres profile, source build, env-as-code)

Cloning the repo lets you pick up optional compose profiles or rebuild from source:

docker compose up                    # uses the ghcr.io image; first run pulls
docker compose --profile postgres up # adds Postgres for durable state across all subsystems

Local development:

make dev

Pinned image tags, single-file binaries (linux/darwin × amd64/arm64), and checksums are in docs/deployment.md. Local development knobs in docs/development.md.

Provider keys can also be pre-seeded via .env for fleet automation — PROVIDER_<NAME>_API_KEY, _BASE_URL, _DEFAULT_MODEL, plus the _PRECONFIGURED=1 gate. See docs/providers.md. The /admin/control-plane/providers endpoints mirror every UI action for programmatic management.
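Following the PROVIDER_<NAME>_* convention above, a pre-seeded .env entry might look like this (the provider name, model, and key values are illustrative placeholders, not values the project prescribes):

```shell
# .env sketch -- PROVIDER_<NAME>_* keys per docs/providers.md;
# all values below are placeholders.
PROVIDER_OPENAI_API_KEY=sk-your-key-here
PROVIDER_OPENAI_BASE_URL=https://api.openai.com/v1
PROVIDER_OPENAI_DEFAULT_MODEL=gpt-4o-mini
PROVIDER_OPENAI_PRECONFIGURED=1
```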

Architecture

One Go process, one port. Inside it: a chat/messages gateway that mediates client traffic to upstream providers, and a task runtime that queues agent work, drives approvals, and shells out through a sandbox boundary. The React operator UI is embedded into the same binary and served from the same port.

flowchart LR
    Clients["Clients<br/>Codex, Claude Code, SDKs"]
    Browser["Browser<br/>(operator UI)"]

    subgraph Hecate["Hecate (single binary, :8765)"]
        direction TB
        Gateway["Gateway<br/>/v1/chat/completions<br/>/v1/messages<br/>/v1/models"]
        Runtime["Task runtime<br/>/v1/tasks/*<br/>queue + workers + sandbox"]
        UI["Embedded UI<br/>(go:embed ui/dist)"]
    end

    Clients --> Gateway
    Clients --> Runtime
    Browser --> UI
    UI --> Gateway
    UI --> Runtime

    Gateway --> Providers["Cloud + local providers"]
    Gateway --> Cache["Exact + semantic cache"]
    Runtime --> Sandbox["sandboxd<br/>(out-of-process exec)"]
    Runtime --> MCP["External MCP servers"]
    Gateway --> OTel["OpenTelemetry"]
    Runtime --> OTel

For deeper internals, read docs/architecture.md, docs/runtime-api.md, and docs/events.md.

Operator UI

The embedded UI is a runtime console for operators.

  • Chats — send requests through Hecate, choose provider/model, inspect per-turn route/cost/cache metadata.
  • Providers — manage provider credentials, defaults, model discovery, base URLs, and health.
  • Tasks — create and manage agent runs, approvals, retries, resumes, and streamed output.
  • Observability — inspect requests, route candidates, skip reasons, failover, costs, cache decisions, and trace events.
  • Costs — balance, top-up / reset, usage table.
  • Settings — pricebook, policy rules, retention, and (when GATEWAY_MULTI_TENANT=true) tenants + API keys.
Various UI screenshots

Observability view — request ledger and route-report drilldown

Tasks workspace — task list with run state and approval queue

Costs workspace — balance card and per-key usage table

Settings → Pricebook — model catalog with priced / unpriced / deprecated filters

What Works Today

Hecate is public-alpha software. The core gateway path is usable; the agent runtime and sandbox are intentionally still evolving.

| Area | State | Notes |
| --- | --- | --- |
| OpenAI-compatible gateway | Usable | Chat Completions, streaming, vision, model discovery |
| Anthropic-compatible gateway | Usable | Messages API shape, streaming translation, Claude Code support |
| Provider catalog | Usable | Built-in presets, encrypted credentials, health, routing readiness |
| Local providers | Usable | Ollama, LM Studio, LocalAI, llama.cpp-compatible servers |
| Auth | Usable | Admin bearer with same-origin loopback handshake; GATEWAY_AUTH_DISABLED for upstream-terminated auth |
| Tenants and API keys | Opt-in | GATEWAY_MULTI_TENANT=true exposes tenant + key management with provider/model scoping |
| Budgets and rate limits | Usable | Balances, warning thresholds, pricebook, 429 rate-limit headers |
| Caching | Usable | Exact cache; semantic cache is available but still early |
| OpenTelemetry | Usable | OTLP traces, metrics, logs, response headers, local trace view |
| Storage tiers | Usable | Memory, SQLite, Postgres, selected per subsystem |
| Operator UI | Usable | Main workflows are present; debugging ergonomics are still improving |
| Agent task runtime | Alpha | Queues, approvals, resumable runs, agent_loop, MCP integration |
| Execution isolation | Alpha | sandboxd boundary exists; stronger OS-level isolation is future work |

Read docs/known-limitations.md before treating Hecate as production-stable.

Configuration

The README intentionally stays light on configuration; the source of truth is the documentation.

Documentation

Browse the full index at docs/README.md. Highlights:

Contributing

See CONTRIBUTING.md. If you work with an AI assistant, start with AGENTS.md; the vendor-neutral agent instruction layer lives in ai/.

License

MIT. See LICENSE.

Third-party data and software notices live in NOTICE.md, including LiteLLM pricing-data attribution.
