felix
Health: Warn
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code: Pass
- Code scan — Scanned 1 file during light audit, no dangerous patterns found
Permissions: Pass
- Permissions — No dangerous permissions requested
This tool is a self-hosted AI agent gateway that connects to multiple LLM providers via CLI or web chat. It processes AI interactions and allows agents to execute tasks entirely on your local hardware.
Security Assessment
Overall risk: Medium. The tool operates entirely on your hardware and accesses local files, executes shell commands, and makes web requests on behalf of the AI agents. However, it is designed with a strong "secure by default" philosophy: it restricts file access to a specific agent workspace, runs shell commands in a restricted allowlist mode, blocks web requests to internal IPs, and binds exclusively to localhost. No hardcoded secrets were found during the code scan, and it does not request dangerous system permissions.
Quality Assessment
The project is highly active, with its most recent code push occurring today. Despite this active maintenance, it currently has very low community visibility with only 6 GitHub stars, meaning it has not been broadly tested or vetted by a wide audience. Additionally, the repository lacks a license file, which is a significant concern for developers: without an explicit license, default copyright law restricts how the code can be used, modified, or distributed.
Verdict
Use with caution — it features strong built-in security and active maintenance, but the lack of a license and minimal community validation warrant hesitation before relying on it for critical tasks.
Felix — single-binary AI agent gateway. Multi-provider LLM, persistent memory, MCP client, runs entirely on your hardware

Felix
A self-hosted AI agent gateway written in Go. Single binary, low memory, runs entirely on your own machine.
Felix connects you (via CLI or web chat) to LLMs — Claude, GPT, Gemini, Qwen, Ollama, or any OpenAI-compatible endpoint — and lets agents execute tasks on your hardware using a fixed registry of in-process tools plus any number of remote MCP servers.
Design philosophy
- Self-sufficient. One binary, one directory of state, no required network dependency. The LLM can be local. The vector index is in-process. The knowledge graph is a SQLite file. There is no Felix cloud, no Felix account, no Felix backend that anyone could turn off.
- Robust. Long-running agents touch files, shell out, talk to flaky APIs, and accumulate state across restarts. Every external call has a timeout. Every queue has a cap. Every per-call resource has a paired cleanup. On-disk state heals itself on the next load.
- Usable out of the box by non-technical people. The default install — no config edits, no API keys, no `vim` — must just work. Advanced configuration can be as complex as it needs to be, but it must not be in the way of the default path.
- Secure by default. An agent that can read files, run shell commands, and make web requests is genuinely powerful — defaults have to protect users who won't read the security docs. Felix binds to localhost only, ships the bash tool in allowlist mode rather than full shell access, blocks web requests to internal IP ranges and cloud metadata endpoints, contains file access to each agent's workspace with symlink resolution, and writes config and session files with owner-only permissions. You can relax any of it deliberately; you don't have to opt out of it.
Features
Interfaces
- Single binary, no runtime dependencies. Download and run.
- macOS / Windows system tray app that runs the gateway in the background and serves a web chat at `http://127.0.0.1:18789/chat`.
- `felix chat` CLI that auto-detects a running gateway and shares its session, memory, and MCP state — start in the browser, continue in the terminal.
- WebSocket JSON-RPC 2.0 control plane for programmatic access.
Models
- Claude, GPT, Gemini, Qwen, Ollama, LM Studio, DeepSeek, or any OpenAI-compatible API.
- Bundled Ollama runtime so you can run agents with no API key. Downloads `gemma4` on first startup if no other models are available.
- Per-agent extended reasoning (`off|low|medium|high`) mapped to Claude thinking budgets, OpenAI `reasoning_effort`, Gemini `ThinkingConfig`, and Qwen `enable_thinking`.
- Context-window auto-detection from the model identifier (handles proxy prefixes correctly), with a per-agent `contextWindow` override for unusual fine-tunes.
- Cross-provider tool portability: JSON Schema fields one provider rejects (Gemini drops `anyOf`/`oneOf`/`format`; OpenAI drops `$ref`/`definitions`) are stripped at the provider boundary.
Memory & knowledge
- Persistent memory: BM25 lexical search over Markdown files, recalled automatically each turn. Optional vector search via `chromem-go` when an embedding provider is configured.
- Cortex knowledge graph (SQLite) that ingests completed conversations and surfaces relevant facts on subsequent turns.
- Skill system: Markdown files with YAML frontmatter, lazily loaded by the agent on demand from a system-prompt index. Bundled starters (`ffmpeg`, `imagemagick`, `pandoc`, `pdftotext`, `cortex`) seeded on first run; user skills are managed live from the Settings UI.
Agents & tools
- Multiple agents per install, each with its own model, workspace, persona, and tool policy.
- Subagents invocable via the `task` tool, so a supervisor can delegate to a specialist with a different model.
- Per-agent allow/deny lists for every built-in and MCP-provided tool.
- Vision: paste image paths in the CLI or drop them in web chat; bytes go straight to the model.
- Cron jobs: recurring prompts on configurable intervals, with pause/resume/remove management.
- MCP client: Streamable-HTTP and stdio transports, OAuth2 (client credentials, authorization code + PKCE) and bearer auth, in-chat re-authentication, per-server circuit breaker.
Robustness
- Append-only JSONL session storage with a DAG view; compaction is splice-based and never destructive.
- Smart compaction: token-threshold or message-count triggered, three-stage fallback chain, per-session circuit breaker, runs asynchronously between turns.
- Cache-stability invariant: request prefixes are byte-stable across turns (sorted tool defs, deterministic schema normalization) so Anthropic and OpenAI prompt caches keep hitting.
- Stream-failure resilience: when a streaming response dies mid-flight, the runtime discards the partial output and retries via the provider's non-streaming endpoint, preserving the byte-identical prompt prefix.
- Config hot-reload: edit `felix.json5` while running, changes apply immediately.
Operations
- All state in `~/.felix/` as plain files. No external database.
- OpenTelemetry export (opt-in): traces, metrics, and logs to any OTLP/HTTP collector via config or standard `OTEL_*` env vars.
- Localhost-only by default; optional bearer token auth on all HTTP and WebSocket endpoints.
Install
macOS — signed .pkg (recommended)
Download the latest Felix-vX.Y.Z-signed.pkg from the GitHub Releases page. Signed with Developer ID and notarized by Apple, so Gatekeeper accepts it.
The installer drops Felix.app into /Applications, bundles the felix and felix-app binaries plus a copy of ollama, seeds the bundled starter skills, and symlinks the CLI at /usr/local/bin/felix so felix chat / felix doctor work in any terminal.
On first launch, Felix.app opens http://127.0.0.1:18789/settings#models and starts pulling gemma4 (~9.6 GB chat model) and nomic-embed-text (~270 MB embeddings) in the background. Once the chat model is on disk, click Chat in the menu bar to start talking. Zero config, no API keys.
To uninstall: rm /usr/local/bin/felix && rm -rf /Applications/Felix.app ~/.felix/.
Build from source (Linux, Windows, or developers)
make build # CLI binary -> ./felix
make build-app # macOS menu bar app -> Felix.app (also bundles felix and ollama)
make build-app-windows # Windows system tray app -> felix-app.exe
Then run ./felix onboard to walk through provider setup. If you skip every cloud provider, the wizard configures the bundled Ollama with gemma4 so you have a working agent with zero credentials.
First steps
./felix onboard # First-time setup wizard (writes ~/.felix/felix.json5)
./felix chat # Interactive CLI chat
./felix start # Run the gateway (web chat at http://127.0.0.1:18789/chat)
open Felix.app # macOS tray launcher (auto-spawns the gateway)
./felix doctor # Sanity-checks config, data dirs, API keys, workspaces
./felix chat automatically detects a running gateway and shares its session state with the web chat. Pass --no-gateway to force an isolated in-process runtime; pass -m provider/model to override the model for one session (also forces in-process).
Inside felix chat, slash commands manage sessions and screenshots:
> /sessions # list sessions for this agent
> /new myproject # start a new named session
> /switch myproject # switch to an existing session
> /compact # manually compact the active session
> /screenshot # capture a window and analyze it (in-process mode only)
> /quit
Image attachments work too — type a file path in the message:
> What's in this image? ~/Downloads/photo.png
> Describe '/Users/me/My Photos/vacation.png'
> Tell me about this /Users/me/My\ Photos/diagram.png
Supported formats: .jpg, .jpeg, .png, .gif, .webp, .bmp (max 10 MB).
CLI commands
| Command | Description |
|---|---|
| `felix onboard` | Interactive setup wizard |
| `felix start` | Start the gateway server |
| `felix start -c path/to/config.json5` | Custom config |
| `felix chat [agent]` | Interactive CLI chat (auto-detects running gateway) |
| `felix chat --no-gateway` | Force in-process runtime, ignore any running gateway |
| `felix chat -m provider/model` | Override model for this session (forces in-process) |
| `felix clear [agent]` | Clear local CLI session history |
| `felix sessions [agent]` | List sessions for an agent |
| `felix model list \| pull <name> \| rm <name> \| status` | Manage local Ollama models |
| `felix mcp login <id>` | Run interactive OAuth login for an MCP server |
| `felix status` | Query the running gateway for agent status |
| `felix doctor` | Diagnostic checks |
| `felix version` | Print version + commit |
System tray app
A thin launcher around the gateway, on macOS and Windows. Spawns felix start as a separate child process so a tray reap (display sleep, fast user switching, memory pressure) doesn't take your active chat down — only the icon disappears, and relaunching reattaches via /health.
Menu: Chat, Jobs, Logs, Settings, Restart, Quit.
The Settings page has tabs for Agents, Providers, Models, Intelligence, Security, Messaging, MCP, Skills, Memory, and Gateway — most things you'd otherwise edit in felix.json5 are reachable here.
Web chat at /chat: agent + session selectors, streaming responses, light/dark toggle, inline tool-call display with collapsible output, inline "Re-authenticate" button when an MCP token expires, live trace panel.
Environment variables. macOS .app bundles don't inherit shell environment variables; Felix.app loads ~/.zshrc / ~/.bashrc at startup, so export ANTHROPIC_API_KEY=... works. On Windows, set via System Settings or PowerShell [System.Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY","sk-ant-...","User"). Either way, you can put keys directly in felix.json5 instead.
Configuration
~/.felix/felix.json5 (Windows: C:\Users\<you>\.felix\felix.json5). JSON5 means comments and trailing commas are allowed. Hot-reloaded — edits apply immediately, no restart needed.
Minimal config
{
"providers": {
"anthropic": { "kind": "anthropic", "api_key": "sk-ant-..." }
},
"agents": {
"list": [
{ "id": "default", "name": "Felix", "model": "anthropic/claude-sonnet-4-5" }
]
}
}
API keys via environment
Environment variables take precedence over config-file values. The convention is {PROVIDER}_API_KEY (or {PROVIDER}_AUTH_TOKEN) and {PROVIDER}_BASE_URL, where {PROVIDER} is the uppercased provider name from your config:
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-proj-..."
export GEMINI_API_KEY="AIza..."
export DEEPSEEK_API_KEY="sk-..."
LLM providers
Felix supports multiple providers simultaneously. Each is defined in the providers block with a unique name and a kind:
| Kind | Description | Use for |
|---|---|---|
| `anthropic` | Anthropic's native API | Claude models |
| `openai` | OpenAI's native API | GPT models |
| `gemini` | Google's native Gemini SDK | Gemini models |
| `qwen` | Alibaba Cloud DashScope | Qwen models |
| `openai-compatible` | Anything implementing `/v1/chat/completions` | Ollama, LM Studio, DeepSeek, LiteLLM, vLLM |
| `local` | Bundled Ollama supervised by Felix | Fully offline / no API key |
Per-provider setup
// Anthropic — get a key at https://console.anthropic.com
"anthropic": { "kind": "anthropic", "api_key": "sk-ant-..." }
// Models: claude-sonnet-4-5, claude-opus-4, claude-haiku-3-5
// OpenAI — get a key at https://platform.openai.com/api-keys
"openai": { "kind": "openai", "api_key": "sk-proj-..." }
// Models: gpt-5, gpt-5-mini, o3-mini
// Google Gemini — get a key at https://aistudio.google.com/apikey
"gemini": { "kind": "gemini", "api_key": "AIza..." }
// Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash
// Qwen (Alibaba)
"qwen": { "kind": "qwen", "api_key": "sk-..." }
// Models: qwen-plus, qwen-max, qwen-turbo
// External Ollama (running outside Felix)
"ollama": { "kind": "openai-compatible", "base_url": "http://localhost:11434/v1" }
// Models: any tag pulled into Ollama (llama3.2, qwen2.5, mistral, llava, ...)
// LM Studio (default port 1234)
"lmstudio": { "kind": "openai-compatible", "base_url": "http://localhost:1234/v1" }
// DeepSeek — get a key at https://platform.deepseek.com
"deepseek": {
"kind": "openai-compatible",
"api_key": "sk-...",
"base_url": "https://api.deepseek.com/v1"
}
// Models: deepseek-chat, deepseek-coder, deepseek-reasoner
// Bundled Ollama (wired up automatically by `felix onboard`)
"local": { "kind": "local", "base_url": "http://127.0.0.1:18790/v1" }
Model references
Agents reference models as provider/model-name where the provider name matches a key in the providers block:
anthropic/claude-sonnet-4-5 → uses the "anthropic" provider
openai/gpt-5 → uses the "openai" provider
ollama/llama3.2 → uses the "ollama" provider
local/gemma4 → uses the bundled local Ollama
deepseek/deepseek-chat → uses the "deepseek" provider
You can override the model for one CLI session: felix chat -m openai/gpt-5.
Reasoning levels
Per-agent reasoning: `off|low|medium|high` (default `off`). Maps to Anthropic thinking budgets, OpenAI `reasoning_effort`, Gemini `ThinkingConfig`, and Qwen `enable_thinking`. Models that don't support extended reasoning log a "reasoning ignored" message and proceed normally. Editable in the Settings UI's Agents tab.
Multiple agents
You can run multiple agents per install, each with its own model, workspace, system prompt, and tool policy:
{
"agents": {
"list": [
{
"id": "coder",
"name": "Coder",
"model": "anthropic/claude-sonnet-4-5",
"workspace": "~/code/myproject",
"tools": { "allow": ["read_file", "write_file", "edit_file", "bash"] }
},
{
"id": "researcher",
"name": "Researcher",
"model": "openai/gpt-5",
"workspace": "~/.felix/workspace-researcher",
"tools": { "allow": ["read_file", "web_fetch", "web_search"] }
},
{
"id": "local",
"name": "Local Assistant",
"model": "local/gemma4",
"workspace": "~/.felix/workspace-local",
"tools": { "allow": ["read_file"] }
}
]
}
}
Chat with a specific agent: felix chat coder (or pick from the dropdown in the web chat header).
Agent identity
Each agent's system prompt is resolved in this priority order:
1. `system_prompt` field in the agent's config (if non-empty)
2. `IDENTITY.md` in the agent's workspace directory
3. Built-in default — a generic helpful-assistant prompt, dynamically tailored to whichever tools the agent actually has (an agent without `web_search` won't claim it can search the web)
Inline:
{
"id": "coder",
"system_prompt": "You are a senior Go developer. Idiomatic stdlib-first code. Always write tests."
}
Or in ~/.felix/workspace-coder/IDENTITY.md for prompts long enough to be uncomfortable in JSON.
Each agent automatically knows its own name and ID, what other agents exist (so it can suggest delegation), and which tools it has. No need to repeat any of that in the prompt.
Subagents and the task tool
A subagent is an agent another agent can delegate work to via the built-in task tool. The supervisor's LLM sees task like any other tool, picks a subagent by ID, and gets back the subagent's final text. The subagent runs with its own model, workspace, and tool policy.
Two flags wire it up:
- `subagent: true` on the target agent — the opt-in. Without this, the agent is invisible to `task`.
- `task` in the supervisor's `tools.allow` — without this, the supervisor's LLM never sees the tool.
{
"agents": {
"list": [
// Supervisor — the agent you chat with
{
"id": "lead",
"name": "Tech Lead",
"model": "anthropic/claude-sonnet-4-5",
"tools": { "allow": ["read_file", "bash", "task", "todo_write"] }
},
// Subagent: cheap web research
{
"id": "researcher",
"model": "local/gemma4",
"subagent": true,
"description": "Searches the web and summarises sources. Returns a short bulleted brief with citations.",
"tools": { "allow": ["web_search", "web_fetch", "read_file"] }
},
// Subagent: read-only code review on a more careful model
{
"id": "reviewer",
"model": "anthropic/claude-opus-4",
"workspace": "~/code/myproject", // share the supervisor's workspace
"subagent": true,
"description": "Reviews code changes for correctness, security, and clarity.",
"inheritContext": true, // sees the supervisor's conversation
"tools": { "allow": ["read_file", "bash"] }
}
]
}
}
inheritContext: true loads the supervisor's conversation into the subagent's first turn. Useful for "review what I just did" patterns; expensive in tokens. Default is false (cold start with just the prompt).
Common gotchas:
- `task` not in supervisor's `tools.allow` → supervisor never sees the tool.
- `subagent: true` missing on the target → `task` returns `agent X is not registered as a subagent`.
- Subagents can't themselves delegate (no `task` tool registered for them); a depth cap of 3 is enforced as defense-in-depth.
- Multiple parallel `task` calls aren't supported — they run sequentially.
Tools
Built-in tools the agent can use:
| Tool | Description |
|---|---|
| `read_file` | Read file contents (text + images for vision-capable models) |
| `write_file` | Create or overwrite files |
| `edit_file` | Targeted edits to existing files |
| `bash` | Execute shell commands (with deny / allowlist / full exec-approval levels) |
| `web_fetch` | Fetch a URL and return its content |
| `web_search` | Web search (DuckDuckGo by default; pluggable backend) |
| `browser` | Headless Chrome (navigate, click, type, screenshot, evaluate JS) |
| `cron` | Dynamically schedule, list, pause, resume, remove, update recurring tasks |
| `send_message` | Send outbound messages (currently Telegram via Bot API) |
| `todo_write` | Per-workspace persistent todo list for long, multi-stage work |
| `task` | Delegate a subtask to another configured agent |
| `load_skill` | Load a single skill body on demand by name |
| `load_memory` | Load a single memory entry body by id |
Tool access is per-agent allow/deny, configurable from the Settings UI's Agents tab. MCP-provided tools are wrapped to the same Tool interface and gated by the same allow/deny mechanism — the LLM can't tell the difference.
Tool policies
// Read-only agent (safe for untrusted use)
"tools": { "allow": ["read_file", "web_fetch", "web_search"] }
// Everything except shell
"tools": { "allow": ["*"], "deny": ["bash"] }
// Full default access
"tools": {
"allow": ["read_file", "write_file", "edit_file", "bash",
"web_fetch", "web_search", "browser"]
}
Bash exec policy
For additional safety, bash has its own command-level gate independent of the tool allowlist:
{
"security": {
"execApprovals": {
"level": "allowlist",
"allowlist": ["ls", "cat", "find", "grep", "head", "tail", "wc", "pwd", "date"]
}
}
}
Levels: full (default — anything goes), allowlist (only listed commands; shell metacharacters like $(...) and backticks are blocked), deny (no execution at all).
Browser tool
Headless Chrome automation; requires Chrome or Chromium installed on the host. Each invocation creates a fresh context, so cookies don't persist across calls — chain actions in one conversation turn for multi-step workflows.
| Action | Description |
|---|---|
| `navigate` | Navigate to a URL |
| `click` | Click an element by CSS selector |
| `type` | Type text into an input by CSS selector |
| `get_text` | Extract text from an element or full page |
| `screenshot` | Take a screenshot |
| `evaluate` | Run arbitrary JavaScript and return the result |
All actions except navigate accept an optional url parameter — provide it and the browser navigates first, all in one tool call.
Send message (Telegram outbound)
Felix can push messages to a Telegram chat via the Bot API. Outbound only — there is no inbound Telegram channel.
{
"telegram": {
"enabled": true,
"bot_token": "123456:ABC-DEF...", // from @BotFather
"default_chat_id": "123456789" // optional fallback
},
"agents": {
"list": [
{ "id": "default", "tools": { "allow": ["send_message"] } }
]
}
}
Useful for cron-driven alerts ("ping me when the build fails"). To find your chat ID, message your bot once and read the chat_id from the gateway log, or use any "userinfobot".
Dynamic cron
The cron tool lets the agent create, list, pause, resume, remove, and update recurring scheduled tasks at runtime — without editing the config:
You: Check disk usage every hour and warn me if it's above 80%.
Agent: [cron: action="add", name="disk-check", schedule="1h",
prompt="Check disk usage with 'df -h'. Alert me if any partition >80%."]
Done.
You: Pause the disk check job.
Agent: [cron: action="pause", name="disk-check"]
Static cron jobs go in agents[].cron[] in the config and persist across restarts. Dynamic jobs added via the tool also persist (~/.felix/cron-jobs.json). Both use the same scheduler. Schedule values are Go duration strings (30m, 1h, 24h).
MCP servers
Felix can connect to external Model Context Protocol servers and expose their tools alongside built-ins. Two transports: HTTP (Streamable HTTP, with OAuth2 / bearer / no auth) and stdio (Felix spawns the child process).
{
"mcp_servers": [
// OAuth2 client-credentials (machine-to-machine)
{
"id": "remote-tools",
"transport": "http",
"enabled": true,
"tool_prefix": "remote_",
"http": {
"url": "https://mcp.example.com/v1",
"auth": {
"kind": "oauth2_client_credentials",
"token_url": "https://auth.example.com/oauth/token",
"client_id": "felix-prod",
"client_secret_env": "REMOTE_MCP_SECRET",
"scope": "mcp.read mcp.write"
}
}
},
// OAuth2 authorization code + PKCE (user login)
// Initial login: `felix mcp login user-tools`
// Refresh: automatic, persisted to token_store_path
// Expiry mid-chat: inline "Re-authenticate" button in the web UI
{
"id": "user-tools",
"transport": "http",
"enabled": true,
"http": {
"url": "https://mcp.example.com/v1",
"auth": {
"kind": "oauth2_authorization_code",
"auth_url": "https://auth.example.com/oauth/authorize",
"token_url": "https://auth.example.com/oauth/token",
"client_id": "felix-cli",
"redirect_uri": "http://127.0.0.1:18800/callback",
"scope": "openid offline_access mcp.user",
"token_store_path": "~/.felix/mcp-tokens/user-tools.json"
}
}
},
// Stdio: Felix spawns the child process, inherits PATH
{
"id": "fs-tools",
"transport": "stdio",
"enabled": true,
"stdio": {
"command": "uvx",
"args": ["mcp-server-filesystem", "/Users/me/projects"]
}
}
]
}
Tools discovered from an MCP server are auto-added to agent allowlists at startup. Servers can also be edited from the Settings UI's MCP tab. A per-server circuit breaker stops calling a stuck upstream after 3 consecutive auth failures so the agent can't fall into a token-burning self-heal loop.
Skills
Skills are Markdown files with YAML frontmatter that get injected into the agent's context when relevant. They teach domain-specific knowledge without modifying code.
---
name: git-workflow
description: Guidelines for using git in this project
tags: [git, version-control, commit]
---
## Git Workflow
- Always create feature branches from `main`
- Use conventional commit messages: `feat:`, `fix:`, `docs:`, `refactor:`
- Run tests before committing
- Squash merge into main
Where they live:
- `~/.felix/skills/` — shared across all agents
- `<agent-workspace>/skills/` — agent-specific
How they're matched. Felix injects only the index (name + description + tags) into every turn's system prompt. The model decides when it needs a skill body and calls the load_skill tool to fetch it on demand. This keeps the cached prompt prefix small and stable, which keeps prompt caches hitting.
Manage from the UI. Open /settings → Skills. Upload a .md file (256 KB max) and it's available on the next chat turn — no restart. View raw markdown, delete, or check for parse warnings inline. The same operations are exposed via REST at GET/POST/DELETE /settings/api/skills*.
The default install seeds cortex, ffmpeg, imagemagick, pandoc, and pdftotext so the agent arrives knowing how to reason about common command-line tools.
Memory
{ "memory": { "enabled": true } }
Memory entries are Markdown files in ~/.felix/memory/entries/. The agent can create, update, and delete entries during conversations, and BM25 search surfaces relevant ones each turn. The model sees only the index in the system prompt; bodies are loaded via the load_memory tool on demand (same lazy-hydration pattern as skills).
You can manually drop entries in too:
cat > ~/.felix/memory/entries/project-conventions.md << 'EOF'
# Project Conventions
- Go 1.22+ with generics where appropriate
- chi router for HTTP handlers
- testify/assert for tests
- Errors wrapped with fmt.Errorf("context: %w", err)
EOF
If you also configure an embeddingProvider and embeddingModel under memory, vector search via chromem-go runs alongside BM25.
Local LLM (bundled Ollama)
Felix ships with a bundled Ollama binary so you can run agents offline with no API key. On first run it pulls gemma4 (chat) and nomic-embed-text (memory embeddings) in the background.
felix model pull qwen2.5:7b # add another model
felix model list # see what's installed
felix model status # check the supervisor
felix model rm gemma4 # free disk space
The bundled Ollama runs as a child of Felix on a free port in 127.0.0.1:18790–18799 and shuts down when Felix exits. It does not interfere with any system Ollama you may have on :11434.
WebSocket API
JSON-RPC 2.0 over WebSocket at ws://127.0.0.1:18789/ws.
| Method | Description |
|---|---|
| `chat.send` | Send a message to an agent (streams response events) |
| `chat.abort` | Cancel the active response for this connection |
| `chat.compact` | Force-compact the active session immediately |
| `agent.status` | List all configured agents and their state |
| `session.list` / `new` / `switch` / `history` / `clear` | Session management |
| `jobs.list` / `add` / `pause` / `resume` / `remove` / `update` | Cron job management |
HTTP endpoints: GET /health, /ws, /metrics (when enabled), /chat, /jobs, /settings, /logs (+ SSE), POST /api/mcp/reauth/{id}.
const ws = new WebSocket("ws://127.0.0.1:18789/ws");
ws.onopen = () => {
ws.send(JSON.stringify({
jsonrpc: "2.0",
method: "chat.send",
params: { agentId: "default", text: "What files are in the current directory?" },
id: 1
}));
};
ws.onmessage = (event) => {
const msg = JSON.parse(event.data);
// msg.result.type is one of:
// "text_delta" — streaming text chunk
// "tool_call_start" — agent is calling a tool
// "tool_result" — tool execution result
// "compaction.start" / "compaction.done" / "compaction.skipped"
// "done" — response complete
// "error" / "aborted"
};
If gateway.auth.token is set in felix.json5, include Authorization: Bearer <token> on the WebSocket upgrade.
Observability
Logs. /logs shows the live tail of the gateway's structured logs (slog) with an SSE stream at /logs/stream. Tool inputs and outputs log at DEBUG, not INFO, to keep sensitive data out of casual viewing.
Metrics. /metrics exposes Prometheus-style metrics when enabled in config.
OpenTelemetry export. Opt-in OTLP/HTTP exporter for traces, metrics, and logs to any compatible collector (Tempo, Jaeger, Loki, Grafana Cloud, Honeycomb, your own collector). Each chat turn becomes one agent.run span with phase events for cortex.recall, context.assemble, llm.request_sent, llm.first_token, llm.stream_end, tool.exec, agent.done. Exporter init is non-fatal — Felix serves chat normally even if the collector is unreachable.
{
"otel": {
"enabled": true,
"endpoint": "http://collector.example.com:4318/",
"serviceName": "felix",
"sampleRatio": 1.0,
"signals": { "traces": true, "metrics": true, "logs": true }
}
}
Or via standard env vars (which implicitly enable OTel when OTEL_EXPORTER_OTLP_ENDPOINT is set):
OTEL_EXPORTER_OTLP_ENDPOINT="http://collector.example.com:4318/" \
OTEL_SERVICE_NAME="felix-prod" \
./felix start
OTel changes require a restart (the SDK doesn't support swapping providers in flight).
Security
Felix is designed to run on your own hardware. The defaults protect you from the common ways an agent with broad capabilities can go wrong; you relax them deliberately, not opt out of them.
Network & transport. Localhost-only by default (127.0.0.1:18789). Optional bearer-token auth on all HTTP and WebSocket endpoints with constant-time comparison. WebSocket origin checking restricted to localhost by default. 5 s ReadHeaderTimeout against slowloris. Web chat sets X-Frame-Options: DENY, Content-Security-Policy, X-Content-Type-Options: nosniff.
Tool execution. Per-agent allow/deny lists for every tool. Bash exec policy — deny (no shell), allowlist (only listed commands; shell metacharacters blocked), or full (default). File tools validate paths against the agent's workspace with symlink resolution to prevent traversal.
Input validation. web_fetch and browser resolve hostnames and block private IP ranges (RFC 1918, loopback, link-local, IPv6 ULA) and cloud metadata endpoints, fail-closed on DNS errors, re-validate redirects. The web chat escapes HTML before applying markdown and blocks javascript: / data: / vbscript: URLs. WebSocket per-connection rate limit (30 msg/sec) and 1 MiB message cap.
Credentials & data. No hardcoded secrets. onboard writes config with 0o600; warning logged at startup if it's group/world-readable. Session files are 0o600. Tool inputs/outputs (which may contain sensitive data) log at DEBUG, not INFO.
Optional bearer auth.
{ "gateway": { "auth": { "token": "my-secret-token" } } }
WebSocket clients then need Authorization: Bearer my-secret-token on the upgrade.
Example configurations
Personal assistant (Claude + Telegram alerts)
{
"providers": {
"anthropic": { "kind": "anthropic", "api_key": "sk-ant-..." }
},
"agents": {
"list": [
{
"id": "assistant",
"name": "Personal Assistant",
"model": "anthropic/claude-sonnet-4-5",
"workspace": "~/.felix/workspace-assistant",
"tools": {
"allow": ["read_file", "write_file", "edit_file", "bash",
"web_fetch", "web_search", "send_message", "cron"]
}
}
]
},
"telegram": { "enabled": true, "bot_token": "...", "default_chat_id": "..." },
"memory": { "enabled": true },
"cortex": { "enabled": true }
}
Multi-agent dev team (supervisor + delegated workers)
{
"providers": {
"anthropic": { "kind": "anthropic", "api_key": "sk-ant-..." },
"openai": { "kind": "openai", "api_key": "sk-..." },
"local": { "kind": "local" }
},
"agents": {
"list": [
{
"id": "lead",
"name": "Tech Lead",
"model": "anthropic/claude-sonnet-4-5",
"tools": { "allow": ["read_file", "bash", "task", "todo_write"] }
},
{
"id": "coder",
"model": "anthropic/claude-sonnet-4-5",
"subagent": true,
"description": "Writes and edits code. Use for any 'implement X' or 'refactor Y' task.",
"tools": { "allow": ["read_file", "write_file", "edit_file", "bash"] }
},
{
"id": "reviewer",
"model": "openai/gpt-5",
"subagent": true,
"inheritContext": true,
"description": "Reviews code changes for correctness and style. Read-only.",
"tools": { "allow": ["read_file"] }
},
{
"id": "quick",
"model": "local/gemma4",
"subagent": true,
"description": "Fast local lookups: man pages, syntax checks, single-file reads.",
"tools": { "allow": ["read_file", "web_search"] }
}
]
}
}
Locked-down read-only (safe for shared use)
{
"providers": {
"anthropic": { "kind": "anthropic", "api_key": "sk-ant-..." }
},
"agents": {
"list": [
{ "id": "safe", "model": "anthropic/claude-sonnet-4-5",
"tools": { "allow": ["read_file"] } }
]
},
"security": { "execApprovals": { "level": "deny" } },
"gateway": { "auth": { "token": "my-secret-token" } }
}
Fully offline (bundled Ollama, no API keys)
{
"providers": {
"local": { "kind": "local", "base_url": "http://127.0.0.1:18790/v1" }
},
"agents": {
"list": [
{ "id": "default", "model": "local/gemma4",
"tools": { "allow": ["read_file", "write_file", "bash"] } }
]
},
"memory": { "enabled": true }
}
Architecture
Single-process, hub-and-spoke. All components run in one binary.
- Gateway — `chi` HTTP router + `gorilla/websocket` on `:18789`.
- Agent runtime — assemble static + dynamic system prompt, stream LLM response, partition tool calls into parallel batches, dispatch and re-loop.
- LLM provider abstraction — one `LLMProvider` interface, six implementations (`anthropic`, `openai`, `gemini`, `qwen`, `local`, `openai-compatible`).
- Session manager — append-only JSONL with a DAG view; splice-based compaction.
- Memory manager — BM25 always present, vector search optional via `chromem-go`.
- Cortex — optional SQLite knowledge graph for cross-conversation recall.
- Skill loader — embedded starters + user skills, hot-reloaded.
- Compaction manager — three-stage fallback, per-session circuit breaker, async between-turns.
- MCP manager — Streamable-HTTP and stdio clients with OAuth2 / bearer auth and in-process re-authentication.
- Cron — recurring prompts on schedules, with pause/resume/remove management.
- Bundled Ollama supervisor — keeps a local LLM available without external setup.
Data directory
All state lives in ~/.felix/ (Windows: C:\Users\<you>\.felix\):
felix.json5 # Configuration
sessions/ # Conversation history (JSONL, one file per agent+session)
memory/entries/ # Memory entries (Markdown)
skills/ # User skills (SKILL.md); bundled starters seeded on first run
workspace-<agentId>/ # Per-agent workspace (IDENTITY.md, FELIX.md, AGENTS.md, skills/)
brain.db # Cortex knowledge graph (SQLite)
cron-jobs.json # Persisted dynamic cron jobs
mcp-tokens/ # OAuth refresh tokens per MCP server
ollama/ # Bundled Ollama model store
Plain files. Inspect with a text editor; back up with rsync; copy to another machine and pick up exactly where you left off.
Development
make build # CLI binary
make build-app # macOS menu bar app (Felix.app)
make build-app-windows # Windows system tray app (felix-app.exe)
make test # Run all tests
make test-race # With race detector
make lint # golangci-lint
make sign # Sign + notarize + staple the macOS PKG -> dist/Felix-<VERSION>-signed.pkg
make publish-release # Publish a GitHub release for the latest tag
make build-release # Cross-compile binaries for all platforms
make help # All targets
Key dependencies
| Purpose | Package |
|---|---|
| HTTP router | github.com/go-chi/chi/v5 |
| WebSocket | github.com/gorilla/websocket |
| CLI framework | github.com/spf13/cobra |
| Anthropic | github.com/anthropics/anthropic-sdk-go |
| OpenAI / Qwen / local Ollama | github.com/sashabaranov/go-openai |
| Gemini | google.golang.org/genai |
| MCP client | github.com/modelcontextprotocol/go-sdk |
| Knowledge graph | github.com/sausheong/cortex |
| Vector index | github.com/philippgille/chromem-go |
| HTML → Markdown | github.com/JohannesKaufmann/html-to-markdown/v2 |
| Markdown rendering (CLI) | github.com/charmbracelet/glamour |
| Browser automation | github.com/chromedp/chromedp |
| OAuth2 (MCP auth) | golang.org/x/oauth2 |
| System tray | fyne.io/systray |