router
One endpoint. Every model. Always the right one.
A drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model
for every request, using a tiny on-box embedder, not a vibes-based prompt.
Built by Weave, the #1 engineering intelligence platform,
loved by Robinhood, PostHog, Reducto, and hundreds of others.
What it does
Point Claude Code, Cursor, or your own app at localhost:8080. The router:
- 🎯 Routes per request. A cluster scorer derived from
  Avengers-Pro [^1] picks the right model from your enabled providers, every turn.
- 🔌 Speaks everyone's API. Anthropic Messages, OpenAI Chat Completions,
  Gemini native. Streaming, tools, vision, the works.
- 🧠 Knows OSS too. DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via
  OpenRouter (or any OpenAI-compatible endpoint).
- 🔒 BYOK by default. Provider keys stay on your box, encrypted at rest.
- 📊 Observable. OTLP traces out of the box. See them in the Weave
  dashboard (http://localhost:8080/ui/dashboard) or drop in Honeycomb, Datadog,
  Grafana, whatever.
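The per-request routing idea can be sketched in a few lines: embed the prompt, find the nearest cluster centroid, and pick that cluster's best-scoring model. Everything below is a toy illustration of that shape only; the centroids, model names, and the hash-based "embedder" are made up, and the real router uses a learned embedder and Avengers-Pro-style performance profiles, not this.

```python
import math

def toy_embed(text: str, dim: int = 8) -> list[float]:
    # Deterministic stand-in for a real sentence embedder (illustrative only).
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# Each cluster: (centroid embedding, best model for that cluster). Hypothetical.
CLUSTERS = {
    "code": (toy_embed("write a python function"), "claude-sonnet-4-5"),
    "chat": (toy_embed("hello how are you"), "gpt-4o-mini"),
}

def route(prompt: str) -> str:
    # Nearest-centroid lookup: the cluster scorer in miniature.
    emb = toy_embed(prompt)
    _, model = max(CLUSTERS.values(), key=lambda c: cosine(emb, c[0]))
    return model
```

The actual scorer also weighs a performance-versus-cost tradeoff per cluster (see the Avengers-Pro paper in the footnote); this sketch keeps only the nearest-centroid step.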
30-second quickstart
The fastest way: point Claude Code at the hosted Weave Router with one
command. No clone, no Docker, no Postgres.
npx @workweave/router
That's it. The installer walks you through scope (user vs. project), grabs
a router key, and wires Claude Code. Other flavors:
npx @workweave/router --scope project # per-repo, commits settings.json
npx @workweave/router --local # self-hosted localhost:8080
npx @workweave/router --base-url https://router.acme.internal
npx @workweave/[email protected] # pin a version
Requires Node ≥ 18 and jq. Full flag reference: install/npm/README.md.
Or: self-host the whole stack
If you want the router (and dashboard) running on your own box:
# 1. Drop a provider key in. OpenRouter is the recommended baseline.
echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local
# 2. Boot Postgres + router on :8080 and seed an rk_ key.
make full-setup
The router is up at http://localhost:8080, the dashboard at
http://localhost:8080/ui/ (password: admin), and your rk_... key
prints in the logs.
# Call it like Anthropic
curl -sS http://localhost:8080/v1/messages \
-H "Authorization: Bearer rk_..." \
-d '{"model":"claude-sonnet-4-5","max_tokens":256,
"messages":[{"role":"user","content":"hi"}]}'
# ...or like OpenAI
curl -sS http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer rk_..." \
-d '{"model":"gpt-4o-mini",
"messages":[{"role":"user","content":"hi"}]}'
# Peek at the routing decision without proxying
curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'
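The curl calls above translate directly to any HTTP client: JSON body in, rk_ key as a Bearer token. A minimal stdlib-Python sketch, assuming a local router at :8080; `build_router_request` and the `rk_demo` key are illustrative names, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_router_request(path: str, body: dict, router_key: str,
                         base: str = "http://localhost:8080") -> urllib.request.Request:
    # Every routed endpoint takes a JSON body plus the rk_ router key
    # (not an upstream provider key) as a Bearer token.
    return urllib.request.Request(
        base + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {router_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Anthropic-shaped call, mirroring the curl example above:
req = build_router_request(
    "/v1/messages",
    {"model": "claude-sonnet-4-5", "max_tokens": 256,
     "messages": [{"role": "user", "content": "hi"}]},
    "rk_demo",
)
# urllib.request.urlopen(req) would send it to a running router.
```

Swap the path to /v1/chat/completions with an OpenAI-shaped body, or /v1/route to inspect the decision, and the rest is identical.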
Wire it into your tools
Claude Code. Run make install-cc to wire Claude Code to the local
self-hosted router (it's also invoked automatically at the end of
make full-setup). For the hosted router, use npx @workweave/router
as above.
Cursor (early beta; performance may vary). Settings →
Models → Override OpenAI Base URL → http://localhost:8080/v1, then
paste rk_... as the API key.
Two keys, don't mix them up:
- sk-or-... / sk-ant-... / sk-... = your upstream provider key. Lives in .env.local.
- rk_... = your router key. Clients send this as a Bearer token.
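A quick way to internalize the split: the key prefix tells you where a key belongs. A tiny illustrative helper (`classify_key` is my name, the prefixes are the ones from this README):

```python
def classify_key(key: str) -> str:
    # rk_ keys are sent by clients to the router as Bearer tokens.
    # sk- keys are upstream provider credentials: they live in .env.local
    # on the router box and should never appear in client requests.
    if key.startswith("rk_"):
        return "router"
    if key.startswith(("sk-or-", "sk-ant-", "sk-")):
        return "provider"
    return "unknown"
```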
Endpoints
| Endpoint | Format |
|---|---|
| POST /v1/messages | Anthropic Messages, routed |
| POST /v1/chat/completions | OpenAI Chat Completions, routed |
| POST /v1beta/models/:action | Gemini generateContent, routed |
| POST /v1/route | Returns the decision, no upstream call |
| GET /v1/models · POST /v1/messages/count_tokens | Anthropic passthrough |
| GET /health · GET /validate | Liveness + key check |
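Because each routed endpoint mirrors its vendor's native path, a client only needs to know which dialect it speaks. A small illustrative lookup (the function and dict names are mine; the paths come from the table above):

```python
ROUTED_PATHS = {
    # client dialect -> routed endpoint path (from the table above)
    "anthropic": "/v1/messages",
    "openai": "/v1/chat/completions",
    "gemini": "/v1beta/models/:action",  # :action is e.g. generateContent
}

def routed_path(dialect: str) -> str:
    # Resolve the endpoint for a given API dialect, failing loudly on typos.
    try:
        return ROUTED_PATHS[dialect]
    except KeyError:
        raise ValueError(f"unknown dialect: {dialect!r}") from None
```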
Deeper docs
- 📐 Configuration reference: every env var, BYOK encryption, OTel knobs,
  cluster routing.
- 🛠️ Contributing: layering rules, hot-reload dev, migrations, tests,
  the whole engineering loop.
- 🏗️ Architecture: package layout, import contracts, recipes for adding
  endpoints / providers / strategies.
Roadmap
- Token-aware rate limiting (Redis sliding window per installation)
- Sub-installations for tenant hierarchies
- Speculative dispatch + hedging for tail latency
[^1]: Zhang, Y. et al. Beyond GPT-5: Making LLMs Cheaper and Better via
Performance–Efficiency Optimized Routing (Avengers-Pro).
arXiv:2508.12631, 2025. https://arxiv.org/abs/2508.12631