Alvus
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 6 GitHub stars
Code Pass
- Code scan — Scanned 1 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
No AI report is available for this listing yet.
Zero-dependency Go proxy for rotating free-tier AI API keys. No more 429s.
⚡ Alvus
~5 MB binary. Zero dependencies. Zero 429s.
A lightweight Go proxy that silently absorbs rate limit errors and keeps your AI agent running.
The Problem
You're in the middle of an agentic session — OpenClaw is halfway through a task, Cline is on a roll, your agent is doing things — and then:
Error: 429 Too Many Requests
The loop breaks. Context is lost. You're staring at a spinner.
If you use free-tier providers like NVIDIA NIM, this happens constantly. Free keys cap around 40 RPM. One productive session burns through that in seconds.
The Solution
Alvus sits between your agent and the upstream API. You give it a pool of keys. It handles everything else — round-robin distribution, per-key cooldowns, automatic retries, streaming passthrough. Your agent never sees a 429.
Any OpenAI-compatible agent or IDE
│
▼
┌─────────────────────┐
│ Alvus │ ← localhost:3000
│ │
│ [key1] ✅ ready │
│ [key2] ✅ ready │ ──→ NVIDIA NIM / any OpenAI-compatible API
│ [key3] ❄️ cooling │
└─────────────────────┘
3 keys × 40 RPM = 120+ effective RPM. The math is simple. The setup is simpler.
Idle RAM usage: ~2 MB. Alvus is a single static binary with no runtime. It won't compete with your models for memory.
Works With Everything
If it speaks OpenAI-compatible API, it works with Alvus.
| Tool | Type | Setup |
|---|---|---|
| OpenClaw | AI agent | Set base URL in provider config |
| PicoClaw | Lightweight agent | Set api_base in config.json |
| Nanobot | Lightweight agent | Set api_base in config.yaml |
| Cline | VS Code agent | OpenAI Compatible provider |
| Cursor | IDE | Base URL override in settings |
| Aider | CLI agent | --openai-api-base flag |
| Any OpenAI-compatible client | — | Point at http://localhost:3000/v1 |
Features
| 🔑 Key pool | Multiple keys, one endpoint. Distribute load transparently |
| 🔄 Round-robin | Even distribution across all healthy keys |
| 🚫 Silent retry on 429/502/503 | Failed key enters cooldown, request retries instantly with the next |
| ⏱️ Retry-After support | Respects upstream Retry-After headers — no blind fixed waits |
| 🔑 Auto-disable on 401/403 | Invalid or revoked keys are permanently removed from the pool |
| 📡 Streaming passthrough | SSE and chunked responses piped with zero buffering overhead |
| ❤️ Health endpoint | GET /health shows live key status, cooldown timers, and requests/minute |
| 🖥️ Interactive Dashboard | GET /dashboard — Premium Glassmorphism Dark UI for real-time monitoring |
| ⚡ Live Activity Logs | Searchable, 1000-entry memory cache to track all request activity |
| 🔧 Dynamic Configuration | Update keys and base URLs directly from the dashboard; writes to .env |
| 🪶 Zero dependencies | Pure Go stdlib. One file. One binary |
🔧 .env support |
Built-in parser — no godotenv, no extras |
| 🖥️ Runs anywhere | linux/amd64, arm64, arm, 386 — including Pi Zero and older x86 hardware |
| 💾 ~2 MB idle RAM | Static binary, no runtime, won't compete with your models for memory |
Quickstart
1. Get the binary
Build from source (requires Go 1.21+):
git clone https://github.com/YOUR_USERNAME/alvus.git
cd alvus
go build -o alvus main.go
Cross-compile for a remote server (e.g. Raspberry Pi Zero, 32-bit x86):
# Pi Zero / older ARM
GOOS=linux GOARCH=arm CGO_ENABLED=0 go build -o alvus main.go
# 32-bit x86 (Atom, old netbooks, salvaged hardware)
GOOS=linux GOARCH=386 CGO_ENABLED=0 go build -o alvus main.go
The binary is fully static — drop it on the machine and run it. No runtime, no dependencies, no install step.
Download a prebuilt release:
Go to Releases and grab the binary for your platform.
2. Configure
Create .env in the same directory as the binary:
# Your API keys, comma-separated
API_KEYS=nvapi-xxxxxxxxxxxx,nvapi-yyyyyyyyyyyy,nvapi-zzzzzzzzzzzz
# Port to listen on (default: 3000)
PORT=3000
# Upstream API base URL (default: NVIDIA NIM)
TARGET_BASE_URL=https://integrate.api.nvidia.com/v1
# Seconds to cool down a key after a 429, 502, or 503 (default: 60)
COOLDOWN_SEC=60
Real environment variables take precedence over .env — useful for systemd or containers.
3. Run
./alvus
⚡ Alvus started on :3000
Target : https://integrate.api.nvidia.com/v1
Keys : 3 loaded
Cooldown: 60s per key on 429/502/503
4. Point your agent at it
OpenClaw
{
"models": {
"providers": {
"nim": {
"baseUrl": "http://localhost:3000/v1",
"apiKey": "sk-proxy-dummy"
}
},
"defaults": {
"provider": "nim",
"model": "deepseek-ai/deepseek-r1"
}
}
}
PicoClaw / Nanobot
{
"model_name": "deepseek-r1",
"model": "openai/deepseek-ai/deepseek-r1",
"api_base": "http://localhost:3000/v1",
"api_keys": ["sk-proxy-dummy"]
}
Cline (VS Code)
| Setting | Value |
|---|---|
| API Provider | OpenAI Compatible |
| Base URL | http://localhost:3000/v1 |
| API Key | sk-proxy-dummy (any string) |
| Model ID | deepseek-ai/deepseek-r1 |
Cursor
Settings → Models → set base URL to http://localhost:3000/v1, any dummy key.
Aider
aider --openai-api-base http://localhost:3000/v1 --openai-api-key sk-dummy
How It Works
1. Request arrives from your agent or IDE
2. Body is buffered (needed for retry replay)
3. Round-robin picks the next available key
4. Request forwarded upstream with that key injected
│
├── ✅ 2xx/3xx → request count incremented, headers + body streamed back, done
├── ❄️ 429/502/503 → key enters cooldown, retry with next key
├── 🔑 401/403 → key permanently removed from pool
└── ⚠️ other 4xx/5xx → passed through as-is
Your agent sees a clean stream or a final error. Never a 429.
Key Status
curl http://localhost:3000/health
{
"status": "ok",
"keys": 3,
"details": [
{
"index": 0,
"key": "nvapi-xxxxxxxxxxxx",
"status": "ready",
"requests_per_minute": 15,
"last_used": "2023-11-15T14:30:00Z",
"cooldown_until": "2023-11-15T14:29:00Z"
},
{
"index": 1,
"key": "nvapi-yyyyyyyyyyyy",
"status": "cooling(42s)",
"requests_per_minute": 40,
"last_used": "2023-11-15T14:31:00Z",
"cooldown_until": "2023-11-15T14:32:00Z"
}
]
}
Other Providers
TARGET_BASE_URL is all you need to change:
# OpenRouter
TARGET_BASE_URL=https://openrouter.ai/api/v1
# Together AI
TARGET_BASE_URL=https://api.together.xyz/v1
# Groq
TARGET_BASE_URL=https://api.groq.com/openai/v1
# Any other OpenAI-compatible endpoint
TARGET_BASE_URL=https://your-provider.com/v1
Running as a Service (systemd)
[Unit]
Description=Alvus
After=network.target
[Service]
ExecStart=/usr/local/bin/alvus
WorkingDirectory=/etc/alvus
Restart=on-failure
RestartSec=5
# Graceful shutdown on stop/restart
KillSignal=SIGTERM
TimeoutStopSec=10
[Install]
WantedBy=multi-user.target
Put your .env in /etc/alvus/. Reload and start:
sudo systemctl daemon-reload
sudo systemctl enable --now alvus
Alvus handles SIGINT and SIGTERM gracefully, allowing in-flight requests to complete before shutting down (with a 5-second timeout).
FAQ
Do I need Go installed to run this?
No. Download a prebuilt binary from Releases.
Are my keys safe?
Keys live in .env on your machine and are only ever sent to the upstream provider. Alvus logs key indices, never key values.
What if ALL keys are cooling?
Alvus waits for the soonest key to become available and retries, up to 10 times. If everything stays exhausted, it returns 503. In practice, with 3 keys and a 60s window this is very hard to trigger.
Can I reload keys without restarting?
Yes! Alvus now supports hot-reloading when the .env file changes. Simply edit your .env file and Alvus will automatically detect the changes and reload the configuration within 1 second. No restart needed.
Does it work on a Raspberry Pi Zero / 32-bit hardware?
Yes. Prebuilt binaries include linux/arm and linux/386. The binary is fully static — no runtime needed.
How much memory does it use?
Around 2 MB at idle. It's a single static Go binary with no runtime overhead — you won't notice it sitting next to a running model.
Roadmap
- Hot-reload when .env changes (no restart needed)
- Per-key request counters and detailed status in
/health - Web dashboard (opt-in, zero-dep binary stays the same)
Contributing
PRs welcome. This project lives in a single file with zero external dependencies — keep it that way. If a feature needs an import beyond stdlib, it doesn't belong in main.go. Open an issue first and we'll figure out the right shape for it.
License
MIT.
Built at 2am when an OpenClaw task hit its fifth 429 in a row.
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found