FreeRideV3

skill
Security Audit
Warn
Health Warn
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Pass
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested


SUMMARY

One free AI endpoint, every free tier behind it. Local OpenAI-compatible gateway routing across OpenRouter, Groq, NVIDIA NIM, Cloudflare Workers AI, and HuggingFace with automatic failover.

README.md

FreeRide

Ollama for free cloud inference.

A local OpenAI-compatible gateway that routes across every free-tier provider you have a key for — OpenRouter, Groq, NVIDIA NIM, Cloudflare Workers AI, HuggingFace, Cerebras, and your own Ollama. When a provider hits a rate limit, it fails over to the next. Your agent never knows.

Install

macOS / Linux:

curl -sSL https://api.free-ride.xyz/install.sh | sh

Windows (PowerShell):

powershell -ExecutionPolicy Bypass -c "irm https://api.free-ride.xyz/install.ps1 | iex"

Then:

freeride init           # interactive — collects keys, writes ~/.freeride/.env
freeride serve          # gateway listens on localhost:11343

Point any OpenAI-shaped client at http://localhost:11343/v1 with OPENAI_API_KEY=any. That's it.
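For example, here is a minimal chat call against the gateway using only the Python standard library — the helper name is illustrative, and with the openai-python SDK you would simply set base_url to the same address:

```python
# Minimal OpenAI-shaped chat request to the FreeRide gateway, stdlib only.
# The endpoint and the "any key works" convention come from the README;
# build_chat_request is an illustrative helper, not part of the CLI.
import json
import urllib.request

GATEWAY = "http://localhost:11343/v1"

def build_chat_request(prompt: str, model: str = "auto") -> urllib.request.Request:
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{GATEWAY}/chat/completions",
        data=body,  # POST is implied by providing a body
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer any",  # the gateway accepts any key
        },
    )

# With `freeride serve` running, you would send it like so:
# resp = urllib.request.urlopen(build_chat_request("hello"))
# print(json.load(resp)["choices"][0]["message"]["content"])
```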

The installer bootstraps uv if missing, then uv tool installs freeride-gateway. Binary lands at ~/.local/bin/freeride (Linux/macOS) or %USERPROFILE%\.local\bin\freeride.exe (Windows). Same shape as the bun.sh and astral.sh installers.

Or install manually
# uv (what the installer does)
uv tool install --prerelease=allow freeride-gateway

# pipx
pipx install --pip-args=--pre freeride-gateway

# pip + venv (installs into this venv only; re-activate it in each new shell)
python3 -m venv .venv && source .venv/bin/activate
pip install --pre freeride-gateway

# from source
git clone https://github.com/Shaivpidadi/FreeRideV3 && cd FreeRideV3
pip install -e .

PyPI distribution: freeride-gateway. CLI: freeride. Python ≥ 3.10.

Get keys (any one is enough; more = better failover)

Provider                Where                                            Env var
OpenRouter              https://openrouter.ai/keys                       OPENROUTER_API_KEY
Groq                    https://console.groq.com/keys                    GROQ_API_KEY
NVIDIA NIM              https://build.nvidia.com                         NVIDIA_API_KEY
Cloudflare Workers AI   https://dash.cloudflare.com/profile/api-tokens   CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID
HuggingFace             https://huggingface.co/settings/tokens           HF_TOKEN
Cerebras                https://cloud.cerebras.ai/platform               CEREBRAS_API_KEY
Ollama (local)          https://ollama.com/download                      OLLAMA_BASE_URL=http://localhost:11434

Set whichever you have, then freeride serve. The gateway picks them up and rotates between them.

Or use the wizard: freeride init writes ~/.freeride/.env for you. The gateway auto-loads that file at startup — no manual source needed.

Wire your agent

The fastest way is a binder:

freeride bind aider       # writes ~/.aider.conf.yml
freeride bind continue    # writes ~/.continue/config.yaml
freeride bind hermes      # writes ~/.hermes/config.yaml
freeride bind openclaw    # writes ~/.openclaw/openclaw.json

Or set the OpenAI vars yourself:

export OPENAI_API_BASE=http://localhost:11343/v1
export OPENAI_API_KEY=any

Anything OpenAI-shaped works. Tested with the openai-python SDK, Aider, Continue, Hermes, OpenClaw.

Multi-key rotation

Got several free keys for the same provider? Pass them as a JSON array:

export OPENROUTER_API_KEY='["sk-or-v1-key1","sk-or-v1-key2","sk-or-v1-key3"]'

When key 1 hits 429 it goes on cooldown for 120s; key 2 takes the next request. Cooldowns persist across restarts (~/.freeride/cooldown.json).

How failover works

Per request, FreeRide walks (provider, key) pairs in order:

  • RATE_LIMIT or AUTH → mark this key cooling, try the next key.
  • MODEL_NOT_FOUND → skip this provider, try the next provider.
  • Anything 5xx-ish → next pair.
  • First successful response → ship it; stamp X-FreeRide-Provider header (or _freeride_provider field on JSON) so you can tell who actually served it.
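In pseudocode-ish Python, the walk looks roughly like this — the status labels and function shape are illustrative, not FreeRide's real internals:

```python
# Sketch of the per-request failover walk described above.
# attempt(provider, key) stands in for one upstream call and returns
# (status, response); statuses mirror the README's error classes.
def route(pairs, attempt):
    skipped = set()   # providers ruled out by MODEL_NOT_FOUND
    cooling = set()   # (provider, key) pairs put on cooldown
    for provider, key in pairs:
        if provider in skipped or (provider, key) in cooling:
            continue
        status, resp = attempt(provider, key)
        if status == "ok":
            resp["_freeride_provider"] = provider  # the README's JSON stamp
            return resp
        if status in ("rate_limit", "auth"):
            cooling.add((provider, key))           # next key
        elif status == "model_not_found":
            skipped.add(provider)                  # next provider
        # anything 5xx-ish: fall through to the next pair
    raise RuntimeError("all (provider, key) pairs exhausted")
```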

Streaming uses buffer-first-chunk failover: hold the first SSE event until upstream confirms the stream is real. If it fails before the first chunk, retry. After the first chunk has shipped, mid-stream errors propagate (rare; documented).
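A buffer-first-chunk loop can be sketched like this — attempt factories and error handling are simplified stand-ins for the real SSE plumbing:

```python
# Sketch of buffer-first-chunk streaming failover: hold back the first
# chunk; if an upstream dies before producing it, retry the next attempt;
# once the first chunk has shipped, mid-stream errors propagate.
def stream_with_failover(attempts):
    # attempts: callables, each returning an iterator of SSE chunks
    last_err = None
    for make_stream in attempts:
        try:
            stream = make_stream()
            first = next(stream)   # may fail before any data reaches the client
        except Exception as e:
            last_err = e
            continue               # safe to retry: nothing was sent yet
        def relay(first=first, stream=stream):
            yield first
            yield from stream      # errors from here on propagate
        return relay()
    raise RuntimeError("no upstream produced a first chunk") from last_err
```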

Recommended: run freeride audit-models after install

Providers list models they can't always serve. NVIDIA NIM lists Gemma-3-27B but sometimes returns 500. HuggingFace lists models that need PRO credits. The smart-router doesn't know which entries are real until it tries.

freeride audit-models                  # probe every catalog model, ~30s
freeride audit-models --provider groq  # one provider only

This writes ~/.freeride/cache/model_health.json that the smart-router reads at request time, so model: "auto" skips known-broken upstream models without paying a failover-attempt cost. Re-run after big provider changes or if you start seeing surprising 503s.

Stale cache (older than 24h) is auto-refreshed on the next request, but a manual audit-models run is faster than discovering staleness mid-request.
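The smart-router's read side reduces to a filter over the catalog. This sketch assumes a cache schema with "checked_at" and "broken" keys, which is an illustration, not the documented file format:

```python
# Sketch of how model: "auto" can consult ~/.freeride/cache/model_health.json.
# The schema (checked_at timestamp, broken model list) is assumed for
# illustration; the real file layout may differ.
import json
import time
from pathlib import Path

HEALTH = Path.home() / ".freeride" / "cache" / "model_health.json"
MAX_AGE = 24 * 3600  # cache older than 24h counts as stale

def healthy_models(catalog, cache_path=HEALTH, now=None):
    now = now or time.time()
    try:
        data = json.loads(cache_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return list(catalog)  # no cache yet: assume everything works
    if now - data.get("checked_at", 0) > MAX_AGE:
        return list(catalog)  # stale: gateway re-audits on the next request
    broken = set(data.get("broken", []))
    return [m for m in catalog if m not in broken]
```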

Telemetry

On by default. Hourly POST to https://telemetry.free-ride.xyz/v1/beacon:

{
  "installation_id": "random-uuid-v4",
  "version": "0.3.0",
  "os": "darwin",
  "tokens_served": 412034,
  "request_count": 187,
  "providers_active": ["openrouter", "groq"],
  "uptime_hours": 8
}

Prompts, completions, model IDs, API keys, hostnames, IPs — never sent. The Worker doesn't log cf-connecting-ip. The first time you run any freeride command a banner prints the exact payload.

freeride telemetry off    # turn it off
freeride telemetry        # show what would be sent

Embeddings

Same endpoint shape as OpenAI's /v1/embeddings. Failover across the providers that support embeddings (Groq and Cerebras are chat-only):

curl http://localhost:11343/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"model": "text-embedding-3-small", "input": "hello world"}'

The same X-FreeRide-Provider header tells you which provider served the embedding. Same multi-key rotation, same per-provider failover.

See what FreeRide is doing

freeride watch

Tails live failover events from a running gateway. Every request, every provider attempt, every rate-limit, every retry. Useful for seeing failover happen in real time, debugging "is my agent actually using FreeRide", or just demoing.

[14:23:01.412] req_a3f8e2c1  ▶ request model=openrouter/free stream
[14:23:01.421] req_a3f8e2c1  → openrouter[k0] openrouter/free
[14:23:01.833] req_a3f8e2c1  ← openrouter[k0] 412ms RATE_LIMIT ✗ (retry-after 47s)
[14:23:01.835] req_a3f8e2c1  → groq[k0] openrouter/free
[14:23:02.153] req_a3f8e2c1  ← groq[k0] 318ms OK ✓ first-chunk
[14:23:02.154] req_a3f8e2c1  ■ complete via groq

Events are written to ~/.freeride/events.jsonl. Opt out with FREERIDE_EVENTS=0 if you don't want them. File caps at 1 MiB with single-backup rotation.
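Single-backup rotation with a size cap is a small amount of logic. A sketch, assuming the backup file is named by appending ".1" (the actual backup naming is not documented here):

```python
# Sketch of a 1 MiB, single-backup log rotation like the one described
# for ~/.freeride/events.jsonl. The ".1" backup suffix is an assumption.
from pathlib import Path

CAP = 1024 * 1024  # 1 MiB

def append_event(path: Path, line: str, cap: int = CAP) -> None:
    # If writing this line would push the file past the cap, rotate first:
    # the current log becomes <name>.1 (overwriting any previous backup).
    if path.exists() and path.stat().st_size + len(line) + 1 > cap:
        path.replace(path.with_suffix(path.suffix + ".1"))
    with path.open("a") as f:
        f.write(line + "\n")
```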

Commands

freeride serve                  start the gateway
freeride bind <agent>           write gateway URL into agent config
freeride watch                  tail live failover events
freeride bench                  per-provider latency comparison (needs serve running)
freeride reload                 refresh provider registry from env vars (no restart)
freeride providers              live provider health from a running gateway
freeride doctor                 diagnose common setup issues (env vars, PATH, port)
freeride upgrade                bump installed package to latest PyPI release
freeride init                   interactive setup wizard — prompts for keys, writes ~/.freeride/.env
freeride keys                   show which provider keys are available vs cooling
freeride telemetry [on|off]     manage telemetry
freeride list                   list available free models
freeride status                 show OpenClaw config + cache age (v2)
freeride auto                   auto-configure OpenClaw (v2)
freeride rotate                 swap primary if it fails (v2)
freeride-watcher                background daemon that rotates on failure

freeride bench example output:

$ freeride bench
Benchmarking 5 providers, 3 requests each via http://localhost:11343/v1...

provider              ok    p50      p95      tok/s
─────────────────────────────────────────────────────
groq                  3/3   142ms    287ms    98
cloudflare_wai        3/3   284ms    410ms    81
nvidia_nim            3/3   389ms    502ms    72
openrouter            3/3   412ms    721ms    63
huggingface           2/3   612ms    1840ms   41

Fastest: groq (142ms p50)

The v2 commands keep working for existing OpenClaw users.

Providers

Provider                Status    Notes
OpenRouter              shipped   full surface — chat, streaming, tools, vision, structured outputs
NVIDIA NIM              shipped   curated free-model allowlist; NVIDIA_NIM_FREE_MODELS_OVERRIDE to expand
Groq                    shipped   hardcoded allowlist (Llama 3.x, Gemma 2, Mixtral, DeepSeek-R1-distill); GROQ_FREE_MODELS_OVERRIDE to expand
Cloudflare Workers AI   shipped   curated allowlist of cheap-per-neuron chat models; needs CLOUDFLARE_ACCOUNT_ID
HuggingFace Inference   shipped   full HF router catalog; budget governs access ($0.10/mo Free, $2/mo PRO)
Cerebras                shipped   fastest Llama / Qwen inference; chat-only (no embeddings); CEREBRAS_FREE_MODELS_OVERRIDE to restrict catalog
Ollama (local)          shipped   local-only; mix with remote providers in the same failover chain; set OLLAMA_BASE_URL to opt in

Adding another provider: implement freeride.core.provider.Provider (api_version=1) in freeride/providers/<name>.py, register it in the conformance suite, done. See CONTRIBUTING.md.

Agents

Agent                                freeride bind                Hot reload
OpenClaw                             yes                          needs restart
Aider                                yes (--scope home/cwd/git)   needs restart
Continue                             yes                          yes
Hermes (NousResearch/hermes-agent)   yes                          needs restart

Or anything else: OPENAI_API_BASE=http://localhost:11343/v1 + OPENAI_API_KEY=any.

Claude Code

Two ways FreeRide plays with Claude Code:

1. freeride run claude — companion mode (the main path)

freeride run claude

Wraps a Claude Code session so free providers are available alongside your subscription. Your Pro/Max OAuth (or ANTHROPIC_API_KEY) is preserved. Inside the session, flip per request via /model:

You type                  What happens
/model claude-opus-4-7    Your subscription answers (passthrough to api.anthropic.com).
/model freeride/free      Free provider answers via smart-routing.
/model freeride/fast      Free, prefers groq (low TTFT).
/model freeride/quality   Free, prefers OpenRouter (widest catalog).
/model freeride/coding    Free, prefers code-tuned models (Qwen-Coder, DeepSeek).

Plain claude (no wrapper) goes direct to Anthropic — FreeRide is invisible. The wrapper sets ANTHROPIC_BASE_URL for the child process only; nothing system-wide changes.
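The child-only override is the standard subprocess-environment trick. A sketch, with an assumed gateway URL (the wrapper's exact base URL may differ):

```python
# Sketch of "set ANTHROPIC_BASE_URL for the child process only":
# copy the parent environment, override one variable, and launch.
# The parent shell and everything else on the system are untouched.
# The URL here is illustrative, not necessarily what `freeride run` uses.
import os
import subprocess

def run_with_freeride(cmd: list[str]) -> subprocess.CompletedProcess:
    env = dict(os.environ, ANTHROPIC_BASE_URL="http://localhost:11343")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```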

Probe the setup: freeride doctor --claude-code.

Full guide: docs/claude-code.md.

2. Skill / plugin install (in-Claude awareness)

If you want Claude itself to know about FreeRide (detect it running, suggest the wrapper, help troubleshoot):

/plugin install https://github.com/Shaivpidadi/FreeRideV3

See skills/README.md for manual-install instructions.

Docs

License

MIT.
