awesome-ai-gateway

Awesome AI Gateway

A curated list of AI gateways & LLM proxies — LiteLLM, OpenRouter, Portkey, Kong, Higress, new-api and 50+ more — organized by what you actually need: lowest cost, compliance, self-hosting, smart routing, or China-ecosystem support. Star counts and releases refresh daily via CI.

Languages: English · Simplified Chinese (简体中文)

An AI gateway sits between your code and LLM providers: one endpoint, one key, many models. It handles routing, failover, caching, rate limits, cost tracking and guardrails — so you change a base_url instead of rewriting your app for every provider.

📊 New: Evaluation Set →

Not just a link list — a sourced, reproducible evaluation layer: model benchmarks (AA Index, GPQA, SWE-bench, Arena), real-world token cost computed by a unit-tested script (writing a 100K-token report costs $0.03 on DeepSeek vs $3.01 on GPT-5.5 — a 106× spread), and a gateway scorecard rating compliance · price · security · stability. Read it →
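
The cost comparison above is plain arithmetic: token count times a per-million-token price. A minimal sketch follows. The per-million rates are not taken from the evaluation set; they are blended rates back-derived from the two quoted totals, so treat them as illustrative assumptions rather than provider prices.

```python
def report_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token count at a blended per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million

# Assumed blended rates, back-derived from the quoted totals (not official prices).
CHEAP_RATE = 0.30      # USD per 1M tokens, about $0.03 per 100K tokens
FLAGSHIP_RATE = 30.10  # USD per 1M tokens, about $3.01 per 100K tokens

cheap = report_cost(100_000, CHEAP_RATE)
flagship = report_cost(100_000, FLAGSHIP_RATE)
spread = flagship / cheap
print(f"${cheap:.2f} vs ${flagship:.2f}, about {spread:.0f}x")
```

With the rounded totals the ratio comes out near 100x; the 106× figure quoted above presumably reflects unrounded costs.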


Which gateway should I use?

Do you want to self-host?
│
├─ NO — hosted, minimal ops
│   ├─ Cheapest access to many models ──────────▶ OpenRouter · Vercel AI Gateway (0% markup)
│   ├─ Free control plane over your own keys ───▶ Cloudflare AI Gateway
│   ├─ EU data residency matters ───────────────▶ Requesty · Eden AI · nexos.ai
│   └─ Already on one cloud ────────────────────▶ AWS Bedrock · Azure APIM · Vertex AI
│
└─ YES — self-hosted / open source
    ├─ Python stack, broadest features ─────────▶ LiteLLM
    ├─ Raw performance (Go/Rust/TS) ────────────▶ Bifrost · TensorZero · Portkey Gateway
    ├─ Built-in evals + observability ──────────▶ TensorZero · Helicone
    ├─ Key distribution / billing / CN models ──▶ new-api · one-api · GPT-Load
    ├─ Enterprise K8s, audit, guardrails ───────▶ Kong · Higress · APISIX · Envoy AI Gateway
    └─ Governing AI agents & MCP traffic ───────▶ agentgateway · Lunar.dev

Quick start (drop-in)

The whole promise of a gateway: change base_url, keep your OpenAI code. Same request, now with routing, fallback, caching and cost tracking.

from openai import OpenAI

# Hosted example — OpenRouter (400+ models, one key):
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)

# Self-hosted alternative: a LiteLLM proxy you run (create one client or the other, not both):
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-litellm-...",
)

resp = client.chat.completions.create(
    model="anthropic/claude-fable-5",        # ask the gateway for any provider's model
    messages=[{"role": "user", "content": "Hello!"}],
)

Base URLs for the rest are on each project's docs (linked below). Most are OpenAI-compatible, so the only lines that change are base_url and api_key.
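
Because only the base URL and key differ, the choice of gateway can live in configuration rather than code. A minimal sketch, assuming hypothetical environment variable names (`AI_GATEWAY`, `OPENROUTER_API_KEY`, `LITELLM_API_KEY`) and the two base URLs from the quick start:

```python
import os

# Base URLs come from the quick start above; the env var names are hypothetical.
GATEWAYS = {
    "openrouter": {"base_url": "https://openrouter.ai/api/v1", "key_env": "OPENROUTER_API_KEY"},
    "litellm": {"base_url": "http://localhost:4000", "key_env": "LITELLM_API_KEY"},
}

def gateway_kwargs(name=None, env=None):
    """Return the base_url/api_key pair to pass to OpenAI(**kwargs)."""
    env = os.environ if env is None else env
    name = name or env.get("AI_GATEWAY", "openrouter")
    if name not in GATEWAYS:
        raise ValueError(f"unknown gateway {name!r}; expected one of {sorted(GATEWAYS)}")
    cfg = GATEWAYS[name]
    return {"base_url": cfg["base_url"], "api_key": env.get(cfg["key_env"], "")}

# Usage (requires the openai package and a real key):
#   from openai import OpenAI
#   client = OpenAI(**gateway_kwargs())
```

Switching gateways then means changing one environment variable, and the request code stays untouched.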

Quick comparison

Stars auto-refresh daily. ✅ built-in · ➕ via plugin/paid tier · ❌ not available.

| Project | Type | Stars | License | Multi-provider · Fallback / LB · Caching · Guardrails · Cost tracking |
| --- | --- | --- | --- | --- |
| LiteLLM | OSS proxy + SDK | ⭐ 50k | MIT¹ | ✅ 100+ |
| new-api | OSS relay/billing | ⭐ 38.3k | AGPL-3.0 | |
| one-api | OSS relay/billing | ⭐ 34.8k | MIT | |
| Kong AI Gateway | OSS API gateway | ⭐ 43.6k | Apache-2.0 | ✅ semantic |
| Apache APISIX | OSS API gateway | ⭐ 16.7k | Apache-2.0 | |
| Portkey Gateway | OSS gateway + SaaS | ⭐ 12k | MIT | ✅ 1600+, ✅ 50+, ➕ SaaS |
| TensorZero | OSS LLMOps stack | ⭐ 11.5k | Apache-2.0 | |
| Higress | OSS AI-native gateway | ⭐ 8.6k | Apache-2.0 | |
| GPT-Load | OSS key-pool proxy | ⭐ 6.2k | MIT | ✅ key rotation |
| Bifrost | OSS gateway (Go) | ⭐ 5.7k | Apache-2.0 | ✅ adaptive |
| Helicone | OSS observability + gateway | ⭐ 5.8k | Apache-2.0 | |
| Envoy AI Gateway | OSS K8s gateway | ⭐ 1.7k | Apache-2.0 | |
| OpenRouter | SaaS marketplace | | Commercial | ✅ 400+ |
| Vercel AI Gateway | SaaS (0% markup) | | Commercial | ✅ 100s |
| Cloudflare AI Gateway | SaaS control plane | | Commercial (free tier) | ✅ dynamic, ✅ budgets |

¹ LiteLLM core is MIT; the repo contains a separately licensed enterprise directory.

💰 Cost-first: cheapest multi-model access

Pain point: "I want many models for the least money and zero ops."

  • OpenRouter — The dominant model marketplace: 400+ models behind one OpenAI-compatible API, pay-as-you-go with automatic failover; ~5.5% fee when buying credits. $113M Series B (May 2026), ~8M users.
  • Vercel AI Gateway — Hundreds of models at provider list price (0% markup), $5/month free credits, zero-data-retention option; pairs naturally with the AI SDK.
  • Cloudflare AI Gateway — Free control plane in front of your own provider keys: caching, dynamic routing, unified billing, and dollar-denominated spend limits (2026 beta).
  • Requesty — EU-friendly OpenRouter alternative: 400+ models, sub-20ms failover, ~5% markup.
  • Eden AI — Unified API for 500+ models plus vision/OCR/speech; EU-based, ~5.5% platform fee.
  • Helicone AI Gateway (cloud) — Passthrough billing at 0% markup with observability bundled.
  • GPT-Load ⭐ 6.2k — High-performance Go proxy that rotates pools of API keys across channels to maximize quota usage.

💡 Squeeze more from any gateway: enable semantic caching (Kong, Bifrost, Zuplo), set spend limits (Cloudflare, Zuplo, Pydantic/Logfire), and route easy prompts to cheap models (see Smart routing).
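
Spend limits are easy to reason about even when a gateway does not offer them. The sketch below is a client-side budget guard, not any gateway's actual API; the class names, the per-call cost input and the $1.00 limit are all illustrative assumptions.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the configured limit."""

class SpendLimiter:
    """Tracks cumulative USD spend and rejects calls that would exceed a cap."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> float:
        """Record a call's cost and return the remaining budget."""
        if cost_usd < 0:
            raise ValueError("cost_usd must be non-negative")
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"${cost_usd:.4f} would exceed the ${self.limit_usd:.2f} limit"
            )
        self.spent_usd += cost_usd
        return self.limit_usd - self.spent_usd

limiter = SpendLimiter(limit_usd=1.00)
limiter.charge(0.40)
limiter.charge(0.50)
try:
    limiter.charge(0.25)  # 0.90 + 0.25 > 1.00, so this call is refused
    blocked = False
except BudgetExceeded:
    blocked = True
```

A gateway-side limit is still preferable where available, since it also covers callers that bypass this wrapper.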

🔓 Self-hosted open source

Pain point: "My keys, my infra, no per-token middleman fee."

  • LiteLLM ⭐ 50k — The default choice: Python SDK + proxy server speaking OpenAI format to 100+ providers, with virtual keys, budgets, load balancing and guardrails.
  • Portkey Gateway ⭐ 12k — Fast TypeScript gateway (1,600+ models, 50+ guardrails) that also powers Portkey's commercial LLMOps platform.
  • TensorZero ⭐ 11.5k — Rust gateway unified with observability, evals, experimentation and optimization — built around a data/feedback flywheel.
  • Bifrost ⭐ 5.7k — Go gateway from Maxim AI claiming ~50x LiteLLM throughput; adaptive load balancing, cluster mode, MCP support.
  • Helicone ⭐ 5.8k — Observability-first platform (YC W23) with a Rust ai-gateway ⭐ 600.
  • Plano ⭐ 6.6k — AI-native proxy and data plane for agents (formerly Arch Gateway / archgw).
  • LLM Gateway ⭐ 1.3k — Open-source OpenRouter alternative: route, manage and analyze requests across providers.
  • APIPark ⭐ 1.8k — Cloud-native LLM API management and distribution platform.
  • Pydantic AI Gateway ⭐ 189 — BYOK gateway with cost caps and OTel, now folded into Pydantic Logfire.
  • OptiLLM ⭐ 4.1k — Optimizing inference proxy that boosts accuracy via test-time compute techniques.
  • aisuite ⭐ 13.8k — Andrew Ng's unified multi-provider client. A library rather than a deployable proxy — fits when you don't want network hops.
  • ⚠️ Stale but historically notable: BricksLLM ⭐ 1.2k (PII masking, per-key limits; inactive since early 2025), Glide ⭐ 160 (inactive since 2024).

🏢 Enterprise & compliance

Pain point: "Audit logs, PII redaction, RBAC, on-prem, and the EU AI Act (enforceable Aug 2026)."

  • Kong AI Gateway ⭐ 43.6k — Mature API gateway with AI plugins: semantic caching/routing, prompt guard, token rate-limiting; Konnect for managed control plane.
  • Apache APISIX ⭐ 16.7k — Cloud-native API + AI gateway with ai-proxy / ai-proxy-multi plugins.
  • Envoy AI Gateway ⭐ 1.7k — CNCF-aligned GenAI access on Envoy Gateway, backed by Tetrate and Bloomberg.
  • kgateway ⭐ 5.6k — CNCF API/AI gateway, the base of Solo.io's commercial Gloo AI Gateway.
  • TrueFoundry AI Gateway — Enterprise gateway with routing, guardrails and RBAC, deployable into your K8s/VPC.
  • nexos.ai — Enterprise AI gateway/orchestration from the Nord Security founders (€30M Series A, Oct 2025).
  • Tyk AI Studio — AI governance suite: budgets, model catalogs, guardrails on Tyk's gateway.
  • Gravitee Agent Mesh — LLM Proxy, MCP Proxy and A2A support inside Gravitee APIM.
  • WSO2 AI Gateway — Egress management for LLM traffic: model routing, semantic caching, guardrails.
  • F5 AI Gateway — Containerized AI traffic gateway; data-leakage detection via the LeakSignal acquisition (Nov 2025).
  • IBM API Connect AI Gateway — Policy enforcement, masking and audit for LLM traffic.
  • MuleSoft AI / Omni Gateway — Governs LLM, MCP and agent traffic alongside classic APIs.
  • Lunar.dev ⭐ 453 — Egress consumption gateway repositioned around MCP/agent governance.

☁️ First-party gateways (cloud & model vendors)

Pain point: "We're already committed to one cloud — give us the native path."

🇨🇳 China ecosystem

Pain point: "Domestic models (Qwen/DeepSeek/GLM/Kimi), CNY payment, key distribution & billing for teams."

  • new-api ⭐ 38.3k — The most active one-api fork, now a "unified AI model hub": protocol conversion, billing, Rerank/Realtime endpoints. AGPL-3.0.
  • one-api ⭐ 34.8k — The original LLM API management and distribution system (OpenAI/Azure/Claude/Gemini/DeepSeek/Doubao…); development has slowed.
  • Higress ⭐ 8.6k — Alibaba's AI-native gateway on Envoy/Istio, first-class Tongyi/DeepSeek support; hosted version at higress.ai.
  • GPT-Load ⭐ 6.2k — Multi-channel proxy in Go with smart key rotation.
  • one-hub ⭐ 2.8k — one-api fork with better non-OpenAI function calling and stats.
  • simple-one-api ⭐ 2.3k — Single binary adapting Qianfan/Spark/Hunyuan/MiniMax/DeepSeek to the OpenAI interface.
  • Veloera ⭐ 1.6k — Newer relay platform in the one-api/new-api lineage.
  • uni-api ⭐ 1.2k — Lightweight single-config unified API manager, no frontend.
  • APIPark ⭐ 1.8k — China-origin, cloud-native AI & API gateway with an open developer portal.

⚠️ This list deliberately excludes reverse-engineered "free-api" relays (ToS violations, account risk). For price comparisons of commercial relay services (中转站), see awesome-ai-api-proxy.

🧠 Smart routing & model selection

Pain point: "Send each prompt to the cheapest model that can handle it."

  • Not Diamond — SOTA model-routing intelligence; powers OpenRouter's Auto router.
  • Martian — Pioneer commercial model router; Accenture partnership.
  • RouteLLM ⭐ 5k — LMSYS's open router framework (research-grade; inactive since 2024 but still the canonical paper/code).
  • OpenRouter Auto — One model id (openrouter/auto) that routes per-prompt.
  • Unify — Early neural LLM router (company since pivoted to agents).
  • Bifrost adaptive load balancing / Cloudflare dynamic routing — routing built into gateways themselves.
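
To make the routing idea concrete, here is a deliberately naive sketch: a length-and-keyword heuristic that picks between a cheap and a flagship model. The routers listed above are far more sophisticated; the threshold, the hint keywords and the model ids below are placeholders, not real identifiers.

```python
# Placeholder model ids; substitute whatever your gateway exposes.
CHEAP_MODEL = "cheap-model"
FLAGSHIP_MODEL = "flagship-model"

# Hypothetical hints that a prompt needs stronger reasoning.
HARD_HINTS = ("prove", "debug", "refactor", "step by step", "analyze")

def pick_model(prompt: str, max_easy_chars: int = 400) -> str:
    """Route short prompts without hard-task hints to the cheap model."""
    text = prompt.lower()
    looks_hard = len(text) > max_easy_chars or any(h in text for h in HARD_HINTS)
    return FLAGSHIP_MODEL if looks_hard else CHEAP_MODEL

print(pick_model("Translate 'hello' to French"))           # cheap-model
print(pick_model("Debug this race condition in my code"))  # flagship-model
```

The returned id would be passed as the `model` argument of the request; a hosted option such as openrouter/auto makes the same decision server-side.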

📊 Observability & cost tracking

Pain point: "Who spent what, on which model, and why did quality drop?"

  • Helicone ⭐ 5.8k — Logs, costs, sessions, prompt experiments; one-line proxy integration.
  • TensorZero ⭐ 11.5k — Gateway + observability + evals in one Rust binary, data stays in your ClickHouse.
  • Portkey — Full LLMOps suite over its OSS gateway: traces, budgets, prompt management.
  • vLLora (ex-LangDB) ⭐ 802 — Agent debugging and observability from the LangDB team.
  • Braintrust Proxy ⭐ 398 — Caching proxy wired into Braintrust evals.
  • MLflow AI Gateway ⭐ 26.4k — Unified endpoints + governance inside the MLflow platform.

🤖 MCP & agent gateways

Pain point: "Agents call tools now — govern MCP traffic like you govern APIs." The newest category (2025–2026).

☸️ Kubernetes-native & inference infra

Pain point: "Routing to self-hosted models (vLLM/Ollama) inside the cluster, GPU-aware."

  • Gateway API Inference Extension ⭐ 690 — The Kubernetes standard for inference-aware routing.
  • AIBrix ⭐ 4.9k — Cost-efficient control plane for vLLM on K8s (ByteDance-origin).
  • llm-d ⭐ 3.3k — K8s-native distributed inference serving (Red Hat/Google/IBM-backed).
  • Higress ⭐ 8.6k / Kong ⭐ 43.6k / Envoy AI Gateway ⭐ 1.7k — all implement inference-extension-style routing.
  • Traefik Hub AI Gateway — LLM routing/security in Traefik's commercial runtime.
  • Inference Gateway ⭐ 124 — Small cloud-native gateway unifying cloud + local (Ollama) providers.

📰 What's new

Curated monthly. Last review: 2026-06-11.

  • 2026-05 · OpenRouter raised a $113M Series B led by CapitalG at a $1.3B valuation — ~8M users, ~100T tokens/month. (TechCrunch)
  • 2026-06 · NetFoundry launched zero-trust MCP and LLM gateways; Cisco Investments joined its Series A. (PR Newswire)
  • 2026 · Cloudflare AI Gateway shipped dollar-denominated spend limits (public beta) on top of dynamic routing and unified billing. (Cloudflare blog)
  • 2025-11 · Pydantic AI Gateway went open beta and has since merged into Logfire; F5 added data-leakage detection to its AI Gateway via the LeakSignal acquisition. (Pydantic Logfire, F5)
  • Trend · MCP gateways emerged as a distinct category; spend-limit enforcement became table stakes; the EU AI Act (enforceable Aug 2026) is driving the compliance bucket; new-api overtook one-api as the most active China-ecosystem relay.

🚀 Recent releases (auto-updated)

How to choose safely

  1. Check the markup. Marketplaces charge 0–6% — for high volume, self-hosting or 0%-markup gateways (Vercel, Helicone cloud) pay for themselves fast.
  2. Verify model fidelity. Some relays silently downgrade models. Send a canary prompt (e.g. a known hard reasoning question) both through the gateway and directly to the provider, then diff the answers.
  3. Mind data flow. Every gateway sees your prompts. For sensitive data: self-host, or require ZDR (zero data retention) in writing.
  4. License check before embedding. new-api is AGPL-3.0; LiteLLM has an enterprise-licensed directory; "open core" ≠ everything free.
  5. Project health. Star count ≠ maintenance. Check last release date — several once-popular gateways (BricksLLM, Glide, RouteLLM) are effectively unmaintained; this list labels them.
  6. Avoid gray-market relays reselling reverse-engineered or stolen-quota access — account bans and data leaks are your risk, not theirs.
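
The canary check from step 2 can be sketched as below. It assumes both endpoints are OpenAI-compatible and that the clients come from the `openai` Python package; the function names and the 0.6 similarity threshold are illustrative. Model output is not deterministic, so a low score is a reason to investigate, not proof of a downgrade.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough 0..1 text similarity between two answers."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def ask(client, model: str, prompt: str) -> str:
    """Send one prompt through an OpenAI-compatible client and return the text."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduces, but does not eliminate, run-to-run variation
    )
    return resp.choices[0].message.content or ""

def canary_check(gateway_client, direct_client, gateway_model, direct_model,
                 prompt, threshold=0.6):
    """Return (score, suspicious) for one prompt sent via both paths."""
    via_gateway = ask(gateway_client, gateway_model, prompt)
    via_direct = ask(direct_client, direct_model, prompt)
    score = similarity(via_gateway, via_direct)
    return score, score < threshold
```

Running several canary prompts and looking at the distribution of scores is more informative than a single comparison.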

FAQ

What is an AI gateway (LLM gateway)?
A proxy between your code and LLM providers: one OpenAI-compatible endpoint and key for many models, adding routing, failover, caching, rate limits, cost tracking and guardrails. See the intro.

AI gateway vs LLM router — what's the difference?
A router decides which model gets each request (e.g. cheap vs flagship); a gateway is the full proxy layer (auth, caching, observability, guardrails) that usually includes routing. See smart routing.

What's the best open-source AI gateway?
LiteLLM is the default for breadth (Python, 100+ providers). For raw performance pick Bifrost (Go) or TensorZero (Rust); for enterprise K8s pick Kong or Higress. Full list under self-hosted.

LiteLLM vs OpenRouter — which should I use?
OpenRouter is hosted (zero ops, ~5.5% fee, 400+ models); LiteLLM is self-hosted (your keys, your infra, $0 markup). Start hosted, then self-host once volume justifies it. Cost math in the evaluation set.
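
Whether volume justifies self-hosting can be turned into a rough break-even estimate: a hosted fee scales with spend, while self-hosting is closer to a fixed monthly cost. A minimal sketch, using the ~5.5% fee quoted above and an assumed $200/month for servers and upkeep (a made-up figure; plug in your own):

```python
def breakeven_monthly_spend(fee_rate: float, fixed_cost_usd: float) -> float:
    """Monthly model spend at which a percentage fee equals a fixed self-hosting cost."""
    if fee_rate <= 0:
        raise ValueError("fee_rate must be positive")
    return fixed_cost_usd / fee_rate

HOSTED_FEE = 0.055      # ~5.5% fee, as quoted for OpenRouter above
SELF_HOST_COST = 200.0  # assumed USD/month for servers and maintenance

threshold = breakeven_monthly_spend(HOSTED_FEE, SELF_HOST_COST)
print(f"Self-hosting breaks even near ${threshold:,.0f}/month of model spend")
```

Below that spend the hosted fee is the cheaper option under these assumptions; engineering time, which this sketch ignores, usually pushes the threshold higher.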

What's the cheapest way to call many LLMs?
For zero ops: Vercel AI Gateway or Cloudflare AI Gateway (0% markup). For lowest token cost, route bulk work to cheap models — a 100K-token report runs $0.03 on DeepSeek vs $3.01 on GPT-5.5. See cost-first.

Are AI gateways safe? Who sees my prompts?
Every gateway sees your prompts. For sensitive data self-host or require zero-data-retention in writing; check the gateway scorecard for compliance/security ratings and known CVEs.

Contributing

Contributions welcome! Please read CONTRIBUTING.md first. Inclusion criteria, in short: the project must be an actual gateway/proxy/router for LLM or agent traffic (not an SDK wrapper or chat UI), publicly available, and active within the last 12 months — or clearly labeled as stale.

License

CC0

To the extent possible under law, the contributors have waived all copyright and related rights to this work.
