awesome-ai-gateway
Health Passed
- License — License: CC0-1.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 10 GitHub stars
Code Passed
- Code scan — Scanned 11 files during light audit, no dangerous patterns found
Permissions Passed
- Permissions — No dangerous permissions requested
⚡ Curated list of AI gateways & LLM proxies — LiteLLM, OpenRouter, Portkey, Kong, Higress, new-api… compared by cost, compliance, self-hosting & routing. Updated daily.
Awesome AI Gateway 
A curated list of AI gateways & LLM proxies — LiteLLM, OpenRouter, Portkey, Kong, Higress, new-api and 50+ more — organized by what you actually need: lowest cost, compliance, self-hosting, smart routing, or China-ecosystem support. Star counts and releases refresh daily via CI.
Languages: English · 简体中文
An AI gateway sits between your code and LLM providers: one endpoint, one key, many models. It handles routing, failover, caching, rate limits, cost tracking and guardrails — so you change a base_url instead of rewriting your app for every provider.
📊 New: Evaluation Set →
Not just a link list — a sourced, reproducible evaluation layer: model benchmarks (AA Index, GPQA, SWE-bench, Arena), real-world token cost computed by a unit-tested script (writing a 100K-token report costs $0.03 on DeepSeek vs $3.01 on GPT-5.5 — a 106× spread), and a gateway scorecard rating compliance · price · security · stability. Read it →
Contents
- 📊 Evaluation set: model benchmarks · token cost · gateway scorecard
- Which gateway should I use? (decision tree)
- Quick comparison
- 💰 Cost-first: cheapest multi-model access
- 🔓 Self-hosted open source
- 🏢 Enterprise & compliance
- ☁️ First-party gateways (cloud & model vendors)
- 🇨🇳 China ecosystem
- 🧠 Smart routing & model selection
- 📊 Observability & cost tracking
- 🤖 MCP & agent gateways
- ☸️ Kubernetes-native & inference infra
- 📰 What's new
- 🚀 Recent releases (auto-updated)
- How to choose safely
- Contributing
Which gateway should I use?
Do you want to self-host?
│
├─ NO — hosted, minimal ops
│ ├─ Cheapest access to many models ──────────▶ OpenRouter · Vercel AI Gateway (0% markup)
│ ├─ Free control plane over your own keys ───▶ Cloudflare AI Gateway
│ ├─ EU data residency matters ───────────────▶ Requesty · Eden AI · nexos.ai
│ └─ Already on one cloud ────────────────────▶ AWS Bedrock · Azure APIM · Vertex AI
│
└─ YES — self-hosted / open source
  ├─ Python stack, broadest features ─────────▶ LiteLLM
  ├─ Raw performance (Go/Rust/TS) ────────────▶ Bifrost · TensorZero · Portkey Gateway
  ├─ Built-in evals + observability ──────────▶ TensorZero · Helicone
  ├─ Key distribution / billing / CN models ──▶ new-api · one-api · GPT-Load
  ├─ Enterprise K8s, audit, guardrails ───────▶ Kong · Higress · APISIX · Envoy AI Gateway
  └─ Governing AI agents & MCP traffic ───────▶ agentgateway · Lunar.dev
Quick start (drop-in)
The whole promise of a gateway: change base_url, keep your OpenAI code. Same request, now with routing, fallback, caching and cost tracking.
from openai import OpenAI

# Pick ONE of the two clients below. If both run, the second assignment replaces the first.

# Hosted example: OpenRouter (400+ models, one key)
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)

# Self-hosted example: a LiteLLM proxy you run
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-litellm-...",
)

resp = client.chat.completions.create(
    model="anthropic/claude-fable-5",  # ask the gateway for any provider's model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)  # text of the first completion
Base URLs for the other gateways are in each project's docs (linked below). Most are OpenAI-compatible, so base_url and api_key are the only two lines that change.
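One way to keep that swap out of application code is to read the gateway settings from the environment. The sketch below is only an illustration: the environment variable names `LLM_GATEWAY` and `LLM_API_KEY` and the helper `client_kwargs` are hypothetical, and the two base URLs are the ones from the quick-start snippet.

```python
import os

# Base URLs taken from the quick-start snippet.
GATEWAYS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "litellm": "http://localhost:4000",
}


def client_kwargs(env=None):
    """Build the keyword arguments for OpenAI(...) from environment settings.

    LLM_GATEWAY and LLM_API_KEY are hypothetical names chosen for this example.
    """
    env = os.environ if env is None else env
    name = env.get("LLM_GATEWAY", "openrouter")
    if name not in GATEWAYS:
        raise ValueError(f"unknown gateway {name!r}; expected one of {sorted(GATEWAYS)}")
    return {"base_url": GATEWAYS[name], "api_key": env.get("LLM_API_KEY", "")}
```

With this in place, `client = OpenAI(**client_kwargs())` would pick up whichever gateway the deployment selects, and the request code stays untouched.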
Quick comparison
Stars auto-refresh daily. ✅ built-in · ➕ via plugin/paid tier · ❌ not available.
| Project | Type | Stars | License | Multi-provider | Fallback / LB | Caching | Guardrails | Cost tracking |
|---|---|---|---|---|---|---|---|---|
| LiteLLM | OSS proxy + SDK | ⭐ 50k | MIT¹ | ✅ 100+ | ✅ | ✅ | ✅ | ✅ |
| new-api | OSS relay/billing | ⭐ 38.3k | AGPL-3.0 | ✅ | ✅ | ➕ | ➕ | ✅ |
| one-api | OSS relay/billing | ⭐ 34.8k | MIT | ✅ | ✅ | ❌ | ❌ | ✅ |
| Kong AI Gateway | OSS API gateway | ⭐ 43.6k | Apache-2.0 | ✅ | ✅ | ✅ semantic | ✅ | ✅ |
| Apache APISIX | OSS API gateway | ⭐ 16.7k | Apache-2.0 | ✅ | ✅ | ➕ | ➕ | ➕ |
| Portkey Gateway | OSS gateway + SaaS | ⭐ 12k | MIT | ✅ 1600+ | ✅ | ✅ | ✅ 50+ | ➕ SaaS |
| TensorZero | OSS LLMOps stack | ⭐ 11.5k | Apache-2.0 | ✅ | ✅ | ✅ | ➕ | ✅ |
| Higress | OSS AI-native gateway | ⭐ 8.6k | Apache-2.0 | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-Load | OSS key-pool proxy | ⭐ 6.2k | MIT | ✅ | ✅ key rotation | ❌ | ❌ | ➕ |
| Bifrost | OSS gateway (Go) | ⭐ 5.7k | Apache-2.0 | ✅ | ✅ adaptive | ✅ | ✅ | ✅ |
| Helicone | OSS observability + gateway | ⭐ 5.8k | Apache-2.0 | ✅ | ✅ | ✅ | ➕ | ✅ |
| Envoy AI Gateway | OSS K8s gateway | ⭐ 1.7k | Apache-2.0 | ✅ | ✅ | ➕ | ➕ | ✅ |
| OpenRouter | SaaS marketplace | — | Commercial | ✅ 400+ | ✅ | ✅ | ➕ | ✅ |
| Vercel AI Gateway | SaaS (0% markup) | — | Commercial | ✅ 100s | ✅ | ❌ | ❌ | ✅ |
| Cloudflare AI Gateway | SaaS control plane | — | Commercial (free tier) | ✅ | ✅ dynamic | ✅ | ✅ | ✅ budgets |
¹ LiteLLM core is MIT; the repo contains a separately licensed enterprise directory.
💰 Cost-first: cheapest multi-model access
Pain point: "I want many models for the least money and zero ops."
- OpenRouter — The dominant model marketplace: 400+ models behind one OpenAI-compatible API, pay-as-you-go with automatic failover; ~5.5% fee when buying credits. $113M Series B (May 2026), ~8M users.
- Vercel AI Gateway — Hundreds of models at provider list price (0% markup), $5/month free credits, zero-data-retention option; pairs naturally with the AI SDK.
- Cloudflare AI Gateway — Free control plane in front of your own provider keys: caching, dynamic routing, unified billing, and dollar-denominated spend limits (2026 beta).
- Requesty — EU-friendly OpenRouter alternative: 400+ models, sub-20ms failover, ~5% markup.
- Eden AI — Unified API for 500+ models plus vision/OCR/speech; EU-based, ~5.5% platform fee.
- Helicone AI Gateway (cloud) — Passthrough billing at 0% markup with observability bundled.
- GPT-Load ⭐ 6.2k — High-performance Go proxy that rotates pools of API keys across channels to maximize quota usage.
💡 Squeeze more from any gateway: enable semantic caching (Kong, Bifrost, Zuplo), set spend limits (Cloudflare, Zuplo, Pydantic/Logfire), and route easy prompts to cheap models (see Smart routing).
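To make the spend-limit idea concrete, here is a minimal sketch of what such a limit does conceptually: it tracks cumulative cost and refuses any request that would cross the cap. The `SpendLimit` class and its method names are hypothetical and do not mirror the API of any gateway listed here.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push total spend past the cap."""


class SpendLimit:
    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        """Record a cost and return the remaining budget.

        Raises BudgetExceeded, without recording anything, if the cost
        would exceed the cap.
        """
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"{cost_usd} USD would exceed the {self.limit_usd} USD limit"
            )
        self.spent_usd += cost_usd
        return self.limit_usd - self.spent_usd
```

A real gateway would usually enforce this per key, team, or project, and persist the counter rather than keep it in memory.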
🔓 Self-hosted open source
Pain point: "My keys, my infra, no per-token middleman fee."
- LiteLLM ⭐ 50k — The default choice: Python SDK + proxy server speaking OpenAI format to 100+ providers, with virtual keys, budgets, load balancing and guardrails.
- Portkey Gateway ⭐ 12k — Fast TypeScript gateway (1,600+ models, 50+ guardrails) that also powers Portkey's commercial LLMOps platform.
- TensorZero ⭐ 11.5k — Rust gateway unified with observability, evals, experimentation and optimization — built around a data/feedback flywheel.
- Bifrost ⭐ 5.7k — Go gateway from Maxim AI claiming ~50x LiteLLM throughput; adaptive load balancing, cluster mode, MCP support.
- Helicone ⭐ 5.8k — Observability-first platform (YC W23) with a Rust ai-gateway ⭐ 600.
- Plano ⭐ 6.6k — AI-native proxy and data plane for agents (formerly Arch Gateway / archgw).
- LLM Gateway ⭐ 1.3k — Open-source OpenRouter alternative: route, manage and analyze requests across providers.
- APIPark ⭐ 1.8k — Cloud-native LLM API management and distribution platform.
- Pydantic AI Gateway ⭐ 189 — BYOK gateway with cost caps and OTel, now folded into Pydantic Logfire.
- OptiLLM ⭐ 4.1k — Optimizing inference proxy that boosts accuracy via test-time compute techniques.
- aisuite ⭐ 13.8k — Andrew Ng's unified multi-provider client. A library rather than a deployable proxy — fits when you don't want network hops.
- ⚠️ Stale but historically notable: BricksLLM ⭐ 1.2k (PII masking, per-key limits; inactive since early 2025), Glide ⭐ 160 (inactive since 2024).
🏢 Enterprise & compliance
Pain point: "Audit logs, PII redaction, RBAC, on-prem, and the EU AI Act (enforceable Aug 2026)."
- Kong AI Gateway ⭐ 43.6k — Mature API gateway with AI plugins: semantic caching/routing, prompt guard, token rate-limiting; Konnect for managed control plane.
- Apache APISIX ⭐ 16.7k — Cloud-native API + AI gateway with ai-proxy / ai-proxy-multi plugins.
- Envoy AI Gateway ⭐ 1.7k — CNCF-aligned GenAI access on Envoy Gateway, backed by Tetrate and Bloomberg.
- kgateway ⭐ 5.6k — CNCF API/AI gateway, the base of Solo.io's commercial Gloo AI Gateway.
- TrueFoundry AI Gateway — Enterprise gateway with routing, guardrails and RBAC, deployable into your K8s/VPC.
- nexos.ai — Enterprise AI gateway/orchestration from the Nord Security founders (€30M Series A, Oct 2025).
- Tyk AI Studio — AI governance suite: budgets, model catalogs, guardrails on Tyk's gateway.
- Gravitee Agent Mesh — LLM Proxy, MCP Proxy and A2A support inside Gravitee APIM.
- WSO2 AI Gateway — Egress management for LLM traffic: model routing, semantic caching, guardrails.
- F5 AI Gateway — Containerized AI traffic gateway; data-leakage detection via the LeakSignal acquisition (Nov 2025).
- IBM API Connect AI Gateway — Policy enforcement, masking and audit for LLM traffic.
- MuleSoft AI / Omni Gateway — Governs LLM, MCP and agent traffic alongside classic APIs.
- Lunar.dev ⭐ 453 — Egress consumption gateway repositioned around MCP/agent governance.
☁️ First-party gateways (cloud & model vendors)
Pain point: "We're already committed to one cloud — give us the native path."
- AWS Bedrock — Multi-model access via the unified Converse API, cross-region inference, and AgentCore Gateway for tools/MCP.
- Azure API Management — GenAI gateway — Token limits, semantic caching and load balancing in front of Azure OpenAI / AI Foundry.
- Google Apigee + Vertex AI — LLM gateway patterns on Apigee with Vertex Model Garden as the managed hub.
- Cloudflare AI Gateway — See Cost-first; the strongest free first-party option.
- Vercel AI Gateway — GA, 0% markup, ZDR option; the default for Next.js/AI SDK shops.
- Databricks Unity AI Gateway — Mosaic AI Gateway folded into Unity Catalog, adding agent + MCP governance.
🇨🇳 China ecosystem
Pain point: "Domestic models (Qwen/DeepSeek/GLM/Kimi), CNY payment, key distribution & billing for teams."
- new-api ⭐ 38.3k — The most active one-api fork, now a "unified AI model hub": protocol conversion, billing, Rerank/Realtime endpoints. AGPL-3.0.
- one-api ⭐ 34.8k — The original LLM API management and distribution system (OpenAI/Azure/Claude/Gemini/DeepSeek/豆包…); development has slowed.
- Higress ⭐ 8.6k — Alibaba's AI-native gateway on Envoy/Istio, first-class 通义/DeepSeek support; hosted version at higress.ai.
- GPT-Load ⭐ 6.2k — Multi-channel proxy in Go with intelligent key rotation (智能密钥轮询).
- one-hub ⭐ 2.8k — one-api fork with better non-OpenAI function calling and stats.
- simple-one-api ⭐ 2.3k — Single binary adapting 千帆/星火/混元/MiniMax/DeepSeek to the OpenAI interface.
- Veloera ⭐ 1.6k — Newer relay platform in the one-api/new-api lineage.
- uni-api ⭐ 1.2k — Lightweight single-config unified API manager, no frontend.
- APIPark ⭐ 1.8k — China-origin, cloud-native AI & API gateway with an open developer portal.
⚠️ This list deliberately excludes reverse-engineered "free-api" relays (ToS violations, account risk). For price comparisons of commercial relay services (中转站), see awesome-ai-api-proxy.
🧠 Smart routing & model selection
Pain point: "Send each prompt to the cheapest model that can handle it."
- Not Diamond — SOTA model-routing intelligence; powers OpenRouter's Auto router.
- Martian — Pioneer commercial model router; Accenture partnership.
- RouteLLM ⭐ 5k — LMSYS's open router framework (research-grade; inactive since 2024 but still the canonical paper/code).
- OpenRouter Auto — One model id (openrouter/auto) that routes per-prompt.
- Unify — Early neural LLM router (company since pivoted to agents).
- Bifrost adaptive load balancing / Cloudflare dynamic routing — routing built into gateways themselves.
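The routers above use learned models to decide where a prompt goes. As a rough illustration of the underlying idea only, the sketch below uses a toy heuristic (prompt length plus a few keywords). The function name, the hint list, and the model ids `cheap-model` and `flagship-model` are all hypothetical placeholders.

```python
# Keywords that, in this toy example, suggest a prompt needs a stronger model.
HARD_HINTS = ("prove", "debug", "refactor", "step by step")


def pick_model(prompt, cheap="cheap-model", flagship="flagship-model",
               max_easy_chars=500):
    """Return the cheap model id unless the prompt looks hard.

    "Hard" here means longer than max_easy_chars or containing one of
    HARD_HINTS; real routers replace this rule with a trained classifier.
    """
    text = prompt.lower()
    looks_hard = len(prompt) > max_easy_chars or any(h in text for h in HARD_HINTS)
    return flagship if looks_hard else cheap
```

The returned id would then be passed as the `model` argument of the chat completion request.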
📊 Observability & cost tracking
Pain point: "Who spent what, on which model, and why did quality drop?"
- Helicone ⭐ 5.8k — Logs, costs, sessions, prompt experiments; one-line proxy integration.
- TensorZero ⭐ 11.5k — Gateway + observability + evals in one Rust binary, data stays in your ClickHouse.
- Portkey — Full LLMOps suite over its OSS gateway: traces, budgets, prompt management.
- vLLora (ex-LangDB) ⭐ 802 — Agent debugging and observability from the LangDB team.
- Braintrust Proxy ⭐ 398 — Caching proxy wired into Braintrust evals.
- MLflow AI Gateway ⭐ 26.4k — Unified endpoints + governance inside the MLflow platform.
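The "who spent what, on which model" question boils down to aggregating per-request usage records. A minimal sketch, assuming each record is a dict with hypothetical `user`, `model`, and `cost_usd` fields (the tools above define their own schemas):

```python
from collections import defaultdict


def spend_by_user_and_model(records):
    """Sum cost_usd for every (user, model) pair in the records."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["user"], record["model"])] += record["cost_usd"]
    return dict(totals)
```

For example, two requests by the same user on the same model collapse into a single total keyed by `(user, model)`.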
🤖 MCP & agent gateways
Pain point: "Agents call tools now — govern MCP traffic like you govern APIs." The newest category (2025–2026).
- agentgateway ⭐ 3.2k — CNCF proxy for agentic traffic: MCP governance and agent-to-agent (A2A) communication.
- Lunar.dev MCPX ⭐ 453 — Gateway for managing MCP server consumption.
- Tetrate Agent Router Service — Managed Envoy AI Gateway fleet: LLM + MCP gateway with guardrails (~5% fee).
- Zuplo AI Gateway — Programmable policies: USD spend limits, prompt-injection detection, secret masking, MCP support.
- NetFoundry MCP/LLM Gateways — Zero-trust gateways for AI deployments (launched June 2026).
- AWS AgentCore Gateway — Tool/MCP gateway inside Bedrock AgentCore.
☸️ Kubernetes-native & inference infra
Pain point: "Routing to self-hosted models (vLLM/Ollama) inside the cluster, GPU-aware."
- Gateway API Inference Extension ⭐ 690 — The Kubernetes standard for inference-aware routing.
- AIBrix ⭐ 4.9k — Cost-efficient control plane for vLLM on K8s (ByteDance-origin).
- llm-d ⭐ 3.3k — K8s-native distributed inference serving (Red Hat/Google/IBM-backed).
- Higress ⭐ 8.6k / Kong ⭐ 43.6k / Envoy AI Gateway ⭐ 1.7k — all implement inference-extension-style routing.
- Traefik Hub AI Gateway — LLM routing/security in Traefik's commercial runtime.
- Inference Gateway ⭐ 124 — Small cloud-native gateway unifying cloud + local (Ollama) providers.
📰 What's new
Curated monthly. Last review: 2026-06-11.
- 2026-05 · OpenRouter raised a $113M Series B led by CapitalG at a $1.3B valuation — ~8M users, ~100T tokens/month. (TechCrunch)
- 2026-06 · NetFoundry launched zero-trust MCP and LLM gateways; Cisco Investments joined its Series A. (PR Newswire)
- 2026 · Cloudflare AI Gateway shipped dollar-denominated spend limits (public beta) on top of dynamic routing and unified billing. (Cloudflare blog)
- 2025-11 · Pydantic AI Gateway went open beta and has since merged into Logfire; F5 added data-leakage detection to its AI Gateway via the LeakSignal acquisition. (Pydantic Logfire, F5)
- Trend · MCP gateways emerged as a distinct category; spend-limit enforcement became table stakes; the EU AI Act (enforceable Aug 2026) is driving the compliance bucket; new-api overtook one-api as the most active China-ecosystem relay.
🚀 Recent releases (auto-updated)
- 2026-06-11 · yym68686/uni-api v1.7.114 — Release 1.7.114
- 2026-06-11 · andrewyng/aisuite app-v0.1.0 — OpenCoworker 0.1.0
- 2026-06-09 · katanemo/plano 0.4.24 — 0.4.24
- 2026-06-09 · BerriAI/litellm v1.88.1 — v1.88.1
- 2026-06-08 · maximhq/bifrost ent-v1.4.8-base — Enterprise v1.4.8 base
- 2026-06-06 · envoyproxy/ai-gateway v0.7.0 — v0.7.0
- 2026-06-04 · kgateway-dev/kgateway v2.3.2 — v2.3.2
- 2026-06-04 · tensorzero/tensorzero 2026.6.0 — 2026.6.0
- 2026-06-04 · Kong/kong 3.9.2 — 3.9.2
- 2026-06-01 · mlflow/mlflow v3.13.0 — v3.13.0
- 2026-05-29 · tbphp/gpt-load v1.4.8 — v1.4.8
- 2026-05-26 · QuantumNous/new-api v1.0.0-rc.10 — v1.0.0-rc.10
How to choose safely
- Check the markup. Marketplaces charge 0–6% — for high volume, self-hosting or 0%-markup gateways (Vercel, Helicone cloud) pay for themselves fast.
- Verify model fidelity. Some relays silently downgrade models. Send a canary prompt (e.g. a known hard reasoning question) both through the gateway and directly to the provider, then diff the responses.
- Mind data flow. Every gateway sees your prompts. For sensitive data: self-host, or require ZDR (zero data retention) in writing.
- License check before embedding. new-api is AGPL-3.0; LiteLLM has an enterprise-licensed directory; "open core" ≠ everything free.
- Project health. Star count ≠ maintenance. Check last release date — several once-popular gateways (BricksLLM, Glide, RouteLLM) are effectively unmaintained; this list labels them.
- Avoid gray-market relays reselling reverse-engineered or stolen-quota access — account bans and data leaks are your risk, not theirs.
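The canary check in the list above can be sketched as a small helper. It takes two callables, one per path, so it stays independent of any particular SDK; `canary_check` and its parameters are hypothetical names. Model output is not deterministic, so the default exact-match comparison only makes sense for prompts with a single short answer.

```python
def canary_check(prompt, via_gateway, via_provider, same=None):
    """Send one prompt down both paths and report whether the answers agree.

    via_gateway and via_provider are callables that take the prompt and
    return the response text. same is an optional comparison function.
    """
    if same is None:
        # Default: exact match after trimming whitespace and ignoring case.
        same = lambda a, b: a.strip().lower() == b.strip().lower()
    gateway_answer = via_gateway(prompt)
    provider_answer = via_provider(prompt)
    return {
        "match": same(gateway_answer, provider_answer),
        "gateway": gateway_answer,
        "provider": provider_answer,
    }
```

For free-form answers, pass a looser `same` function, for instance one that checks whether both responses contain the expected final result.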
FAQ
What is an AI gateway (LLM gateway)?
A proxy between your code and LLM providers: one OpenAI-compatible endpoint and key for many models, adding routing, failover, caching, rate limits, cost tracking and guardrails. See the intro.
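To illustrate the failover part of that definition, the following sketch tries providers in order and returns the first successful response. It is a simplified picture, not how any listed gateway is implemented; `call_with_failover` and the `(name, call)` pair shape are assumptions made for this example.

```python
def call_with_failover(prompt, providers):
    """Try each (name, call) pair in order and return the first success.

    providers is a list of (name, callable) pairs; each callable takes the
    prompt and returns the response text, or raises on failure.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Simplification: a production gateway would usually fail over
            # only on retryable errors such as timeouts or rate limits.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```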
AI gateway vs LLM router — what's the difference?
A router decides which model gets each request (e.g. cheap vs flagship); a gateway is the full proxy layer (auth, caching, observability, guardrails) that usually includes routing. See smart routing.
What's the best open-source AI gateway?
LiteLLM is the default for breadth (Python, 100+ providers). For raw performance pick Bifrost (Go) or TensorZero (Rust); for enterprise K8s pick Kong or Higress. Full list under self-hosted.
LiteLLM vs OpenRouter — which should I use?
OpenRouter is hosted (zero ops, ~5.5% fee, 400+ models); LiteLLM is self-hosted (your keys, your infra, $0 markup). Hosted to start, self-host when volume justifies it. Cost math in the evaluation set.
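A back-of-the-envelope way to judge "when volume justifies it" is to find the monthly provider spend at which a percentage fee equals the fixed cost of self-hosting. The sketch below uses the ~5.5% fee quoted above as the default. The self-hosting cost is an input you must estimate yourself, and the calculation ignores engineering time.

```python
def break_even_monthly_spend(self_host_cost_usd, fee_rate=0.055):
    """Monthly provider spend at which the gateway fee equals the hosting cost.

    self_host_cost_usd is your own estimate of the monthly cost of running
    a gateway; fee_rate is the hosted gateway's fee as a fraction.
    """
    if fee_rate <= 0:
        raise ValueError("fee_rate must be positive")
    return self_host_cost_usd / fee_rate
```

For instance, with a hypothetical hosting cost of 110 USD per month, the fee matches that cost at roughly 2,000 USD of monthly spend; below that, the hosted fee is the cheaper of the two.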
What's the cheapest way to call many LLMs?
For zero ops: Vercel AI Gateway or Cloudflare AI Gateway (0% markup). For lowest token cost, route bulk work to cheap models — a 100K-token report runs $0.03 on DeepSeek vs $3.01 on GPT-5.5. See cost-first.
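Per-request cost figures like these follow from token counts and per-million-token prices. The helper below is a minimal sketch of that arithmetic and is not the repository's own script; the function name is hypothetical, and any prices you pass in should come from the provider's current price list.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m):
    """Cost of one request in USD, given prices per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000
```

As a purely illustrative calculation with placeholder prices of 1 USD per million input tokens and 2 USD per million output tokens, a request with 10,000 input and 100,000 output tokens would cost about 0.21 USD.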
Are AI gateways safe? Who sees my prompts?
Every gateway sees your prompts. For sensitive data self-host or require zero-data-retention in writing; check the gateway scorecard for compliance/security ratings and known CVEs.
Contributing
Contributions welcome! Please read CONTRIBUTING.md first. Inclusion criteria, in short: the project must be an actual gateway/proxy/router for LLM or agent traffic (not an SDK wrapper or chat UI), publicly available, and active within the last 12 months — or clearly labeled as stale.
License
To the extent possible under law, the contributors have waived all copyright and related rights to this work.
