llm-interactive-proxy
Health Pass
- License — AGPL-3.0
- Description — Repository has a description
- Active repo — Last push today
- Community trust — 14 GitHub stars
Code Pass
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
This tool acts as a universal translation, routing, and control layer. It allows you to connect various AI-powered client applications to multiple different inference backends and providers through a single local or shared endpoint.
Security Assessment
As a local proxy designed to intercept and route AI traffic, the application inherently handles potentially sensitive data such as prompts, code context, and API responses. It must also make network requests to external LLM provider APIs. However, the automated scan of 12 files found no hardcoded secrets or dangerous patterns, and the tool requests no dangerous local permissions. Because it manages API keys and routes potentially confidential traffic, overall risk is rated Medium.
Quality Assessment
The project is actively maintained, with its most recent push occurring today. It is licensed under AGPL-3.0, ensuring it remains open source. The repository features a detailed README, maintains continuous integration workflows, and ships an extensive test suite (over 13,000 tests passing). While the community is currently small (14 GitHub stars), the engineering practices demonstrated appear to be of high quality.
Verdict
Safe to use, provided you understand that your API keys and prompt traffic will pass through this proxy layer.
Connect any LLM-powered client app, such as a coding agent, to any supported inference backend/model.
LLM Interactive Proxy
Turn any compatible AI client into a safer, smarter, multi-provider agent platform.
LLM Interactive Proxy is a universal translation, routing, and control layer for modern AI clients. Point OpenAI-compatible apps, Anthropic tools, Gemini integrations, and agentic coding workflows at one local or shared endpoint, then gain routing, failover, built-in security, automated steering, session intelligence, observability, and cross-provider flexibility without rewriting your client.
If your current setup feels fragile, expensive, opaque, or locked to one vendor, this project is designed to change that.
It is a compatibility layer, a security layer, a traffic control plane, a debugging surface, and a workflow improver for serious agentic use.
- Keep your existing clients - Change the endpoint, not the app.
- Mix providers freely - Route across APIs, plans, OAuth accounts, model families, and protocol styles.
- Control agents in production - Add guardrails, rewrites, diagnostics, and policy at the proxy layer.
- Debug with evidence - Inspect exact wire traffic instead of guessing from symptoms.
| Without the proxy | With LLM Interactive Proxy |
|---|---|
| Each client is tied to one provider stack | One endpoint can serve many clients and many backend families |
| Provider switching often means code or config churn | Change routing instead of rewriting integrations |
| Agent safety is scattered across tools | Centralize redaction, tool controls, sandboxing, and command protection |
| Debugging depends on incomplete logs | Inspect exact wire traffic with captures and diagnostics |
| Token costs grow with long sessions | Use intelligent context compression and smarter routing to reduce spend |
| Protocol mismatch blocks experimentation | Use cross-protocol conversion to bridge Anthropic, OpenAI, Gemini, and more |
Table Of Contents
- Quick Start
- What Makes It Different
- Perfect For
- Feature Highlights
- Core Advantages
- Common Use Cases
- Supported Frontend Interfaces
- Supported Backends
- Access Modes
- Architecture
- Documentation Map
At a glance
Beyond basic forwarding, the proxy adds cross-protocol translation, tool safety, routing and failover, session-oriented features (including B2BUA-style handling), byte-precise CBOR captures, and usage tracking. Longer narratives, use-case lists, and feature tours live in the User Guide.
Quick Start
1. Clone and install
git clone https://github.com/matdev83/llm-interactive-proxy.git
cd llm-interactive-proxy
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/macOS
source .venv/bin/activate
python -m pip install -e .[dev]
If you want OAuth-oriented optional connectors, install the oauth extra:
python -m pip install -e .[dev,oauth]
2. Export at least one provider credential
# Example: OpenAI
export OPENAI_API_KEY="your-key-here"
3. Start the proxy
python -m src.core.cli --default-backend openai:gpt-4o
The proxy listens on http://localhost:8000 by default.
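Before wiring up a real client, you may want to confirm the proxy is answering. A minimal sketch using only the standard library, assuming the default address above and the `/v1/models` endpoint listed under Supported Frontend Interfaces (the function name is illustrative, not part of the project):

```python
import urllib.request
import urllib.error

def proxy_is_up(base_url: str = "http://localhost:8000") -> bool:
    """Best-effort check that the proxy answers on its models endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: treat as "not up".
        return False

if __name__ == "__main__":
    print("proxy reachable:", proxy_is_up())
```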
4. Point your client at the proxy instead of the vendor
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8000/v1",
api_key="dummy-key",
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
See the full Quick Start Guide for additional setup, auth, and backend examples.
First user message appender (per session)
Optional once-per-session suffix appended to the first user message of each HTTP chat session:

- Configure via `auto_append_first_prompt_filename` in config (`.txt`/`.md`), the `AUTO_APPEND_FIRST_PROMPT_FILENAME` environment variable, or `--auto-append-first-prompt-filename`.
- The file must exist at startup; its contents are read once into memory (restart to reload).
- At the default log level, startup logs confirm the load, and each session logs once when the suffix is merged.
- The suffix is applied after redaction, on the outbound request only (history stays pre-transform, like redaction).
- Skipped for auxiliary-routed calls.
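The once-per-session behavior described above can be pictured with a small sketch. This is illustrative only, assuming an in-memory suffix cache and per-session tracking; the class and method names are hypothetical, not the proxy's internals:

```python
class FirstPromptAppender:
    """Appends a preloaded suffix to the first user message of each session.

    The suffix file is read once at startup and cached; per-session state
    records whether the suffix has already been merged.
    """

    def __init__(self, suffix_path: str):
        # File must exist at startup; contents are cached in memory,
        # so a restart is required to pick up changes.
        with open(suffix_path, encoding="utf-8") as f:
            self._suffix = f.read()
        self._applied_sessions: set[str] = set()

    def apply(self, session_id: str, redacted_message: str) -> str:
        # Runs after redaction, on the outbound request only.
        if session_id in self._applied_sessions:
            return redacted_message
        self._applied_sessions.add(session_id)
        return f"{redacted_message}\n\n{self._suffix}"
```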
Supported Frontend Interfaces
The proxy exposes standard API surfaces, so existing clients can often work with little or no code change.
- OpenAI Chat Completions - `/v1/chat/completions`
- OpenAI Responses - `/v1/responses`
- OpenAI Models - `/v1/models`
- Anthropic Messages - `/anthropic/v1/messages`
- Dedicated Anthropic server - `http://host:<anthropic_port>/v1/messages` (only when `anthropic_port` / `--anthropic-port` / `ANTHROPIC_PORT` is set; often `8001`)
- Google Gemini v1beta - `/v1beta/models` and `:generateContent`
- Diagnostics endpoint - `/v1/diagnostics`
- Backend reactivation endpoint - `/v1/diagnostics/backends/{backend_instance}/reactivate`
See Frontend API documentation for protocol details and compatibility notes.
Supported Backends
The backend catalog keeps growing. Current documented backends include:
- OpenAI
- Anthropic
- Google Gemini
- OpenRouter
- Nvidia
- ZAI (Zhipu AI)
- Alibaba Qwen
- MiniMax
- InternLM
- ZenMux
- Moonshot AI / Kimi Code
- Hybrid backend
- Cline
- Antigravity OAuth
See the full Backends Overview for configuration and provider-specific notes.
Routing Selector Semantics
- `backend:model` selects an explicit backend family.
- `backend-instance:model` such as `openai.1:gpt-4o` targets a concrete backend instance.
- `model` and `vendor/model` are model-only selectors.
- `vendor/model:variant` remains model-only unless `:` appears before the first `/`.
- URI-style parameters in selectors such as `model?temperature=0.5` are parsed and propagated through routing metadata.
- Explicit-backend configuration and command surfaces such as `--static-route`, replacement targets, and one-off routing require the strict `backend:model` format.
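The selector rules above can be sketched in a few lines. This is a simplified illustration of the documented semantics, not the proxy's actual parser:

```python
from urllib.parse import parse_qsl

def parse_selector(selector: str):
    """Parse a routing selector into (backend, model, params).

    backend is None for model-only selectors. A ':' counts as a backend
    separator only when it appears before the first '/', so
    'vendor/model:variant' stays model-only.
    """
    base, _, query = selector.partition("?")
    params = dict(parse_qsl(query))  # URI-style parameters, e.g. temperature
    colon = base.find(":")
    slash = base.find("/")
    if colon != -1 and (slash == -1 or colon < slash):
        return base[:colon], base[colon + 1:], params
    return None, base, params
```

For example, `parse_selector("openai.1:gpt-4o")` yields a concrete instance target, while `parse_selector("vendor/model:variant")` stays model-only.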
Access Modes
The proxy supports two operational modes with different security assumptions:
- Single User Mode - Default local-development mode with localhost-first behavior and support for OAuth connectors.
- Multi User Mode - Shared or production mode with stronger authentication expectations and tighter connector rules.
Quick examples:
# Single User Mode
python -m src.core.cli
# Multi User Mode
python -m src.core.cli --multi-user-mode --host=0.0.0.0 --api-keys key1,key2
See Access Modes for the security model and deployment guidance.
Architecture
graph TD
subgraph "Clients"
A[OpenAI Client]
B[OpenAI Responses Client]
C[Anthropic Client]
D[Gemini Client]
E[Any LLM App or Agent]
end
subgraph "LLM Interactive Proxy"
FE[Frontend APIs]
Core[Routing Translation Safety Observability]
BE[Backend Connectors]
FE --> Core --> BE
end
subgraph "Providers"
P1[OpenAI]
P2[Anthropic]
P3[Gemini]
P4[OpenRouter]
P5[Other Backends]
end
A --> FE
B --> FE
C --> FE
D --> FE
E --> FE
BE --> P1
BE --> P2
BE --> P3
BE --> P4
BE --> P5
The proxy sits between the client and the provider, which is exactly why it can translate protocols, enforce policy, capture traffic, and route requests without forcing your app to change its calling pattern.
Documentation Map
- Quick Start - Get running fast
- User Guide - End-user documentation and feature catalog
- Configuration Guide - Flags, config, and operational settings
- Frontend Overview - Choose the right API surface
- Backends Overview - Provider setup and switching
- Security Docs - Authentication and key-handling guidance
- Development Guide - Architecture, local development, testing, and contributing
- CHANGELOG - Release history
- CONTRIBUTING - Contribution guidelines
Development
# Run the test suite
python -m pytest
# Lint and auto-fix
python -m ruff check --fix .
# Format
python -m black .
See the Development Guide for architecture, contribution workflow, and extra dev scripts.
Support
GitHub Issues and Discussions.
License
This project is licensed under the GNU AGPL v3.0 or later.