omni-ai-mcp
Health Uyari
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 8 GitHub stars
Code Gecti
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Gecti
- Permissions — No dangerous permissions requested
Bu listing icin henuz AI raporu yok.
Full-featured MCP server for Google Gemini. multiple tools: text generation with thinking mode, code review, web search with citations, RAG document search, native image generation (4K), Veo 3.1 video with audio, text-to-speech with 30 voices and others. Works with Claude Code
omni-ai-mcp
The complete AI bridge for Claude Code — Gemini's exclusive capabilities (video, TTS, 1M context, RAG, Deep Research) plus 400+ models via OpenRouter. One MCP server, every AI model, zero friction.
Why This Exists
Claude is exceptional at reasoning and code generation. But sometimes you need more:
- A second opinion from a different AI model (GPT-4o, Llama, Mistral, Claude via OpenRouter)
- Real-time web search with Google grounding and source citations
- Autonomous deep research that runs for minutes and produces structured reports from 40+ sources
- Video generation with Veo 3.1 — the only MCP server with native audio video generation
- Image generation with Gemini 3 Pro Image (Nano Banana Pro) up to 4K resolution
- Text-to-speech with 30 natural voices and multi-speaker support
- RAG for querying your own documents with citations
- Large codebase analysis with Gemini's 1M token context window
- Multi-turn conversations with cloud persistence (55-day retention, resume from any device)
- Access to 400+ models through one unified interface
omni-ai-mcp bridges Claude Code with Google Gemini and OpenRouter, enabling Claude to orchestrate any AI model as a tool.
What's New in v4.4.0
All model defaults are now aligned with the latest Gemini IDs, verified live against the Gemini API:
- Text Flash →
gemini-3.5-flash· Flash-Lite →gemini-3.1-flash-lite - Image →
gemini-3-pro-image(Nano Banana Pro) /gemini-3.1-flash-image(Nano Banana 2) - TTS →
gemini-3.1-flash-tts-preview - Deep Research →
deep-research-preview-04-2026(fixes the previous404 NOT_FOUNDagent)
Every default stays overridable via the GEMINI_MODEL_* environment variables.
Multi-Provider: Gemini + OpenRouter
Quick start
# Ask any of 400+ models — auto-routes from model name
ask_model("Explain quantum computing", model="openai/gpt-4o")
ask_model("Write a poem", model="meta-llama/llama-3.3-70b-instruct")
ask_model("Review this code", model="gemini-3.1-pro-preview") # auto-routes to Gemini native API
# If no Gemini key but OpenRouter key exists, Gemini models route via OpenRouter automatically
ask_model("Summarize this", model="gemini-3.5-flash") # -> google/ prefix on OpenRouter
# Discover all available models
gemini_list_models()
Dynamic Model Registry
No more hardcoded model IDs. The server discovers available models at runtime and always uses the latest. If a model is deprecated, it automatically falls back to the next available option.
# Override via env vars if needed:
export GEMINI_MODEL_PRO=gemini-3.1-pro-preview
export OPENROUTER_DEFAULT_MODEL=openai/gpt-4o
Smart Routing Rules
- Explicit Gemini model +
GEMINI_API_KEY-> always Gemini native API (fastest, cheapest) - Gemini model + no Gemini key +
OPENROUTER_API_KEY-> OpenRoutergoogle/prefix (automatic fallback) veo-*,imagen-*,deep-research-*models -> Gemini native only (no OpenRouter equivalent)- OpenRouter model (
openai/,meta-llama/, etc.) -> OpenRouter (requiresOPENROUTER_API_KEY)
PyPI Distribution
pip install omni-ai-mcp
omni-ai-mcp-setup # interactive setup wizard
Claude Desktop Extension (.dxt)
Install with one click — no Python setup required:
- Download
omni-ai-mcp-vX.Y.Z.dxtfrom GitHub Releases - Double-click the file (macOS/Windows) or drag it into Claude Desktop
- Enter your Gemini API key when prompted (OpenRouter key is optional)
- Done — all 20 tools are immediately available in Claude Desktop
The .dxt bundle includes all Python dependencies — users don't need to install anything else.
20 Tools
Multi-Provider
| Tool | Description |
|---|---|
ask_model |
Ask any AI: Gemini or 400+ models via OpenRouter — auto-routes from model name |
gemini_list_models |
Live model discovery: Gemini registry + OpenRouter catalog, deprecation warnings |
Text & Reasoning
| Tool | Description | Model |
|---|---|---|
ask_gemini |
Text generation with thinking mode, multi-turn, dual storage (local/cloud) | Gemini 3.1 Pro |
gemini_code_review |
Security, performance, and quality analysis | Gemini 3.1 Pro |
gemini_brainstorm |
Creative ideation with 6 methodologies (SCAMPER, TRIZ, etc.) | Gemini 3.1 Pro |
gemini_challenge |
Devil's advocate — find flaws in ideas, plans, and code | Gemini 3.1 Pro |
Code
| Tool | Description | Model |
|---|---|---|
gemini_analyze_codebase |
Whole-codebase analysis up to 1M tokens / 5MB | Gemini 3.1 Pro |
gemini_generate_code |
Structured code generation with dry-run preview and XML output | Gemini 3.1 Pro |
Research & Web
| Tool | Description | Model |
|---|---|---|
gemini_web_search |
Real-time search with Google grounding & citations | Gemini 3 Flash |
gemini_deep_research |
Autonomous 5-60 min research, 40+ sources, structured report | Deep Research Agent |
RAG
| Tool | Description |
|---|---|
gemini_file_search |
Query documents with citations |
gemini_create_file_store |
Create document stores |
gemini_upload_file |
Upload PDF, DOCX, code, etc. |
gemini_list_file_stores |
List available stores |
Media (Gemini exclusive)
| Tool | Description | Model |
|---|---|---|
gemini_analyze_image |
Vision: describe, OCR, Q&A on images | Gemini 3 Flash |
gemini_generate_image |
Imagen — up to 4K resolution | Gemini 3 Pro Image |
gemini_generate_video |
Veo 3.1 — 4-8s with native audio (dialogue, effects, ambient) | Veo 3.1 |
gemini_text_to_speech |
30 natural voices, multi-speaker dialogue | Gemini 2.5 Flash TTS |
Conversation
| Tool | Description |
|---|---|
gemini_list_conversations |
List history: title, mode, turns, last activity |
gemini_delete_conversation |
Delete by ID or title (partial match) |
Quick Start
Prerequisites
- Python 3.9+
- Claude Code CLI
- Gemini API key — get one free
- (Optional) OpenRouter API key — openrouter.ai for 400+ models
Install from PyPI (Recommended)
pip install omni-ai-mcp
omni-ai-mcp-setup
The setup wizard configures Claude Code automatically.
Install from Source
git clone https://github.com/marmyx77/omni-ai-mcp.git
cd omni-ai-mcp
# Gemini only
./setup.sh YOUR_GEMINI_API_KEY
# Gemini + OpenRouter (400+ models)
./setup.sh YOUR_GEMINI_API_KEY YOUR_OPENROUTER_KEY
Restart Claude Code. Verify:
claude mcp list
# omni-ai-mcp: Connected
Manual Install
pip install 'mcp[cli]>=1.0.0' 'google-genai>=2.0.0' pydantic defusedxml filelock
mkdir -p ~/.claude-mcp-servers/omni-ai-mcp
cp -r app/ run.py pyproject.toml ~/.claude-mcp-servers/omni-ai-mcp/
claude mcp add omni-ai-mcp --scope user \
-e GEMINI_API_KEY=YOUR_KEY \
-e OPENROUTER_API_KEY=YOUR_OR_KEY \
-- python3 ~/.claude-mcp-servers/omni-ai-mcp/run.py
Usage Examples
Multi-Model AI Orchestration
"Ask GPT-4o to review this authentication function"
-> ask_model(model="openai/gpt-4o", prompt="review this auth function...")
"Compare how Gemini and Llama respond to this design question"
-> ask_model(model="gemini-3.1-pro-preview", ...)
-> ask_model(model="meta-llama/llama-3.3-70b-instruct", ...)
"Get a Mistral opinion on this French legal document"
-> ask_model(model="mistralai/mistral-large-2512", ...)
Conversations with Memory
Gemini remembers previous context across calls via continuation_id:
# First turn
"Ask Gemini to analyze @src/auth.py for security issues"
# Returns: continuation_id: abc-123
# Follow-up — Gemini remembers the previous analysis
"Ask Gemini (continuation_id: abc-123) how to fix the SQL injection"
Dual Storage Mode
| Mode | Storage | Retention | Use |
|---|---|---|---|
local (default) |
SQLite | 3h (configurable) | Development, quick chats |
cloud |
Google Interactions API | 55 days | Long projects, cross-device |
# Start a named cloud conversation
"Ask Gemini (mode=cloud, title='Architecture Review'): Analyze my microservices design"
# Returns: continuation_id: int_v1_abc123...
# Resume from any device within 55 days
"Ask Gemini (continuation_id: int_v1_abc123...): What about the database layer?"
Deep Research
Autonomous research agent that runs 5-60 minutes:
"Deep research: Compare AI agent frameworks in 2025 — LangGraph, AutoGen, CrewAI"
The agent will:
- Plan a comprehensive research strategy
- Execute multiple targeted web searches
- Synthesize findings from 40+ sources
- Produce a structured report with citations
Use cases: market research, competitive analysis, technical deep dives, trend analysis, literature reviews.
Codebase Analysis
Leverage Gemini's 1M token context to analyze entire codebases at once:
"Analyze codebase src/**/*.py with focus on security"
"Analyze codebase ['app/', 'tests/'] — find architecture issues"
Analysis types: architecture, security, refactoring, documentation, dependencies, general
@File References
Include file contents directly in prompts:
"Ask Gemini to review @src/auth.py for security issues"
"Brainstorm improvements for @README.md"
"Code review @*.py with focus on performance"
"Analyze codebase @src/**/*.ts"
Supported patterns: @file.py, @src/main.py, @*.py, @src/**/*.ts, @. (directory listing)
Video Generation
"Generate a video of ocean waves at sunset, seagulls flying, sound of waves and wind"
- Duration: 4-8 seconds
- Resolution: 720p or 1080p (1080p requires 8s)
- Native audio: dialogue, sound effects, ambient sounds
- For dialogue: use quotes ("Hello," she said)
- For sounds: describe explicitly (engine roaring, birds chirping)
Image Generation
"Generate an image of a futuristic Tokyo street at night, neon lights reflecting on wet pavement,
cinematic, shot on 35mm lens"
- Resolution: up to 4K with Pro model
- Aspect ratios: 1:1, 16:9, 9:16, 3:2, 4:5, and more
- Use descriptive sentences, not keyword lists
Text-to-Speech
"Convert this article to speech using the Charon voice (informative, neutral)"
30 available voices — Bright: Zephyr, Autonoe / Upbeat: Puck, Laomedeia / Informative: Charon, Rasalgethi / Warm: Sulafat, Vindemiatrix / and 22 more.
Multi-speaker dialogue:
gemini_text_to_speech(
text="Host: Welcome!\nGuest: Thanks for having me!",
speakers=[
{"name": "Host", "voice": "Charon"},
{"name": "Guest", "voice": "Aoede"}
]
)
Image Analysis
"Analyze this screenshot and extract all visible text: @screenshot.png"
"Describe what's in this diagram and explain the architecture: @diagram.png"
Supported formats: PNG, JPG, JPEG, GIF, WEBP
RAG (Document Search)
# 1. Create a store
"Create a Gemini file store called 'project-docs'"
# 2. Upload files
"Upload the API specification PDF to the project-docs store"
# 3. Query
"Search the project-docs store: What are the rate limits?"
Challenge Tool
Get critical analysis before implementing — find flaws early:
"Challenge this plan with focus on security:
We'll store user sessions in localStorage and use MD5 for passwords"
The tool acts as a Devil's Advocate — it will NOT agree with you.
Focus areas: general, security, performance, maintainability, scalability, cost
Code Generation
"Generate a Python FastAPI endpoint for JWT authentication with refresh tokens"
Output is structured XML that Claude can apply directly:
<GENERATED_CODE>
<FILE action="create" path="src/auth.py">
# Complete implementation here...
</FILE>
</GENERATED_CODE>
Options: dry_run=true to preview without writing, language, style (production/prototype/minimal), output_dir
Thinking Mode
"Ask Gemini with high thinking level:
Design an optimal database schema for a social media platform at scale"
Levels: off (default), low (fast reasoning), high (deep analysis)
Model Selection
Text Models
| Alias | Resolved Model | Best For |
|---|---|---|
pro |
gemini-3.1-pro-preview |
Complex reasoning, coding, analysis |
flash |
gemini-3.5-flash |
Balanced speed/quality |
fast / flash-lite |
gemini-3.1-flash-lite |
High-volume, simple tasks |
Models are resolved dynamically at runtime — if a model is deprecated, the registry automatically falls back to the next available option.
OpenRouter Models (via ask_model)
| Provider | Example Model ID | Notes |
|---|---|---|
| OpenAI | openai/gpt-4o |
GPT-4o, o3, o4-mini |
| Meta | meta-llama/llama-3.3-70b-instruct |
Open source, fast |
| Anthropic | anthropic/claude-3.5-sonnet |
Claude via OpenRouter |
| Mistral | mistralai/mistral-large-2512 |
Strong on EU languages |
google/gemini-3.1-pro-preview |
Gemini via OpenRouter (fallback) | |
| 340+ more | — | gemini_list_models() to browse |
Configuration
All settings via environment variables:
| Variable | Default | Description |
|---|---|---|
GEMINI_API_KEY |
required | Google Gemini API key |
OPENROUTER_API_KEY |
— | OpenRouter key (enables ask_model for 400+ models) |
GEMINI_MODEL_PRO |
gemini-3.1-pro-preview |
Override Pro text model |
GEMINI_MODEL_FLASH |
gemini-3.5-flash |
Static fallback model |
GEMINI_MODEL_DEEP_RESEARCH |
deep-research-preview-04-2026 |
Override research agent |
OPENROUTER_DEFAULT_MODEL |
openai/gpt-4o |
Default OpenRouter model |
OPENROUTER_TIMEOUT |
120 |
OpenRouter generation timeout in seconds (raise for search models like perplexity/sonar-deep-research) |
GEMINI_SANDBOX_ROOT |
cwd | Root directory for file access |
GEMINI_SANDBOX_ENABLED |
true |
Enable path sandboxing |
GEMINI_MAX_FILE_SIZE |
102400 |
Max file size in bytes (100KB) |
GEMINI_CONVERSATION_TTL_HOURS |
3 |
Local conversation expiry |
GEMINI_CONVERSATION_MAX_TURNS |
50 |
Max turns per thread |
GEMINI_LOG_DIR |
~/.omni-ai-mcp |
Log & DB directory |
GEMINI_LOG_FORMAT |
text |
json or text |
GEMINI_DISABLED_TOOLS |
— | Comma-separated tool names to disable |
Claude Code Plugin
Slash Commands
Included in .claude/commands/ (auto-available in Claude Code when working inside this repo, or copy to ~/.claude/commands/ for global access):
| Command | Action |
|---|---|
/gemini <prompt> |
Ask Gemini Pro anything |
/gemini-research <topic> |
Autonomous deep research (40+ sources, 5-30 min) |
/gemini-review <file> |
Code review focused on bugs, security, performance |
/gemini-challenge <idea> |
Devil's Advocate — find flaws in a plan or architecture |
/gemini-analyze <path> |
Codebase analysis with 1M token context window |
/gemini-brainstorm <topic> |
Structured brainstorming (6 methodologies) |
/gemini-models |
List available models (Gemini + OpenRouter) |
/ask-model [model] <prompt> |
Ask any model: GPT-4o, Llama, Mistral, Gemini, etc. |
/cowork <task> |
Claude + Gemini working in parallel on the same task |
Subagents
Included in .claude/agents/ — Claude Code activates these automatically based on context:
| Agent | Trigger | Capability |
|---|---|---|
gemini-researcher |
"research X", "investigate Y", "find sources on Z" | Deep Research Agent, 40+ sources |
gemini-analyzer |
"analyze codebase", "security audit", "review architecture" | 1M token context window |
model-orchestrator |
"ask GPT-4o", "compare models", "use Llama for this" | Routes to 400+ models via OpenRouter |
cowork |
"second opinion", "verify with Gemini", "stress test this" | Claude + Gemini parallel analysis with synthesis |
Install globally (all projects)
cp -r .claude/commands/* ~/.claude/commands/
cp -r .claude/agents/* ~/.claude/agents/
Multi-Model Architecture
omni-ai-mcp uses Claude as the orchestrator with other models as tools:
User -> Claude Code
(orchestrates)
omni-ai-mcp tools
+-- ask_model("openai/gpt-4o") -> OpenRouter -> GPT-4o
+-- ask_model("meta-llama/...") -> OpenRouter -> Llama 3
+-- ask_model("gemini-3.1-pro...") -> Gemini API (native)
+-- ask_gemini(...) -> Gemini API -> Gemini Pro
+-- gemini_deep_research(...) -> Gemini API -> Deep Research Agent
+-- gemini_generate_video(...) -> Gemini API -> Veo 3.1
This is different from provider replacement tools like claude-code-router which replace Claude's backend entirely. omni-ai-mcp keeps Claude in control while giving it access to every AI model as a tool.
Architecture
omni-ai-mcp/
+-- app/
| +-- server.py # FastMCP -- 20 @mcp.tool() registrations
| +-- core/ # Config, structured logging, security
| +-- services/
| | +-- gemini.py # Gemini client + generate_with_fallback()
| | +-- model_registry.py # Dynamic model discovery (API + cache)
| | +-- openrouter.py # OpenRouter client (OpenAI-compatible)
| | +-- persistence.py # SQLite conversation storage + index
| +-- tools/ # Tool implementations by domain
| | +-- text/ # ask_gemini, ask_model, models, brainstorm, etc.
| | +-- code/ # analyze_codebase, generate_code
| | +-- media/ # image, video, TTS
| | +-- web/ # web_search, deep_research
| | +-- rag/ # file_store, file_search, upload
| +-- schemas/ # Pydantic v2 validation
| +-- utils/ # @file expansion, token estimation
+-- tests/ # 174+ tests (unit + integration)
+-- .claude/
| +-- commands/ # Slash commands
| +-- agents/ # Subagents
+-- setup.sh # One-command install
+-- manifest.json # DXT extension manifest (Claude Desktop)
+-- pyproject.toml
Security
- Path sandboxing: all file access restricted to
GEMINI_SANDBOX_ROOT - Secrets sanitization: API keys masked in logs (Google, AWS, GitHub, OpenAI, Anthropic, Slack, JWT, Bearer tokens)
- XML sanitization: LLM output sanitized before parsing to prevent injection
- Atomic file writes: automatic backups before any file modification
- Input validation: Pydantic v2 schemas at all tool boundaries
- Rate limiting: via provider-side quotas
Docker Deployment
# Build and run
docker-compose up -d
# With log viewer (port 8080)
docker-compose --profile monitoring up -d
Docker features: non-root user, read-only filesystem with tmpfs, health check every 30s, resource limits (2 CPU, 2GB RAM), log rotation.
Troubleshooting
MCP not connecting
claude mcp list # check registration
claude mcp remove omni-ai-mcp
./setup.sh YOUR_API_KEY # re-register
# Restart Claude Code
Import or syntax errors
python3 -c "from app.core.config import config; print(f'v{config.version}')"
python3 -m pytest tests/unit/ -q
Video / Image generation timeouts
- Video generation can take 1-6 minutes — this is normal
- Large images (4K) take longer
- Check your Gemini API quota at AI Studio
OpenRouter errors
- Verify
OPENROUTER_API_KEYis set and has credit - Check the model ID with
gemini_list_models(include_openrouter=True) - Use the exact model ID from the list
API key update
claude mcp remove omni-ai-mcp
claude mcp add omni-ai-mcp --scope user \
-e GEMINI_API_KEY=NEW_KEY \
-e OPENROUTER_API_KEY=NEW_OR_KEY \
-- python3 ~/.claude-mcp-servers/omni-ai-mcp/run.py
API Costs
| Feature | Notes |
|---|---|
| Text generation | Free tier available · $0.075-0.30 per 1M tokens |
| Web Search | ~$14 per 1000 queries |
| File Search indexing | $0.15 per 1M tokens (one-time) |
| Image generation | Varies by resolution and model |
| Video generation | Varies by duration/resolution |
| Text-to-speech | Varies by character count |
| OpenRouter | Per-model pricing — see openrouter.ai/models |
See Google AI pricing for Gemini rates.
Development
# Run tests
python -m pytest tests/unit/ -v
python -m pytest tests/integration/ -v # requires GEMINI_API_KEY
# Quick import check
python -c "from app.core.config import config; print(f'v{config.version}')"
# Reinstall after changes
rsync -a app/ ~/.claude-mcp-servers/omni-ai-mcp/app/
# Restart Claude Code
Adding a New Tool
- Create
app/tools/<domain>/my_tool.pywith@tool(name="...", description="...", input_schema=...) - Import in
app/tools/<domain>/__init__.py - Add Pydantic schema in
app/schemas/inputs.py - Write tests in
tests/unit/
See CLAUDE.md for the full development guide.
Changelog
v4.4.0
- Updated all model defaults to the latest Gemini IDs (verified live): flash →
gemini-3.5-flash, flash-lite →gemini-3.1-flash-lite, image →gemini-3-pro-image/gemini-3.1-flash-image, TTS →gemini-3.1-flash-tts-preview - Fixed deep research:
deep-research-pro-previewreturned404 NOT_FOUND→ nowdeep-research-preview-04-2026 - Registry fallbacks refreshed (
imagen-3.0→imagen-4.0, addedveo-3.1-lite)
v4.2.0
/gemini-illustrateand/gemini-imagecommands; richergemini_generate_imagetool description
v4.1.0
gemini_generate_imageimage editing (input_images);gemini_analyze_imagemulti-image +media_resolution
v4.0.1
- Fixed routing: Gemini models always use native API when key available (even if
provider=openrouter) - Added OpenRouter fallback for Gemini text models when no Gemini key (
google/prefix) - Fixed Python 3.11 f-string syntax in challenge tool
- Fixed stale unit test imports (103 -> 174 passing tests)
- Fixed model registry: corrected flash model names (
gemini-3-flash-preview,gemini-3.1-flash-lite-preview)
v4.0.0
ask_model: 400+ models via OpenRouter — auto-routes from model namegemini_list_models: live model discovery with deprecation warnings- Dynamic model registry: no more hardcoded model IDs
- PyPI distribution:
pip install omni-ai-mcp - Claude Code plugin: slash commands + subagents
- GitHub Actions CI/CD with Trusted Publishing
v3.3.0
- Dual storage mode for
ask_gemini: local (SQLite) or cloud (Interactions API, 55-day retention) gemini_list_conversations,gemini_delete_conversation- Cross-platform file locking
v3.2.0
gemini_deep_research: autonomous multi-step research (5-60 min, 40+ sources)
v3.0.0
- FastMCP migration, SQLite persistence, security hardening
Contributing
Contributions are welcome! Open an issue or pull request on GitHub.
License
MIT — see LICENSE
Yorumlar (0)
Yorum birakmak icin giris yap.
Yorum birakSonuc bulunamadi