# Open Deep Research

Read this in other languages: 🇨🇳 中文 · 🇫🇷 Français · 🇪🇸 Español

An open replication of OpenAI's Deep Research with a modern web UI, adapted from HuggingFace smolagents with simplified configuration for easy self-hosting.
Read more about the original implementation in the HuggingFace blog post.
This agent achieves 55% pass@1 on the GAIA validation set, compared to 67% for OpenAI's Deep Research.
## Features

- **Parallel background research**: fire off multiple research tasks simultaneously, monitor them independently, and come back to results later, even after closing the browser
- **Multi-agent research pipeline**: manager + search sub-agents with real-time streaming output
- **Modern Web UI**: Preact-based SPA with collapsible sections, model selector, and copy support
- **Flexible model support**: OpenAI, Anthropic, DeepSeek, Ollama, and any OpenAI-compatible provider
- **Multiple search engines**: DuckDuckGo (free), SerpAPI, MetaSo with automatic fallback
- **Session history**: SQLite-backed session storage with replay support
- **Three run modes**: Live (real-time), Background (persistent), Auto-kill (one-shot)
- **Model auto-discovery**: detects available models from configured providers
- **Vision & media tools**: image QA, PDF analysis, audio transcription, YouTube transcripts
- **Production-ready**: Docker, Gunicorn, multi-worker, health checks, configurable via JSON
Screenshots:

- Clean input interface with model selection
- Real-time display of agent reasoning, tool calls, and observations
- Highlighted final answer with collapsible sections
## Parallel Background Research

Deep research tasks are slow: a single run can take 10–30 minutes. Most tools block the UI until the task completes, forcing you to wait.

This project takes a different approach: fire off as many research tasks as you want and let them run in the background, simultaneously.
```
┌─────────────────────────────────────────────────────┐
│ Question A: "What are the latest advances in LLMs?" │ ⏳ running
│ Question B: "Compare top vector databases in 2025"  │ ⏳ running
│ Question C: "EU AI Act compliance checklist"        │ ✅ completed
└─────────────────────────────────────────────────────┘
```

All visible in the sidebar. Click any to inspect.
How it works:
- Select Background or Auto-kill run mode (the default)
- Submit your first research question; the agent starts immediately in a subprocess
- The UI is not locked: submit a second question, a third, as many as you need
- Each agent runs independently, persisting all its reasoning steps and results to SQLite
- Use the sidebar to switch between running sessions in real-time
- Close the browser: in Background mode, agents keep running on the server
- Return later and click any session to replay the full research trace
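The subprocess-plus-SQLite pattern above can be sketched roughly as follows. This is an illustrative outline only, not the project's actual code: the function names, the `events` table schema, and the `--session-id` flag are assumptions made for the example.

```python
import sqlite3
import subprocess
import sys
import uuid

DB = "sessions.db"

def init_db() -> None:
    # One row per agent event; replaying a session is an ordered SELECT.
    with sqlite3.connect(DB) as con:
        con.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(session_id TEXT, seq INTEGER, payload TEXT)"
        )

def start_background_task(question: str) -> str:
    # Each research run is an independent, detached subprocess, so the
    # web UI never blocks and the run survives browser disconnects.
    session_id = uuid.uuid4().hex
    subprocess.Popen(
        [sys.executable, "run.py", "--session-id", session_id, question],
        stdout=subprocess.DEVNULL,
        start_new_session=True,
    )
    return session_id

def record_event(session_id: str, seq: int, payload: str) -> None:
    with sqlite3.connect(DB) as con:
        con.execute("INSERT INTO events VALUES (?, ?, ?)",
                    (session_id, seq, payload))

def replay(session_id: str) -> list[str]:
    # Reopening a session later just replays its persisted trace.
    with sqlite3.connect(DB) as con:
        rows = con.execute(
            "SELECT payload FROM events WHERE session_id = ? ORDER BY seq",
            (session_id,),
        ).fetchall()
    return [payload for (payload,) in rows]
```

Because every step is written to SQLite rather than held in server memory, "close the browser and come back later" falls out of the design for free.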
Run mode comparison:
| Mode | Multiple at once | Survives browser close | UI locked |
|---|---|---|---|
| Background | ✅ | ✅ | ❌ |
| Auto-kill | ✅ | ❌ (killed on tab close) | ❌ |
| Live | ❌ | ❌ | ✅ |
This is particularly useful for:
- Batch research workflows where you queue several related questions and review results together
- Long-running queries where you don't want to keep a tab open
- Teams sharing a self-hosted instance with multiple concurrent users
## Why This Project?
There are several open-source Deep Research alternatives. Here's how this project compares:
| Feature | This project | nickscamara/open-deep-research | gpt-researcher | langchain/open_deep_research | smolagents |
|---|---|---|---|---|---|
| Docker / one-command deploy | ✅ Pre-built image on GHCR | ✅ Dockerfile | ✅ Docker Compose | ❌ Manual | ❌ Library only |
| No-build frontend | ✅ Preact + htm (no build step) | ❌ Next.js build required | ❌ Next.js build required | ❌ LangGraph Studio | ❌ |
| Free search out of the box | ✅ DuckDuckGo (no key needed) | ❌ Firecrawl API required | ⚠️ Key recommended | ⚠️ Configurable | ✅ |
| Model agnostic | ✅ OpenAI, Anthropic, DeepSeek, Ollama | ✅ AI SDK providers | ✅ Multiple providers | ✅ Configurable | ✅ |
| Local model support | ✅ Ollama, LM Studio | ⚠️ Limited | ✅ Ollama/Groq | ✅ | ✅ |
| Parallel background tasks | ✅ Multiple simultaneous runs | ❌ | ❌ | ❌ | ❌ |
| Session history / replay | ✅ SQLite-backed | ❌ | ❌ | ❌ | ❌ |
| Streaming UI | ✅ SSE, 3 run modes | ✅ Real-time activity | ✅ WebSocket | ✅ Type-safe stream | ❌ |
| Vision / image analysis | ✅ PDF screenshots, visual QA | ❌ | ⚠️ Limited | ❌ | ⚠️ |
| Audio / YouTube | ✅ Transcription, speech | ❌ | ❌ | ❌ | ❌ |
| GAIA benchmark score | 55% pass@1 | – | – | – | 55% (original) |
Key advantages of this project:

- **Parallel background research**: the most distinctive feature in this space. Start multiple deep research tasks at the same time; each runs as an independent subprocess, persists all events to SQLite, and can be monitored or replayed independently. Close the browser, come back hours later, and your results are waiting. No other open-source deep research tool supports this workflow.
- **Single `docker run` deployment**: the pre-built image on GHCR works on any platform with Docker: Linux, macOS, Windows, ARM, cloud VMs, Raspberry Pi.
- **No build step**: the frontend uses Preact with `htm` template literals. No Node.js, no `npm install`, no webpack. Just open the browser.
- **Free by default**: DuckDuckGo search requires no API key, so the agent works immediately after adding just one model API key.
- **Broader media support**: handles PDFs, images, audio files, and YouTube transcripts that other projects leave to the user.
## Quick Start

1. Clone the repository

```bash
git clone https://github.com/S2thend/open-deep-research-with-ui.git
cd open-deep-research-with-ui
```
2. Install system dependencies

The project requires FFmpeg for audio processing.

- macOS: `brew install ffmpeg`
- Linux: `sudo apt-get install ffmpeg`
- Windows: `choco install ffmpeg` or download from ffmpeg.org

Verify: `ffmpeg -version`
3. Install Python dependencies

```bash
python3 -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -e .
```
4. Configure

Copy the example config and add your API keys:

```bash
cp odr-config.example.json odr-config.json
```

Edit `odr-config.json` to set your model provider and API keys (see Configuration below).
5. Run

```bash
# Web UI (recommended)
python web_app.py
# Open http://localhost:5080

# CLI
python run.py --model-id "gpt-4o" "Your research question here"
```
## Configuration

Configuration is managed via `odr-config.json` (preferred) or environment variables.

### odr-config.json

Copy `odr-config.example.json` to `odr-config.json` and customize:
```json
{
  "model": {
    "providers": [
      {
        "name": "openai",
        "api_key": "sk-...",
        "models": ["gpt-4o", "o1", "o3-mini"]
      }
    ],
    "default": "gpt-4o"
  },
  "search": {
    "providers": [
      { "name": "DDGS" },
      { "name": "META_SOTA", "api_key": "your_key" }
    ]
  }
}
```
The UI includes a built-in settings panel for client-side configuration. Server-side config is optionally protected by an admin password.
### Environment variables

For Docker or environments where a config file isn't convenient, you can use `.env`:

```bash
cp .env.example .env
```
| Variable | Description |
|---|---|
| `ENABLE_CONFIG_UI` | Enable admin config UI via web (`false` by default) |
| `CONFIG_ADMIN_PASSWORD` | Password for server-side config changes |
| `META_SOTA_API_KEY` | API key for MetaSo search |
| `SERPAPI_API_KEY` | API key for SerpAPI search |
| `DEBUG` | Enable debug logging (`False` by default) |
| `LOG_LEVEL` | Log verbosity (`INFO` by default) |
> [!NOTE]
> API keys set in `odr-config.json` take precedence over environment variables.
## Supported Models
Supports OpenAI, Anthropic, DeepSeek, Ollama, and any OpenAI-compatible provider. Model routing is automatic based on model name prefix. Examples:
```bash
python run.py --model-id "gpt-4o" "Your question"
python run.py --model-id "o1" "Your question"
python run.py --model-id "claude-sonnet-4-6" "Your question"
python run.py --model-id "deepseek/deepseek-chat" "Your question"
python run.py --model-id "ollama/mistral" "Your question"   # local model
```
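Prefix-based routing amounts to something like the sketch below. This is a hypothetical illustration, not the actual dispatch code in run.py:

```python
def route_provider(model_id: str) -> str:
    # Dispatch on the model-name prefix, as described above.
    if model_id.startswith("ollama/"):
        return "ollama"      # local models served by Ollama
    if model_id.startswith("deepseek/"):
        return "deepseek"
    if model_id.startswith("claude"):
        return "anthropic"
    return "openai"          # gpt-*, o1, o3-mini, and compatible providers
```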
> [!WARNING]
> The `o1` model requires OpenAI tier-3 API access: https://help.openai.com/en/articles/10362446-api-access-to-o1-and-o3-mini
## Search Engines
| Engine | Key Required | Notes |
|---|---|---|
| `DDGS` | No | Default, free DuckDuckGo |
| `META_SOTA` | Yes | MetaSo, often better for Chinese queries |
| `SERPAPI` | Yes | Google via SerpAPI |
Multiple engines can be configured with automatic fallback: the agent tries them in order.
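The fallback behavior amounts to a simple in-order loop. A minimal sketch, assuming engines are plain callables (the project's real engine interface will differ):

```python
def search_with_fallback(query: str, engines: list) -> list:
    # Try each configured engine in order; move on when one fails or
    # returns nothing, and surface the last error if all of them fail.
    last_error = None
    for engine in engines:
        try:
            results = engine(query)
            if results:
                return results
        except Exception as exc:  # rate limit, network error, bad key...
            last_error = exc
    raise RuntimeError(f"all search engines failed: {last_error}")
```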
## Usage

### Web UI

```bash
python web_app.py
# or with custom host/port:
python web_app.py --port 8000 --host 0.0.0.0
```

Open http://localhost:5080 in your browser.
Run modes (available via the split-button in the UI):
| Mode | Behavior |
|---|---|
| Live | Stream output in real-time; session ends on disconnect |
| Background | Agent runs persistently; reconnect anytime to view results |
| Auto-kill | Agent runs, session is cleaned up after completion |
### CLI

```bash
python run.py --model-id "gpt-4o" "What are the latest advances in quantum computing?"
```
### GAIA Benchmark

```bash
# Requires HF_TOKEN for dataset download
python run_gaia.py --model-id "o1" --run-name my-run
```
## Deployment

### Docker (Recommended)
Pre-built images are available on GitHub Container Registry:
```bash
docker pull ghcr.io/s2thend/open-deep-research-with-ui:latest

docker run -d \
  --env-file .env \
  -v ./odr-config.json:/app/odr-config.json \
  -p 5080:5080 \
  --name open-deep-research \
  ghcr.io/s2thend/open-deep-research-with-ui:latest
```
Docker Compose (includes volume for downloaded files):

```bash
cp .env.example .env                        # configure API keys
cp odr-config.example.json odr-config.json  # configure models
docker-compose up -d
docker-compose logs -f   # follow logs
docker-compose down      # stop
```
Build your own image:

```bash
docker build -t open-deep-research .
docker run -d --env-file .env -p 5080:5080 open-deep-research
```
> [!WARNING]
> Never commit `.env` or `odr-config.json` with real API keys to git. Always pass secrets at runtime.
### Gunicorn (Production)

```bash
pip install -e .
gunicorn -c gunicorn.conf.py web_app:app
```

The included `gunicorn.conf.py` is pre-configured with:
- Multi-worker process management
- 300s timeout for long-running agent tasks
- Proper logging and error handling
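A `gunicorn.conf.py` along those lines might look like the sketch below. The values here are illustrative; check the file shipped in the repository for the actual settings:

```python
import multiprocessing

bind = "0.0.0.0:5080"
workers = multiprocessing.cpu_count() * 2 + 1  # multi-worker process management
timeout = 300                                  # long-running agent tasks
accesslog = "-"                                # log to stdout (Docker-friendly)
errorlog = "-"
loglevel = "info"
```

The generous `timeout` matters most here: a default 30-second worker timeout would kill an agent mid-research.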
## Architecture

### Agent Pipeline
```
User Question
      │
      ▼
Manager Agent (CodeAgent / ToolCallingAgent)
      │  Plans multi-step research strategy
      ├──▶ Search Sub-Agent × N
      │      │  Web search → browse → extract
      │      └──▶ Tools: DuckDuckGo/SerpAPI/MetaSo, VisitWebpage,
      │            TextInspector, VisualQA, YoutubeTranscript
      │
      └──▶ Final Answer synthesis
```
### Streaming Pipeline

```
run.py      (step_callbacks → JSON-lines on stdout)
      │
      ▼
web_app.py  (subprocess → Server-Sent Events)
      │
      ▼
Browser     (Preact components → DOM)
```
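The bridge between the two processes boils down to a one-line-in, one-event-out transform: run.py prints one JSON object per line, and web_app.py reframes each line as a Server-Sent Event. A sketch of that transform (the helper name and record fields are assumptions, not the exact code):

```python
import json

def jsonl_to_sse(line: str) -> str:
    # One JSON-lines record from the subprocess's stdout becomes one
    # SSE frame, keyed by the step type so the browser can dispatch it.
    record = json.loads(line)
    event = record.get("type", "message")  # e.g. planning_step, action_step
    return f"event: {event}\ndata: {json.dumps(record)}\n\n"
```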
SSE event types:

| Event | Description |
|---|---|
| `planning_step` | Agent reasoning and plan |
| `code_running` | Code being executed |
| `action_step` | Tool call + observation |
| `final_answer` | Completed research result |
| `error` | Error with details |
### DOM Hierarchy

```
#output
├── step-container.plan-step   (manager plan)
├── step-container             (manager step)
│   └── step-children
│       ├── model-output       (reasoning)
│       ├── Agent Call         (code, collapsed)
│       └── sub-agent-container
│           ├── step-container.plan-step   (sub-agent plan)
│           ├── step-container             (sub-agent steps)
│           └── sub-agent-result           (preview + collapsible)
└── final_answer               (prominent result block)
```
## Reproducibility (GAIA Results)

The 55% pass@1 result on GAIA was obtained with augmented data:

- Single-page PDFs and XLS files were opened and screenshotted as `.png`
- The file loader checks for a `.png` version of each attachment and prefers it
The augmented dataset is available at smolagents/GAIA-annotated (access granted instantly on request).
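That loader preference reduces to a one-line check. Sketched here for illustration only; the function name is made up and the real loader may behave differently:

```python
from pathlib import Path

def resolve_attachment(path: str) -> str:
    # Prefer a pre-rendered .png screenshot of the attachment if one
    # exists next to the original PDF/XLS file.
    png = Path(path).with_suffix(".png")
    return str(png) if png.exists() else path
```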
## Development

```bash
pip install -e ".[dev]"   # includes testing, linting, type checking tools
python web_app.py         # starts dev server with auto-reload
```

The frontend is a dependency-free Preact app using `htm` for JSX-like templates; no build step required. Edit files in `static/js/components/` and refresh.
## License

Licensed under the Apache License 2.0, the same license as smolagents.
See LICENSE for details.
Acknowledgments:
- Original research agent implementation by HuggingFace smolagents
- Web UI, session management, streaming architecture, and configuration system added in this fork