open-deep-research-with-web-ui


πŸ” AI-powered deep research agent with Web UI. Built on smolagents, featuring process-based architecture, MetaSo/DuckDuckGo search, and Docker deployment. Apache 2.0 licensed.


Open Deep Research


Read this in other languages: πŸ‡¨πŸ‡³ δΈ­ζ–‡ Β· πŸ‡«πŸ‡· FranΓ§ais Β· πŸ‡ͺπŸ‡Έ EspaΓ±ol

An open replication of OpenAI's Deep Research with a modern web UI β€” adapted from HuggingFace smolagents with simplified configuration for easy self-hosting.

Read more about the original implementation in the HuggingFace blog post.

This agent achieves 55% pass@1 on the GAIA validation set, compared to 67% for OpenAI's Deep Research.


Features

  • Parallel background research β€” fire off multiple research tasks simultaneously, monitor them independently, and come back to results later β€” even after closing the browser
  • Multi-agent research pipeline β€” Manager + search sub-agents with real-time streaming output
  • Modern Web UI β€” Preact-based SPA with collapsible sections, model selector, and copy support
  • Flexible model support β€” OpenAI, Anthropic, DeepSeek, Ollama, and any OpenAI-compatible provider
  • Multiple search engines β€” DuckDuckGo (free), SerpAPI, MetaSo with automatic fallback
  • Session history β€” SQLite-backed session storage with replay support
  • Three run modes β€” Live (real-time), Background (persistent), Auto-kill (one-shot)
  • Model auto-discovery β€” Detects available models from configured providers
  • Vision & media tools β€” Image QA, PDF analysis, audio transcription, YouTube transcripts
  • Production-ready β€” Docker, Gunicorn, multi-worker, health checks, configurable via JSON

Screenshots:

Web UI Input

Clean input interface with model selection

Agent Plans and Tools

Real-time display of agent reasoning, tool calls, and observations

Final Results

Highlighted final answer with collapsible sections


Parallel Background Research

Deep research tasks are slow β€” a single run can take 10–30 minutes. Most tools block the UI until the task completes, forcing you to wait.

This project takes a different approach: fire off as many research tasks as you want and let them run in the background β€” simultaneously.

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Question A: "What are the latest advances in LLMs?" β”‚  ← running
β”‚  Question B: "Compare top vector databases in 2025"  β”‚  ← running
β”‚  Question C: "EU AI Act compliance checklist"        β”‚  ← completed βœ“
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        All visible in the sidebar. Click any to inspect.

How it works:

  1. Select the Background or Auto-kill run mode (Auto-kill is the default)
  2. Submit your first research question β€” the agent starts immediately in a subprocess
  3. The UI is not locked β€” submit a second question, a third, as many as you need
  4. Each agent runs independently, persisting all its reasoning steps and results to SQLite
  5. Use the sidebar to switch between running sessions in real-time
  6. Close the browser β€” in Background mode, agents keep running on the server
  7. Return later and click any session to replay the full research trace
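The persist-and-replay pattern behind steps 4 and 7 can be sketched in a few lines. This is a minimal illustration, not the project's actual schema or API: each session appends its events to SQLite as JSON so the full trace can be replayed after a disconnect.

```python
import json
import sqlite3
import uuid

def init_db(path=":memory:"):
    # One row per event, ordered by a per-session sequence number.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "  session_id TEXT, seq INTEGER, payload TEXT)"
    )
    return db

def persist_event(db, session_id, seq, event):
    # Events are stored as JSON so the UI can replay them verbatim later.
    db.execute(
        "INSERT INTO events VALUES (?, ?, ?)",
        (session_id, seq, json.dumps(event)),
    )
    db.commit()

def replay(db, session_id):
    rows = db.execute(
        "SELECT payload FROM events WHERE session_id = ? ORDER BY seq",
        (session_id,),
    )
    return [json.loads(p) for (p,) in rows]

db = init_db()
sid = str(uuid.uuid4())
persist_event(db, sid, 0, {"type": "planning_step", "text": "search first"})
persist_event(db, sid, 1, {"type": "final_answer", "text": "42"})
trace = replay(db, sid)
```

Because every event lives in the database rather than in the HTTP connection, the browser can disconnect at any point without losing work.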

Run mode comparison:

| Mode | Multiple at once | Survives browser close | UI locked |
|------|------------------|------------------------|-----------|
| Background | ✅ | ✅ | ✗ |
| Auto-kill | ✅ | ✗ (killed on tab close) | ✗ |
| Live | ✗ | ✗ | ✅ |

This is particularly useful for:

  • Batch research workflows where you queue several related questions and review results together
  • Long-running queries where you don't want to keep a tab open
  • Teams sharing a self-hosted instance with multiple concurrent users

Why This Project?

There are several open-source Deep Research alternatives. Here's how this project compares:

| Feature | This project | nickscamara/open-deep-research | gpt-researcher | langchain/open_deep_research | smolagents |
|---------|--------------|--------------------------------|----------------|------------------------------|------------|
| Docker / one-command deploy | ✅ Pre-built image on GHCR | ✅ Dockerfile | ✅ Docker Compose | ❌ Manual | ❌ Library only |
| No-build frontend | ✅ Preact + htm (no build step) | ❌ Next.js build required | ❌ Next.js build required | ❌ LangGraph Studio | — |
| Free search out of the box | ✅ DuckDuckGo (no key needed) | ❌ Firecrawl API required | ⚠️ Key recommended | ⚠️ Configurable | ✅ |
| Model agnostic | ✅ OpenAI, Anthropic, DeepSeek, Ollama | ✅ AI SDK providers | ✅ Multiple providers | ✅ Configurable | ✅ |
| Local model support | ✅ Ollama, LM Studio | ⚠️ Limited | ✅ Ollama/Groq | ✅ | ✅ |
| Parallel background tasks | ✅ Multiple simultaneous runs | ❌ | ❌ | ❌ | ❌ |
| Session history / replay | ✅ SQLite-backed | ❌ | ❌ | ❌ | ❌ |
| Streaming UI | ✅ SSE, 3 run modes | ✅ Real-time activity | ✅ WebSocket | ✅ Type-safe stream | ❌ |
| Vision / image analysis | ✅ PDF screenshots, visual QA | ❌ | ⚠️ Limited | ❌ | ⚠️ |
| Audio / YouTube | ✅ Transcription, speech | ❌ | ❌ | ❌ | ❌ |
| GAIA benchmark score | 55% pass@1 | — | — | — | 55% (original) |

Key advantages of this project

  • Parallel background research β€” the most unique feature in this space. Start multiple deep research tasks at the same time β€” each runs as an independent subprocess, persists all events to SQLite, and can be monitored or replayed independently. Close the browser, come back hours later, and your results are waiting. No other open-source deep research tool supports this workflow.
  • Single docker run deployment β€” pre-built image on GHCR works on any platform with Docker: Linux, macOS, Windows, ARM, cloud VMs, Raspberry Pi.
  • No build step β€” the frontend uses Preact with htm template literals. No Node.js, no npm install, no webpack. Just open the browser.
  • Free by default β€” DuckDuckGo search requires no API key, so the agent works immediately after adding just one model API key.
  • Broader media support β€” handles PDFs, images, audio files, and YouTube transcripts that other projects leave to the user.

Quick Start

1. Clone the repository

git clone https://github.com/S2thend/open-deep-research-with-ui.git
cd open-deep-research-with-ui

2. Install system dependencies

The project requires FFmpeg for audio processing.

  • macOS: brew install ffmpeg
  • Linux: sudo apt-get install ffmpeg
  • Windows: choco install ffmpeg or download from ffmpeg.org

Verify: ffmpeg -version

3. Install Python dependencies

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .

4. Configure

Copy the example config and add your API keys:

cp odr-config.example.json odr-config.json

Edit odr-config.json to set your model provider and API keys (see Configuration below).

5. Run

# Web UI (recommended)
python web_app.py
# Open http://localhost:5080

# CLI
python run.py --model-id "gpt-4o" "Your research question here"

Configuration

Configuration is managed via odr-config.json (preferred) or environment variables.

odr-config.json

Copy odr-config.example.json to odr-config.json and customize:

{
  "model": {
    "providers": [
      {
        "name": "openai",
        "api_key": "sk-...",
        "models": ["gpt-4o", "o1", "o3-mini"]
      }
    ],
    "default": "gpt-4o"
  },
  "search": {
    "providers": [
      { "name": "DDGS" },
      { "name": "META_SOTA", "api_key": "your_key" }
    ]
  }
}

The UI includes a built-in settings panel for client-side configuration. Server-side config is optionally protected by an admin password.

Environment variables

For Docker or environments where a config file isn't convenient, you can use .env:

cp .env.example .env
Variable Description
ENABLE_CONFIG_UI Enable admin config UI via web (false by default)
CONFIG_ADMIN_PASSWORD Password for server-side config changes
META_SOTA_API_KEY API key for MetaSo search
SERPAPI_API_KEY API key for SerpAPI search
DEBUG Enable debug logging (False by default)
LOG_LEVEL Log verbosity (INFO by default)

[!NOTE]
API keys set in odr-config.json take precedence over environment variables.
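The precedence rule above amounts to "check the config file first, then fall back to the environment." A minimal sketch, assuming a hypothetical `resolve_api_key` helper (not the project's actual internals):

```python
import os

def resolve_api_key(name, config):
    # odr-config.json entries take precedence over environment variables.
    for provider in config.get("search", {}).get("providers", []):
        if provider.get("name") == name and provider.get("api_key"):
            return provider["api_key"]
    return os.environ.get(f"{name}_API_KEY")

config = {"search": {"providers": [{"name": "META_SOTA", "api_key": "from-config"}]}}
os.environ["META_SOTA_API_KEY"] = "from-env"
os.environ["SERPAPI_API_KEY"] = "from-env"

metaso_key = resolve_api_key("META_SOTA", config)   # config file wins
serp_key = resolve_api_key("SERPAPI", config)       # no config entry: env wins
```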

Supported Models

Supports OpenAI, Anthropic, DeepSeek, Ollama, and any OpenAI-compatible provider. Model routing is automatic based on model name prefix. Examples:

python run.py --model-id "gpt-4o" "Your question"
python run.py --model-id "o1" "Your question"
python run.py --model-id "claude-sonnet-4-6" "Your question"
python run.py --model-id "deepseek/deepseek-chat" "Your question"
python run.py --model-id "ollama/mistral" "Your question"  # local model
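Conceptually, the prefix-based routing works like the sketch below. The dispatch function is illustrative (the provider names mirror the README's examples, but this is not the project's actual routing code):

```python
def route_model(model_id: str) -> str:
    # Dispatch on the model-id prefix, matching the examples above.
    if model_id.startswith("ollama/"):
        return "ollama"
    if model_id.startswith("deepseek/"):
        return "deepseek"
    if model_id.startswith("claude"):
        return "anthropic"
    return "openai"  # gpt-4o, o1, and other OpenAI-compatible ids

provider = route_model("ollama/mistral")
```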

[!WARNING]
The o1 model requires OpenAI tier-3 API access: https://help.openai.com/en/articles/10362446-api-access-to-o1-and-o3-mini

Search Engines

| Engine | Key required | Notes |
|--------|--------------|-------|
| DDGS | No | Default; free DuckDuckGo search |
| META_SOTA | Yes | MetaSo; often better for Chinese queries |
| SERPAPI | Yes | Google results via SerpAPI |

Multiple engines can be configured with automatic fallback β€” the agent tries them in order.
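The "tries them in order" behavior can be sketched as a simple fallback loop. The engine callables here are stand-ins, not the project's real search classes:

```python
def search_with_fallback(query, engines):
    # Try each (name, callable) pair in order; a failure triggers the next.
    errors = []
    for name, engine in engines:
        try:
            return name, engine(query)
        except Exception as exc:
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all search engines failed: {errors}")

def flaky_engine(query):
    raise ConnectionError("rate limited")

def ddgs_engine(query):
    return [{"title": "result", "url": "https://example.com"}]

used, results = search_with_fallback(
    "test query", [("META_SOTA", flaky_engine), ("DDGS", ddgs_engine)]
)
```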


Usage

Web UI

python web_app.py
# or with custom host/port:
python web_app.py --port 8000 --host 0.0.0.0

Open http://localhost:5080 in your browser.

Run modes (available via the split-button in the UI):

| Mode | Behavior |
|------|----------|
| Live | Stream output in real time; session ends on disconnect |
| Background | Agent runs persistently; reconnect anytime to view results |
| Auto-kill | Agent runs; session is cleaned up after completion |

CLI

python run.py --model-id "gpt-4o" "What are the latest advances in quantum computing?"

GAIA Benchmark

# Requires HF_TOKEN for dataset download
python run_gaia.py --model-id "o1" --run-name my-run

Deployment

Docker (Recommended)

Pre-built images are available on GitHub Container Registry:

docker pull ghcr.io/s2thend/open-deep-research-with-ui:latest

docker run -d \
  --env-file .env \
  -v ./odr-config.json:/app/odr-config.json \
  -p 5080:5080 \
  --name open-deep-research \
  ghcr.io/s2thend/open-deep-research-with-ui:latest

Docker Compose (includes volume for downloaded files):

cp .env.example .env        # configure API keys
cp odr-config.example.json odr-config.json  # configure models
docker-compose up -d
docker-compose logs -f      # follow logs
docker-compose down         # stop

Build your own image:

docker build -t open-deep-research .
docker run -d --env-file .env -p 5080:5080 open-deep-research

[!WARNING]
Never commit .env or odr-config.json with real API keys to git. Always pass secrets at runtime.

Gunicorn (Production)

pip install -e .
gunicorn -c gunicorn.conf.py web_app:app

The included gunicorn.conf.py is pre-configured with:

  • Multi-worker process management
  • 300s timeout for long-running agent tasks
  • Proper logging and error handling
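A `gunicorn.conf.py` along those lines might look like the following. The 300s timeout matches the README; the worker formula and log settings are illustrative defaults, not necessarily the file shipped with the repo:

```python
import multiprocessing

# Multi-worker process management: a common sizing heuristic.
workers = multiprocessing.cpu_count() * 2 + 1

# Long-running agent tasks need a generous timeout (300s per the README).
timeout = 300

# Log access and errors to stdout/stderr for container-friendly logging.
accesslog = "-"
errorlog = "-"
loglevel = "info"
```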

Architecture

Agent Pipeline

User Question
    β”‚
    β–Ό
Manager Agent (CodeAgent / ToolCallingAgent)
    β”‚  Plans multi-step research strategy
    β”œβ”€β”€β–Ά Search Sub-Agent Γ— N
    β”‚       β”‚  Web search β†’ browse β†’ extract
    β”‚       └──▢ Tools: DuckDuckGo/SerpAPI/MetaSo, VisitWebpage,
    β”‚                   TextInspector, VisualQA, YoutubeTranscript
    β”‚
    └──▢ Final Answer synthesis

Streaming Pipeline

run.py  (step_callbacks β†’ JSON-lines on stdout)
  β”‚
  β–Ό
web_app.py  (subprocess β†’ Server-Sent Events)
  β”‚
  β–Ό
Browser  (Preact components β†’ DOM)
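The stdout-to-SSE bridge in the middle of this pipeline is straightforward: `run.py` emits one JSON object per line, and the web layer wraps each line in a Server-Sent Events frame. A minimal sketch (the function name is illustrative):

```python
import json

def to_sse_frame(json_line: str) -> str:
    # An SSE frame is "event: <type>\ndata: <payload>\n\n"; the browser's
    # EventSource dispatches it to a listener registered for that type.
    event = json.loads(json_line)
    payload = json.dumps(event)
    return f"event: {event['type']}\ndata: {payload}\n\n"

frame = to_sse_frame('{"type": "planning_step", "text": "outline the search"}')
```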

SSE event types:

| Event | Description |
|-------|-------------|
| planning_step | Agent reasoning and plan |
| code_running | Code being executed |
| action_step | Tool call + observation |
| final_answer | Completed research result |
| error | Error with details |

DOM Hierarchy

#output
β”œβ”€β”€ step-container.plan-step       (manager plan)
β”œβ”€β”€ step-container                 (manager step)
β”‚   └── step-children
β”‚       β”œβ”€β”€ model-output           (reasoning)
β”‚       β”œβ”€β”€ Agent Call             (code, collapsed)
β”‚       └── sub-agent-container
β”‚           β”œβ”€β”€ step-container.plan-step  (sub-agent plan)
β”‚           β”œβ”€β”€ step-container            (sub-agent steps)
β”‚           └── sub-agent-result          (preview + collapsible)
└── final_answer                   (prominent result block)

Reproducibility (GAIA Results)

The 55% pass@1 result on GAIA was obtained with augmented data:

  • Single-page PDFs and XLS files were opened and screenshotted as .png
  • The file loader checks for a .png version of each attachment and prefers it
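The ".png preferred" rule reduces to a suffix check. A minimal sketch, assuming a hypothetical `resolve_attachment` helper (not the project's actual file loader):

```python
import tempfile
from pathlib import Path

def resolve_attachment(path: Path) -> Path:
    # Prefer a pre-rendered .png screenshot of the attachment when it exists.
    png = path.with_suffix(".png")
    return png if png.exists() else path

with tempfile.TemporaryDirectory() as d:
    pdf = Path(d) / "task.pdf"
    pdf.touch()
    (Path(d) / "task.png").touch()
    chosen = resolve_attachment(pdf).suffix        # ".png" screenshot wins

    other = Path(d) / "other.pdf"
    other.touch()
    fallback = resolve_attachment(other).suffix    # no .png: original kept
```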

The augmented dataset is available at smolagents/GAIA-annotated (access granted instantly on request).


Development

pip install -e ".[dev]"   # includes testing, linting, type checking tools
python web_app.py         # starts dev server with auto-reload

The frontend is a dependency-free Preact app using htm for JSX-like templates β€” no build step required. Edit files in static/js/components/ and refresh.


License

Licensed under the Apache License 2.0 β€” the same license as smolagents.

See LICENSE for details.

Acknowledgments:

  • Original research agent implementation by HuggingFace smolagents
  • Web UI, session management, streaming architecture, and configuration system added in this fork
