webbrain

agent
Security Audit: Fail
Health: Warn
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code: Fail
  • network request — Outbound network request in src/chrome/src/agent/tools.js
  • network request — Outbound network request in src/chrome/src/cdp/image-utils.js
  • exec() — Shell command execution in src/chrome/src/content/accessibility-tree.js
  • new Function() — Dynamic code execution via Function constructor in src/chrome/src/content/content.js
  • network request — Outbound network request in src/chrome/src/network/network-tools.js
  • network request — Outbound network request in src/chrome/src/offscreen/offscreen.js
Permissions: Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is an open-source browser extension that acts as an AI agent, allowing users to chat with web pages and automate multi-step browsing tasks using various local or cloud-based Large Language Models.

Security Assessment
Risk Rating: High. The extension inherently processes sensitive data by reading your browser activity, webpage content, and LLM prompts. The automated audit uncovered critical security vulnerabilities, specifically dynamic code execution via the Function constructor and shell command execution capabilities. Malicious web pages or compromised LLM outputs could potentially trigger these execution flaws to run unauthorized commands on your machine. Additionally, the tool makes multiple outbound network requests to handle tool operations, fetch images, and communicate with external LLM providers. While no hardcoded secrets or dangerous baseline system permissions were found, the combination of broad webpage access, dynamic code execution, and network activity presents a severe attack surface.

Quality Assessment
The project uses the permissive MIT license, meaning it is entirely open-source and legally safe to modify. The repository is very new and actively maintained, with the most recent code push occurring today. However, it currently has extremely low community visibility and adoption, boasting only 5 GitHub stars. Because of this minimal public scrutiny, critical security flaws and edge cases are highly likely to go unnoticed by the broader developer community.

Verdict
Not recommended for daily use until the dynamic code execution and shell access vulnerabilities are thoroughly explained by the author or patched out.
SUMMARY

Open-source AI browser agent for Chrome and Firefox

README.md

WebBrain

Open-source AI browser agent for Chrome and Firefox. Chat with any web page, automate browser tasks, and run multi-step agent workflows — powered by your choice of LLM.

Features

  • Page Reading — Extracts text, links, forms, tables, and interactive elements from any page
  • Browser Actions — Click, type, scroll, navigate, and interact with page elements
  • Ask / Act Modes — Read-only mode by default, full agent mode with confirmation
  • Multi-Step Agent — Autonomous task execution with tool-use loops (configurable, default 25 steps)
  • Continue from Limit — When the agent hits the step limit, click Continue to keep going
  • Multi-Provider LLM — Supports local and cloud models:
    • llama.cpp (local, default) — No API key needed
    • OpenAI (GPT-4o, etc.)
    • OpenRouter (access 100+ models)
    • Anthropic Claude (native API)
  • Side Panel UI — Clean chat interface that lives alongside your browsing
  • Per-Tab Conversations — Each tab has its own chat history
  • Streaming — Real-time token streaming from all providers
  • Smart Context — Automatic context trimming, tool result limits, and emergency overflow recovery
  • Copy Support — Copy buttons on code blocks and full messages
  • Page Inspection Banner — Visual indicator when the agent is interacting with the page
  • Stop Button — Abort the agent mid-execution at any time
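
The multi-step loop, step limit, Continue, and Stop features listed above can be pictured with a rough sketch. This is not WebBrain's actual agent code: runAgent, callModel, and runTool are hypothetical names, and the tool-call item shape ({ name, args }) is an assumption made for illustration.

```javascript
// Hypothetical sketch of a tool-use loop with a step limit.
// callModel and runTool are supplied by the caller; they are stand-ins,
// not WebBrain functions.
async function runAgent({ messages, callModel, runTool, maxSteps = 25, signal }) {
  for (let step = 0; step < maxSteps; step++) {
    // A Stop button could be wired to an AbortController's signal.
    if (signal?.aborted) return { status: "stopped", messages };

    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.content });

    // No tool calls means the model answered directly.
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      return { status: "done", messages };
    }

    for (const call of reply.toolCalls) {
      if (call.name === "done") return { status: "done", messages };
      const result = await runTool(call.name, call.args);
      messages.push({ role: "tool", name: call.name, content: JSON.stringify(result) });
    }
  }
  // Step limit reached. A "Continue" action could call runAgent again
  // with the same messages array to resume where it left off.
  return { status: "limit", messages };
}
```

Returning a status instead of throwing at the limit keeps the conversation intact, so resuming is a second call with the same history.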

Quick Start

Chrome

git clone https://github.com/esokullu/webbrain.git
  1. Open Chrome → chrome://extensions/
  2. Enable Developer mode (top right)
  3. Click Load unpacked → select the webbrain folder

Firefox

git clone https://github.com/esokullu/webbrain.git
  1. Open Firefox → about:debugging#/runtime/this-firefox
  2. Click Load Temporary Add-on
  3. Navigate to the webbrain-firefox folder and select manifest.json

Note: Temporary add-ons are removed when Firefox restarts. For permanent installation, the extension needs to be signed via addons.mozilla.org.

Start a local LLM (default)

# Using llama.cpp
llama-server -m your-model.gguf --port 8080

# Or using Ollama (OpenAI-compatible)
ollama serve
# Then set base URL to http://localhost:11434/v1 in settings
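
To check that the local server answers before pointing the extension at it, a request can be built by hand. The sketch below assumes both servers expose the OpenAI-compatible POST /v1/chat/completions route; buildChatRequest is a hypothetical helper, and how WebBrain itself joins the base URL and path is not stated in this README.

```javascript
// Illustrative helper, not part of WebBrain.
// Accepts a base URL with or without a trailing /v1, matching the two
// base URLs shown above (llama.cpp without /v1, Ollama with /v1).
function buildChatRequest(baseUrl, messages, model = "local-model") {
  const root = baseUrl.replace(/\/+$/, "");
  const url = root.endsWith("/v1")
    ? `${root}/chat/completions`
    : `${root}/v1/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Usage (requires a running server):
//   const { url, init } = buildChatRequest("http://localhost:8080", [
//     { role: "user", content: "Say hello" },
//   ]);
//   const data = await (await fetch(url, init)).json();
//   console.log(data.choices[0].message.content);
```

With Ollama, pass the name of a model you have already pulled as the third argument; the "local-model" default is only a placeholder.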

Use it

Click the WebBrain icon → the side panel opens. Type a message like:

  • "Summarize this page"
  • "Find all links about pricing"
  • "Fill in the search box with 'AI agents' and click Search"
  • "Navigate to github.com and find trending repositories"

Configuration

Click the gear icon or go to the extension's Options page to configure:

Display Settings:

  • Verbose Mode — Show full tool call JSON (off by default)
  • Screenshot Fallback — Use screenshots when DOM reading fails
  • Max Agent Steps — Configurable step limit (5-50, default 25)

Providers:

Provider    Base URL                      API Key
llama.cpp   http://localhost:8080         Not needed
OpenAI      https://api.openai.com/v1     Required
OpenRouter  https://openrouter.ai/api/v1  Required
Anthropic   https://api.anthropic.com     Required
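
As a rough illustration of how these settings fit together, the sketch below encodes the provider defaults and the 5-50 step range from this section. The names PROVIDER_DEFAULTS, clampMaxSteps, and resolveProvider, and the key names inside the objects, are hypothetical; WebBrain's real settings schema is not shown in this README.

```javascript
// Hypothetical settings helpers; values are taken from the tables above.
const PROVIDER_DEFAULTS = {
  llamacpp:   { baseUrl: "http://localhost:8080",        requiresKey: false },
  openai:     { baseUrl: "https://api.openai.com/v1",    requiresKey: true },
  openrouter: { baseUrl: "https://openrouter.ai/api/v1", requiresKey: true },
  anthropic:  { baseUrl: "https://api.anthropic.com",    requiresKey: true },
};

// Keep Max Agent Steps inside the documented 5-50 range (default 25).
function clampMaxSteps(value, fallback = 25) {
  const n = Number.parseInt(value, 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(50, Math.max(5, n));
}

// Merge user overrides with the defaults and reject a missing API key.
function resolveProvider(id, { baseUrl, apiKey } = {}) {
  if (!Object.prototype.hasOwnProperty.call(PROVIDER_DEFAULTS, id)) {
    throw new Error(`Unknown provider: ${id}`);
  }
  const defaults = PROVIDER_DEFAULTS[id];
  if (defaults.requiresKey && !apiKey) {
    throw new Error(`${id} requires an API key`);
  }
  return { id, baseUrl: baseUrl || defaults.baseUrl, apiKey: apiKey ?? null };
}
```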

Architecture

webbrain/                          webbrain-firefox/
├── manifest.json (MV3)            ├── manifest.json (MV2)
├── src/                           ├── src/
│   ├── background.js              │   ├── background.js (+ background.html)
│   ├── agent/                     │   ├── agent/
│   │   ├── agent.js               │   │   ├── agent.js
│   │   └── tools.js               │   │   └── tools.js
│   ├── content/                   │   ├── content/
│   │   └── content.js             │   │   └── content.js
│   ├── providers/                 │   ├── providers/
│   │   ├── base.js                │   │   ├── base.js
│   │   ├── llamacpp.js            │   │   ├── llamacpp.js
│   │   ├── openai.js              │   │   ├── openai.js
│   │   ├── anthropic.js           │   │   ├── anthropic.js
│   │   └── manager.js             │   │   └── manager.js
│   └── ui/                        │   └── ui/
│       ├── sidepanel.html         │       ├── sidepanel.html
│       ├── sidepanel.js           │       ├── sidepanel.js
│       ├── settings.html          │       ├── settings.html
│       └── settings.js            │       └── settings.js
├── styles/                        ├── styles/
│   └── sidepanel.css              │   └── sidepanel.css
├── web/                           └── icons/
│   ├── index.html
│   └── vercel.json
└── icons/

Key difference: Chrome uses Manifest V3 (service worker, chrome.scripting, sidePanel API), Firefox uses Manifest V2 (background page, browser.tabs.executeScript, sidebar_action).
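
The API split can be bridged with feature detection. The following is a minimal sketch, assuming a shared helper is wanted; injectContentScript is a hypothetical name and not a function from the WebBrain source. It uses chrome.scripting.executeScript when the scripting namespace is present and otherwise falls back to browser.tabs.executeScript.

```javascript
// Hypothetical cross-browser injection helper.
// The api parameter defaults to whichever extension namespace exists and
// can be replaced with a fake object in tests.
function injectContentScript(tabId, file, api = globalThis.browser ?? globalThis.chrome) {
  if (api?.scripting?.executeScript) {
    // Manifest V3 style: a single details object with a target.
    return api.scripting.executeScript({ target: { tabId }, files: [file] });
  }
  // Manifest V2 style: tab id first, then a details object.
  return api.tabs.executeScript(tabId, { file });
}
```

Detecting the capability, instead of sniffing the browser name, keeps the helper working if either build changes manifest versions later.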

Agent Tools

Tool                                      Ask Mode  Act Mode  Description
read_page                                 Yes       Yes       Extract page text, links, forms
screenshot                                Yes       Yes       Capture visible tab
get_interactive_elements                  Yes       Yes       List all clickable/interactive elements
scroll                                    Yes       Yes       Scroll the page
extract_data                              Yes       Yes       Extract tables, headings, images
get_selection                             Yes       Yes       Get highlighted text
click                                     No        Yes       Click elements by selector, index, or coordinates
type_text                                 No        Yes       Type into input fields
navigate                                  No        Yes       Go to a URL
wait_for_element                          No        Yes       Wait for a selector to appear
execute_js                                No        Yes       Run custom JavaScript
new_tab                                   No        Yes       Open a new tab
fetch_url                                 Yes       Yes       Fetch a URL from the background with the user's cookies. Best for JSON APIs, READMEs, plain HTML.
research_url                              Yes       Yes       Open a URL in a hidden tab, wait for JS rendering, return main content. Best for SPAs.
list_downloads                            Yes       Yes       List recent downloads with status and source URLs.
read_downloaded_file                      No        Yes       Re-fetch a downloaded file's content (text or base64).
download_file                             No        Yes       Download a single file from a URL.
download_files                            No        Yes       Download multiple files in parallel (max 3 concurrent).
download_resource_from_page               No        Yes       Download an <img>/<video>/blob URL from the current page.
iframe_read / iframe_click / iframe_type  No        Yes       Read/click/type inside cross-origin iframes (Stripe, embedded forms).
done                                      Yes       Yes       Signal task completion
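
The Ask/Act split in the table amounts to an allow-list check before a tool runs. The sketch below is one way to express it, with the tool names copied from the table; ASK_TOOLS, ACT_ONLY_TOOLS, isToolAllowed, toolsForMode, and the mode strings "ask" and "act" are hypothetical and may not match the real tools.js.

```javascript
// Tools marked "Yes" in the Ask Mode column.
const ASK_TOOLS = [
  "read_page", "screenshot", "get_interactive_elements", "scroll",
  "extract_data", "get_selection", "fetch_url", "research_url",
  "list_downloads", "done",
];

// Tools marked "No" for Ask Mode and "Yes" for Act Mode.
const ACT_ONLY_TOOLS = [
  "click", "type_text", "navigate", "wait_for_element", "execute_js",
  "new_tab", "read_downloaded_file", "download_file", "download_files",
  "download_resource_from_page", "iframe_read", "iframe_click", "iframe_type",
];

// List the tools that should be offered to the model in a given mode.
function toolsForMode(mode) {
  return mode === "act" ? [...ASK_TOOLS, ...ACT_ONLY_TOOLS] : [...ASK_TOOLS];
}

// Re-check at execution time as well, so a model that requests a tool it
// was never offered is still refused. Unknown tool names are rejected.
function isToolAllowed(name, mode) {
  return toolsForMode(mode).includes(name);
}
```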

Slash Commands

WebBrain accepts a small set of slash commands as the first thing on a line in the input box:

Command     What it does
/allow-api  Per-conversation API mutation override. By default WebBrain refuses to use API endpoints (POST/PUT/PATCH/DELETE via fetch_url or execute_js) for any action that creates, modifies, deletes, or sends; it always goes through the visible UI of the current page so you can see what's happening. Type /allow-api (optionally followed by a task description) to lift that restriction for the current conversation only. The agent will still prefer UI when UI works, but may fall back to API mutations when UI is genuinely failing or unworkable. A sticky badge appears above the input area while the override is active. The flag clears when you reset the conversation.

The default UI-first rule exists because API actions are invisible (you don't see what's being sent), often require separate auth tokens you may not have configured, and can have a much larger blast radius than a visible mis-click. Only use /allow-api when you've decided you want that tradeoff for a specific job.
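
The override described above can be sketched as a parser plus a per-conversation flag. This is a simplified illustration only: parseSlashCommand, isMutationAllowed, and the allowApi field are hypothetical names, and the check looks only at the HTTP method, whereas the real rule also covers requests made through execute_js.

```javascript
// Recognise "/allow-api" at the start of the input, optionally followed
// by a task description.
function parseSlashCommand(input) {
  const match = /^\/allow-api(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (!match) return { allowApi: false, task: input };
  return { allowApi: true, task: (match[1] ?? "").trim() };
}

const MUTATING_METHODS = new Set(["POST", "PUT", "PATCH", "DELETE"]);

// Read-only methods always pass; mutating methods pass only while the
// conversation carries the override flag.
function isMutationAllowed(method, conversation) {
  const verb = (method ?? "GET").toUpperCase();
  return !MUTATING_METHODS.has(verb) || conversation.allowApi === true;
}

// Resetting the conversation clears the flag, as described above.
function resetConversation() {
  return { allowApi: false, messages: [] };
}
```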

Known Issues

  • Firefox is meaningfully weaker than Chrome. Firefox has no equivalent to Chrome DevTools Protocol via chrome.debugger, so several Chrome-only features are missing in the Firefox build:
    • Click/type goes through the content-script path (document.querySelector + el.click()) instead of CDP Input.dispatchMouseEvent. This means no shadow-DOM piercing, no real trusted mouse events (some React/Vue handlers won't fire), no closed-shadow-root traversal, and no resolveSelector retry budget.
    • No SPA-navigation-aware retry extension.
    • No conversation persistence across background restarts.
    • No CDP screenshots. Auto-screenshot uses tabs.captureVisibleTab instead, which works for active tabs only and at slightly lower quality.
    • No closed shadow root support for read/extract tools.
    • By contrast, site adapters, vision detection, loop detection, and the auto-screenshot loop are mirrored to Firefox and work in both builds.
  • SPA navigation detection in Firefox. Some single-page applications may not trigger content-script re-injection after client-side navigation.
  • Firefox temporary add-on — Firefox requires the extension to be loaded as a temporary add-on during development, which is removed on restart.
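
To make the first Firefox limitation concrete, here is a minimal sketch of a content-script style click of the kind described (document.querySelector plus el.click()). clickBySelector is a hypothetical name, not the function used in content.js.

```javascript
// Illustrative only. querySelector does not cross shadow-root boundaries,
// and el.click() dispatches an untrusted synthetic event, which is why
// some framework handlers may ignore it.
function clickBySelector(selector, root = globalThis.document) {
  const el = root?.querySelector(selector);
  if (!el) return { ok: false, error: `No element matches ${selector}` };
  el.click();
  return { ok: true };
}
```

An element inside a shadow root is simply not found by this lookup, so the call returns the "no element" result even though the element is visible on the page.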

Roadmap

  • Conversation export/import — Save and load chat histories
  • Custom tool definitions — User-defined tools via settings
  • Keyboard shortcuts — Hotkeys for opening panel, sending messages, switching modes
  • Context menu integration — Right-click → "Ask WebBrain about this"
  • Screenshot/vision tool — Send screenshots to multimodal models for visual understanding
  • Chrome Web Store / Firefox AMO — Official store listings

Adding a New Provider

  1. Create a new class extending BaseLLMProvider in src/providers/
  2. Implement chat() and optionally chatStream()
  3. Register it in src/providers/manager.js

All providers normalize to a common response format:

{ content: string, toolCalls: Array|null, usage: Object|null }
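
The three steps above might look roughly like this. The sketch defines its own stand-in BaseLLMProvider so that it runs on its own; the real base class in src/providers/base.js may have a different constructor and helpers. ExampleProvider, normalizeResponse, the config fields, and the tool-call item shape are all hypothetical, and the endpoint is assumed to be OpenAI-compatible.

```javascript
// Stand-in for the real base class, whose actual interface is not shown
// in this README.
class BaseLLMProvider {
  constructor(config = {}) { this.config = config; }
  async chat(_messages, _options) { throw new Error("chat() not implemented"); }
}

// Map an OpenAI-style response to the common format:
// { content, toolCalls, usage }.
function normalizeResponse(data) {
  const message = data.choices?.[0]?.message ?? {};
  const calls = message.tool_calls ?? [];
  return {
    content: message.content ?? "",
    toolCalls: calls.length
      ? calls.map((c) => ({
          id: c.id,
          name: c.function.name,
          args: JSON.parse(c.function.arguments || "{}"),
        }))
      : null,
    usage: data.usage ?? null,
  };
}

// Hypothetical provider for an OpenAI-compatible endpoint.
class ExampleProvider extends BaseLLMProvider {
  async chat(messages, options = {}) {
    // config.fetch lets tests inject a fake; otherwise use the global fetch.
    const fetchImpl = this.config.fetch ?? globalThis.fetch;
    const headers = { "Content-Type": "application/json" };
    if (this.config.apiKey) headers.Authorization = `Bearer ${this.config.apiKey}`;
    const res = await fetchImpl(`${this.config.baseUrl}/chat/completions`, {
      method: "POST",
      headers,
      body: JSON.stringify({ model: this.config.model, messages, tools: options.tools }),
    });
    if (!res.ok) throw new Error(`Provider request failed: HTTP ${res.status}`);
    return normalizeResponse(await res.json());
  }
}
```

Keeping the normalization in one function means the agent only ever sees the common format, whichever provider produced the reply.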

License

MIT — built by Emre Sokullu
