Bridgic Browser

Bridgic Browser is a Python library for LLM-driven browser automation built on Playwright. It includes CLI tools, Python tools and skills for AI agents.

Features

Comprehensive CLI Tools - 67 tools organized into 15 categories, designed to integrate with any AI agent
Python-based Tools - Used for agent / workflow code generation; easier integration with Bridgic
Snapshot with Semantic Invariance - A page snapshot representation based on the accessibility tree, with a purpose-built ref-generation algorithm that keeps element refs unchanged across page reloads
Skills - Used for guided exploration and code generation; compatible with most coding agents
Stealth Mode (Enabled by Default) - Mode-aware anti-detection: 50+ Chrome args + JS patches in headless mode; minimal ~11 flags in headed mode to match real Chrome fingerprint
Dual Launch Mode - Automatically switches between isolated sessions and persistent contexts
Nested iframe Support - Supports DOM element operations within multi-level nested iframes

Installation

pip install bridgic-browser

After installation, install Playwright browsers:

playwright install chromium

Quick Start

CLI Tools Usage

bridgic-browser open --headed https://example.com
bridgic-browser snapshot
# 'f0201d1c' is the ref value of the 'Learn more' link
bridgic-browser click f0201d1c
bridgic-browser screenshot page.png
bridgic-browser close

Python Tools Integration

First, build tools:

from bridgic.browser.session import Browser
from bridgic.browser.tools import BrowserToolSetBuilder, ToolCategory

# create a browser instance
browser = Browser(headless=False)

async def create_tools(browser):
    # Build a focused tool set for your agent
    builder = BrowserToolSetBuilder.for_categories(
        browser,
        ToolCategory.NAVIGATION,
        ToolCategory.SNAPSHOT,
        ToolCategory.ELEMENT_INTERACTION,
        ToolCategory.CAPTURE,
        ToolCategory.WAIT,
    )
    tools = builder.build()["tool_specs"]
    return tools

Second (optional), build a Bridgic agent that uses this tool set:

import os
from bridgic.llms.openai import OpenAILlm, OpenAIConfiguration

async def create_llm():
    _api_key = os.environ.get("OPENAI_API_KEY")
    _model_name = os.environ.get("OPENAI_MODEL_NAME")

    llm = OpenAILlm(
        api_key=_api_key,
        configuration=OpenAIConfiguration(model=_model_name),
        timeout=60,
    )
    return llm

from bridgic.core.agentic.recent import ReCentAutoma, StopCondition
from bridgic.core.automa import RunningOptions

async def create_agent(llm, tools):
    browser_agent = ReCentAutoma(
        llm=llm,
        tools=tools,
        stop_condition=StopCondition(max_iteration=10, max_consecutive_no_tool_selected=1),
        running_options=RunningOptions(debug=True),
    )
    return browser_agent

async def main():
    tools = await create_tools(browser)
    llm = await create_llm()
    agent = await create_agent(llm, tools)
    result = await agent.arun(
        goal=(
            "Summarize the 'Learn more' page of example.com for me"
        ),
        guidance=(
            "Do the following steps one by one:\n"
            "1. Navigate to https://example.com\n"
            "2. Click the 'Learn more' link\n"
            "3. Take a screenshot of the 'Learn more' page\n"
            "4. Summarize the page content in one sentence and tell me how to access the screenshot.\n"
        ),
    )
    print("\n\n*** Final Result: ***\n\n")
    print(result)

    await browser.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

How to Install Skills?

The skills in this repo work with most coding agents and AI assistants, such as Claude Code, Cursor, and OpenClaw.
Install using the npx skills CLI:

# From this repository checkout
npx skills add . --skill bridgic-browser

# Or from GitHub
npx skills add bitsky-tech/bridgic-browser --skill bridgic-browser

After installation, the skill appears in your project’s agent directories (for example, typically .claude/skills/bridgic-browser/ for Claude Code and .agents/skills/bridgic-browser/ for Cursor).

Browser API Usage

You can also directly call the underlying Browser API to control the browser.

from bridgic.browser.session import Browser

browser = Browser(headless=False)

async def main():
    await browser.navigate_to("https://example.com")
    snapshot = await browser.get_snapshot()
    print(snapshot.tree)  # Tree format: - role "name" [ref=f0201d1c]
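    # Look up the ref of the 'Learn more' link; it stays None if no such element exists
    learn_more_ref = None
    for ref, data in snapshot.refs.items():
        if data.name == "Learn more":
            learn_more_ref = ref
            break
    if learn_more_ref is None:
        await browser.close()
        raise RuntimeError("No element named 'Learn more' found in the snapshot")
    print(f"Found ref for 'Learn more': {learn_more_ref}")
    await browser.click_element_by_ref(learn_more_ref)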
    await browser.take_screenshot(filename="page.png")
    await browser.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
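
When only the text form of a snapshot is at hand (for example, the output of bridgic-browser snapshot), a ref can be looked up by parsing the tree lines. The sketch below assumes every element line follows the - role "name" [ref=...] shape shown in the comment above. The helper name find_ref and the sample text are hypothetical, and real snapshots may carry extra attributes that this pattern simply skips over.

```python
import re
from typing import Optional

# Assumed line shape, taken from the tree-format comment: - role "name" [ref=f0201d1c]
LINE_PATTERN = re.compile(
    r'-\s+(?P<role>[\w-]+)\s+"(?P<name>[^"]*)".*?\[ref=(?P<ref>\w+)\]'
)


def find_ref(tree_text: str, name: str, role: Optional[str] = None) -> Optional[str]:
    """Return the ref of the first element whose accessible name matches."""
    for line in tree_text.splitlines():
        match = LINE_PATTERN.search(line)
        if match is None:
            continue  # skip lines that carry no ref
        if match["name"] == name and (role is None or match["role"] == role):
            return match["ref"]
    return None


# Hypothetical sample text, not captured from a real page
sample = '''- heading "Example Domain" [ref=a1b2c3d4]
- link "Learn more" [ref=f0201d1c]'''
print(find_ref(sample, "Learn more"))
```

Within Python code, snapshot.refs is the more direct route; text parsing is mainly useful when scripting around CLI output.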

CLI Tools

bridgic-browser ships with a command-line interface for controlling a browser from the terminal (67 tools organized into 15 categories). A persistent daemon process holds a browser instance; each CLI invocation connects over a Unix domain socket and exits immediately.

Configuration

Browser options are read at daemon startup from the following sources, listed from lowest to highest priority (a later source overrides an earlier one):

Source	Description / example
Defaults	`headless=True`
`~/.bridgic/bridgic-browser.json`	User-level persistent config
`./bridgic-browser.json`	Project-local config (in cwd at daemon start)
Environment variables	See `skills/bridgic-browser/references/env-vars.md`
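
The layering above can be pictured as a series of dictionary merges. This is a minimal sketch of the idea rather than the daemon's actual loader: the function names are hypothetical, the merge is assumed to be shallow (a nested object such as proxy is replaced as a whole), and only the BRIDGIC_BROWSER_JSON variable is modelled among the environment variables.

```python
import json
from pathlib import Path

# Only the default shown in the table; the real set of defaults is larger
DEFAULTS = {"headless": True}


def load_json_file(path: Path) -> dict:
    """Return the parsed JSON object, or an empty dict if the file is absent."""
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def resolve_options(home: Path, cwd: Path, env: dict) -> dict:
    """Merge the sources from lowest to highest priority; later keys win."""
    layers = [
        DEFAULTS,
        load_json_file(home / ".bridgic" / "bridgic-browser.json"),
        load_json_file(cwd / "bridgic-browser.json"),
        json.loads(env.get("BRIDGIC_BROWSER_JSON", "{}")),
    ]
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # shallow merge: an assumption of this sketch
    return merged
```

Calling resolve_options(Path.home(), Path.cwd(), dict(os.environ)) would then yield the effective options under these assumptions; a key set in the project-local file overrides the same key in the user-level file, and the environment overrides both.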

Headed browser note:
When headless=false and stealth is enabled, bridgic auto-switches to system Chrome
(if installed) for better anti-detection (Chrome for Testing is blocked by Google OAuth).
To override, set:

channel: e.g. "chrome", "msedge"
executable_path: absolute path to a browser binary

The JSON sources accept any Browser constructor parameter:

{
  "headless": false,
  "proxy": {"server": "http://proxy:8080", "username": "u", "password": "p"},
  "viewport": {"width": 1280, "height": 720},
  "locale": "zh-CN",
  "timezone_id": "Asia/Shanghai"
}

# One-shot env override
BRIDGIC_BROWSER_JSON='{"headless":false,"locale":"zh-CN"}' bridgic-browser open URL

Command List

Category	Commands
Navigation	`open`, `back`, `forward`, `reload`, `search`, `info`
Snapshot	`snapshot [-i] [-f|-F] [-o N] [-l N]`
Element Interaction	`click`, `double-click`, `hover`, `focus`, `fill`, `select`, `options`, `check`, `uncheck`, `scroll-to`, `drag`, `upload`, `fill-form`
Keyboard	`press`, `type`, `key-down`, `key-up`
Mouse	`scroll`, `mouse-move`, `mouse-click`, `mouse-drag`, `mouse-down`, `mouse-up`
Wait	`wait [SECONDS] [TEXT] [--gone]`
Tabs	`tabs`, `new-tab`, `switch-tab`, `close-tab`
Evaluate	`eval`, `eval-on`
Capture	`screenshot`, `pdf`
Network	`network-start`, `network-stop`, `network`, `wait-network`
Dialog	`dialog-setup`, `dialog`, `dialog-remove`
Storage	`storage-save`, `storage-load`, `cookies-clear`, `cookies`, `cookie-set`
Verify	`verify-visible`, `verify-text`, `verify-value`, `verify-state`, `verify-url`, `verify-title`
Developer	`console-start`, `console-stop`, `console`, `trace-start`, `trace-stop`, `trace-chunk`, `video-start`, `video-stop`
Lifecycle	`close`, `resize`

Use -h or --help on any command for details:

bridgic-browser -h
bridgic-browser scroll -h

Python Tools

Bridgic Browser provides 67 tools organized into 15 categories. Use BrowserToolSetBuilder with category/name selection for scenario-focused tool sets.

Category-based Selection

from bridgic.browser.tools import BrowserToolSetBuilder, ToolCategory

# Focused set for your specific agent flows
builder = BrowserToolSetBuilder.for_categories(
    browser,
    ToolCategory.NAVIGATION,
    ToolCategory.ELEMENT_INTERACTION,
    ToolCategory.CAPTURE,
)
tools = builder.build()["tool_specs"]

# Include all available tools
builder = BrowserToolSetBuilder.for_categories(browser, ToolCategory.ALL)
tools = builder.build()["tool_specs"]

Name-based Selection (by function name)

# Select by tool function names
builder = BrowserToolSetBuilder.for_tool_names(
    browser,
    "search",
    "navigate_to",
    "click_element_by_ref",
)
tools = builder.build()["tool_specs"]

# Enable strict mode to catch typos and missing browser methods early
builder = BrowserToolSetBuilder.for_tool_names(
    browser,
    "search",
    "navigate_to",
    strict=True,
)
tools = builder.build()["tool_specs"]

Mixed Selection

builder1 = BrowserToolSetBuilder.for_categories(
    browser,
    ToolCategory.NAVIGATION,
    ToolCategory.ELEMENT_INTERACTION,
    ToolCategory.CAPTURE,
)
builder2 = BrowserToolSetBuilder.for_tool_names(
    browser, "verify_url", "verify_title"
)
tools = [*builder1.build()["tool_specs"], *builder2.build()["tool_specs"]]

Tool List

Navigation (6 tools):

navigate_to(url) - Navigate to URL
search(query, engine) - Search using search engine
get_current_page_info() - Get current page info (URL, title, etc.)
reload_page() - Reload current page
go_back() / go_forward() - Browser history navigation

Snapshot (1 tool):

get_snapshot_text(offset=0, limit=10000, interactive=False, full_page=True) - Get the page state as a string for the LLM (accessibility tree with refs). offset must be >= 0 and is used for pagination when the page is long: if the return value is truncated, a [notice] before the page content gives the next_offset to pass on the next call. limit (default 10000) controls the maximum number of characters returned. interactive and full_page have the same meaning as in get_snapshot: interactive=True returns only interactive elements, and full_page=True (the default) includes elements outside the viewport.
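
The offset / limit contract can be illustrated without a browser. In this sketch, snapshot_page is a hypothetical stand-in for a single paginated call: the real tool reports next_offset inside a [notice] line whose exact format is not shown here, so the stand-in returns it as a separate value instead.

```python
from typing import Optional, Tuple


def snapshot_page(
    full_text: str, offset: int = 0, limit: int = 10000
) -> Tuple[str, Optional[int]]:
    """Hypothetical stand-in for one paginated call: returns (chunk, next_offset)."""
    if offset < 0:
        raise ValueError("offset must be >= 0")
    if limit <= 0:
        raise ValueError("limit must be > 0")
    chunk = full_text[offset:offset + limit]
    end = offset + len(chunk)
    # next_offset is None once the end of the text has been reached
    next_offset = end if end < len(full_text) else None
    return chunk, next_offset


def read_all(full_text: str, limit: int = 10000) -> str:
    """Keep requesting pages until no next_offset is reported."""
    parts = []
    offset: Optional[int] = 0
    while offset is not None:
        chunk, offset = snapshot_page(full_text, offset, limit)
        parts.append(chunk)
    return "".join(parts)
```

An agent would usually stop paginating as soon as it finds the element it needs rather than reading the whole page.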

Element Interaction (13 tools) - by ref:

click_element_by_ref(ref) - Click element
input_text_by_ref(ref, text) - Input text
fill_form(fields) - Fill multiple form fields
scroll_element_into_view_by_ref(ref) - Scroll element into view
select_dropdown_option_by_ref(ref, value) - Select dropdown option
get_dropdown_options_by_ref(ref) - Get dropdown options
check_checkbox_or_radio_by_ref(ref) / uncheck_checkbox_by_ref(ref) - Checkbox control
focus_element_by_ref(ref) - Focus element
hover_element_by_ref(ref) - Hover over element
double_click_element_by_ref(ref) - Double click
upload_file_by_ref(ref, path) - Upload file
drag_element_by_ref(start_ref, end_ref) - Drag and drop

Tabs (4 tools):

get_tabs() / new_tab(url) / switch_tab(page_id) / close_tab(page_id) - Tab management

Evaluate (2 tools):

evaluate_javascript(code) - Execute JavaScript
evaluate_javascript_on_ref(ref, code) - Execute JavaScript on element

Keyboard (4 tools):

type_text(text) - Type text character by character (key events, no ref — acts on focused element)
press_key(key) - Press keyboard shortcut (e.g. "Enter", "Control+A")
key_down(key) / key_up(key) - Key control

Mouse (6 tools) - Coordinate-based:

mouse_wheel(delta_x, delta_y) - Scroll wheel
mouse_click(x, y) - Click at position
mouse_move(x, y) - Move mouse
mouse_drag(start_x, start_y, end_x, end_y) - Drag operation
mouse_down() / mouse_up() - Mouse button control

Wait (1 tool):

wait_for(time_seconds, text, text_gone, selector, state, timeout) - Wait for conditions

Capture (2 tools):

take_screenshot(filename=None, ref=None, full_page=False, type="png") - Capture screenshot
save_pdf(filename) - Save page as PDF

Network (4 tools):

start_network_capture() / stop_network_capture() / get_network_requests() - Network monitoring
wait_for_network_idle() - Wait for network idle

Dialog (3 tools):

setup_dialog_handler(default_action) - Set up auto dialog handler
handle_dialog(accept, prompt_text) - Handle dialog
remove_dialog_handler() - Remove dialog handler

Storage (5 tools):

get_cookies() / set_cookie() / clear_cookies() - Cookie management (expires=0 is valid and preserved)
save_storage_state(filename) / restore_storage_state(filename) - Session persistence

Verify (6 tools):

verify_text_visible(text) - Check text visibility
verify_element_visible(role, accessible_name) - Check element visibility by role and accessible name
verify_url(pattern) / verify_title(pattern) - URL/title verification
verify_element_state(ref, state) - Check element state
verify_value(ref, value) - Check element value

Developer (8 tools):

start_console_capture() / stop_console_capture() / get_console_messages() - Console monitoring
start_tracing() / stop_tracing() / add_trace_chunk() - Performance tracing
start_video() / stop_video() - Video recording

Lifecycle (2 tools):

close() - Close browser
browser_resize(width, height) - Resize viewport

CLI Tools -> Python Tools Mapping

CLI command	SDK tool method
`open`	`navigate_to`
`search`	`search`
`info`	`get_current_page_info`
`reload`	`reload_page`
`back`	`go_back`
`forward`	`go_forward`
`snapshot`	`get_snapshot_text`
`click`	`click_element_by_ref`
`fill`	`input_text_by_ref`
`fill-form`	`fill_form`
`scroll-to`	`scroll_element_into_view_by_ref`
`select`	`select_dropdown_option_by_ref`
`options`	`get_dropdown_options_by_ref`
`check`	`check_checkbox_or_radio_by_ref`
`uncheck`	`uncheck_checkbox_by_ref`
`focus`	`focus_element_by_ref`
`hover`	`hover_element_by_ref`
`double-click`	`double_click_element_by_ref`
`upload`	`upload_file_by_ref`
`drag`	`drag_element_by_ref`
`tabs`	`get_tabs`
`new-tab`	`new_tab`
`switch-tab`	`switch_tab`
`close-tab`	`close_tab`
`eval`	`evaluate_javascript`
`eval-on`	`evaluate_javascript_on_ref`
`press`	`press_key`
`type`	`type_text`
`key-down`	`key_down`
`key-up`	`key_up`
`scroll`	`mouse_wheel`
`mouse-click`	`mouse_click`
`mouse-move`	`mouse_move`
`mouse-drag`	`mouse_drag`
`mouse-down`	`mouse_down`
`mouse-up`	`mouse_up`
`wait`	`wait_for`
`screenshot`	`take_screenshot`
`pdf`	`save_pdf`
`network-start`	`start_network_capture`
`network`	`get_network_requests`
`network-stop`	`stop_network_capture`
`wait-network`	`wait_for_network_idle`
`dialog-setup`	`setup_dialog_handler`
`dialog`	`handle_dialog`
`dialog-remove`	`remove_dialog_handler`
`cookies`	`get_cookies`
`cookie-set`	`set_cookie`
`cookies-clear`	`clear_cookies`
`storage-save`	`save_storage_state`
`storage-load`	`restore_storage_state`
`verify-text`	`verify_text_visible`
`verify-visible`	`verify_element_visible`
`verify-url`	`verify_url`
`verify-title`	`verify_title`
`verify-state`	`verify_element_state`
`verify-value`	`verify_value`
`console-start`	`start_console_capture`
`console`	`get_console_messages`
`console-stop`	`stop_console_capture`
`trace-start`	`start_tracing`
`trace-chunk`	`add_trace_chunk`
`trace-stop`	`stop_tracing`
`video-start`	`start_video`
`video-stop`	`stop_video`
`close`	`close`
`resize`	`browser_resize`

Core Components

Browser

The main class for browser automation with automatic launch mode selection:

from bridgic.browser.session import Browser

# Isolated session (no persistence)
browser = Browser(
    headless=True,
    viewport={"width": 1600, "height": 900},
)

# Persistent session (with user data)
browser = Browser(
    headless=False,
    user_data_dir="./user_data",
    stealth=True,  # Enabled by default
)

Key Parameters:

Parameter	Type	Default	Description
`headless`	bool	True	Run in headless mode
`viewport`	dict	1600x900	Browser viewport size
`user_data_dir`	str/Path	None	Path for persistent context
`stealth`	bool/StealthConfig	True	Stealth mode configuration
`channel`	str	None	Browser channel (chrome, msedge, etc.)
`proxy`	dict	None	Proxy settings
`downloads_path`	str/Path	None	Download directory

Snapshot: Use get_snapshot(interactive=False, full_page=True) to get an EnhancedSnapshot with .tree (accessibility tree string) and .refs (ref → locator data). By default full_page=True includes all elements regardless of viewport position. Pass interactive=True for clickable/editable elements only (flattened output), or full_page=False to limit to viewport-only elements. Use get_element_by_ref(ref) to get a Playwright Locator from a ref (e.g. "1f79fe5e") for click, fill, etc.
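
This document does not describe how refs are generated, only that they stay unchanged across page reloads. One general way to get that property is to derive the ref from semantic properties of the element instead of from DOM order or runtime IDs. The sketch below illustrates that idea only; it is not Bridgic's algorithm, and the function name, the chosen inputs (role, name, frame path, occurrence index), and the use of SHA-256 are all assumptions.

```python
import hashlib


def stable_ref(role: str, name: str, frame_path: str = "", nth: int = 0) -> str:
    """Derive an 8-hex-character ref from semantic properties only.

    nth distinguishes elements that share the same role and name;
    frame_path identifies the (possibly nested) iframe that holds the element.
    """
    # Join with a separator that cannot appear in normal text to avoid ambiguity
    key = "\x1f".join([frame_path, role, name, str(nth)])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:8]


# The same inputs always produce the same ref, so a reload that keeps the
# element's role and name intact would keep its ref as well.
print(stable_ref("link", "Learn more"))
```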

StealthConfig

Configure stealth mode for bypassing bot detection:

from bridgic.browser.session import StealthConfig, Browser

# Custom stealth configuration
config = StealthConfig(
    enabled=True,
    disable_security=False,
)

browser = Browser(stealth=config, headless=False)

DownloadManager

Handle file downloads with proper filename preservation:

import asyncio

from bridgic.browser.session import Browser

async def main():
    # Pass downloads_path to Browser; it creates and manages the DownloadManager internally
    browser = Browser(downloads_path="./downloads", headless=True)
    await browser.navigate_to("https://example.com")  # lazy start triggers here

    # Access downloaded files via the built-in manager
    for file in browser.download_manager.downloaded_files:
        print(f"Downloaded: {file.file_name} ({file.file_size} bytes)")

    await browser.close()

asyncio.run(main())

Stealth Mode

Stealth mode is enabled by default and includes:

Headless mode: 50+ Chrome args + JS init script patching navigator.webdriver, window.chrome, WebGL, document.hasFocus(), visibilityState, and more. All patched functions spoof Function.prototype.toString to return [native code].
Headed mode: minimal ~11 flags only (matching real Chrome); JS patches are skipped entirely so third-party challenge iframes (e.g. Cloudflare Turnstile) see unmodified native APIs.

# Stealth is ON by default
browser = Browser()  # stealth=True

# Disable stealth if needed
browser = Browser(stealth=False)

# Custom stealth settings
from bridgic.browser.session import create_stealth_config

config = create_stealth_config(
    disable_security=True,
)
browser = Browser(stealth=config)

Error Model

SDK and CLI share one structured error protocol.

Base type: BridgicBrowserError
Stable fields: code, message, details, retryable
Behavior subclasses:
- InvalidInputError (invalid arguments/user input)
- StateError (invalid runtime state, e.g. no active page/session)
- OperationError (operation execution failures)
- VerificationError (assertion/verification failures)

Why keep a small number of behavior subclasses:

Lets callers catch by behavior when needed (e.g. retry only StateError)
Encodes default retry semantics close to the failure source
Avoids a large, hard-to-maintain class hierarchy while keeping error handling predictable
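
Catching by behavior might look like the following. The classes here are local stand-ins that mirror the names and stable fields listed above, because the import path of the real classes is not given in this document; the helper call_with_retry is hypothetical, and it is synchronous only to keep the sketch short.

```python
import time


class BridgicBrowserError(Exception):
    """Local stand-in mirroring the stable fields: code, message, details, retryable."""

    def __init__(self, code, message, details=None, retryable=False):
        super().__init__(message)
        self.code = code
        self.message = message
        self.details = details or {}
        self.retryable = retryable


class StateError(BridgicBrowserError):
    """Stand-in for invalid runtime state, e.g. no active page."""


class InvalidInputError(BridgicBrowserError):
    """Stand-in for invalid arguments or user input."""


def call_with_retry(operation, attempts=3, delay_seconds=0.0):
    """Run operation(); retry only StateError failures flagged as retryable."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except StateError as error:
            # Give up on non-retryable errors or when attempts are exhausted
            if not error.retryable or attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

Any other error type, such as InvalidInputError, propagates immediately because retrying with the same arguments would not help.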

Daemon protocol is also structured:

Success: {"success": true, "result": "..."}
Failure: {"success": false, "error_code": "...", "result": "...", "data": {...}, "meta": {"retryable": false}}

The CLI client converts daemon failures into BridgicBrowserCommandError, and CLI output keeps the machine-readable code visible as Error[CODE]: ....
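
A client-side reading of the two response shapes could look like this. It is a sketch under assumptions, not the library's client: CommandFailure is a hypothetical exception (not BridgicBrowserCommandError), the failure message is assumed to travel in the result field, and the "UNKNOWN" fallback code is invented for the example.

```python
import json


class CommandFailure(Exception):
    """Hypothetical exception carrying the structured failure fields."""

    def __init__(self, code, message, data, retryable):
        # Mirror the Error[CODE]: ... output style described above
        super().__init__(f"Error[{code}]: {message}")
        self.code = code
        self.data = data
        self.retryable = retryable


def parse_daemon_response(raw: str) -> str:
    """Return the result of a successful response, or raise CommandFailure."""
    payload = json.loads(raw)
    if payload.get("success"):
        return payload.get("result", "")
    meta = payload.get("meta") or {}
    raise CommandFailure(
        code=payload.get("error_code", "UNKNOWN"),
        message=payload.get("result", ""),
        data=payload.get("data") or {},
        retryable=bool(meta.get("retryable", False)),
    )
```

Keeping retryable on the exception lets a caller decide whether to retry without parsing the message text.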

Requirements

Python 3.10+
Playwright 1.57+
Pydantic 2.11+

License

MIT License
