hanzi-browse

Let any AI agent use the local browser.

Hanzi Browse

Give your AI agent a real browser.

One tool call. Entire task delegated. Your agent clicks, types, fills forms, and reads authenticated pages, all in your real signed-in browser.

Works with

Claude Code  
Cursor  
Codex  
Gemini CLI  
VS Code  
Kiro  
Antigravity  
OpenCode



Two ways to use Hanzi

  • Use it now: give your agent a browser
  • Build with it: embed browser automation in your product


Get Started

npx hanzi-browse setup

One command does everything:

npx hanzi-browse setup
│
├── 1. Detect browsers ──── Chrome, Brave, Edge, Arc, Chromium
│
├── 2. Install extension ── Opens Chrome Web Store, waits for install
│
├── 3. Detect AI agents ─── Claude Code, Cursor, Codex, Windsurf,
│                           VS Code, Gemini CLI, Amp, Cline, Roo Code
│
├── 4. Configure MCP ────── Merges hanzi-browse into each agent's config
│
├── 5. Install skills ───── Copies browser skills into each agent
│
└── 6. Choose AI mode ───── Managed ($0.05/task) or BYOM (free forever)

  • Managed: we handle the AI. 20 free tasks/month, then $0.05/task. No API key needed.
  • BYOM: use your Claude Pro/Max subscription, GPT Plus, or any API key. Free forever, runs locally.
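
Step 4 merges a server entry into each agent's MCP config instead of overwriting the file. A minimal sketch of that kind of merge is shown here. The `mcpServers` key follows the layout several MCP clients use; the `command` and `args` values are assumptions for illustration, not necessarily the entry the wizard writes.

```typescript
// Shape of an MCP client config file, as used by several agents.
// Only the `mcpServers` map matters for this sketch.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical entry: the real command and args may differ.
const HANZI_ENTRY: McpServerEntry = {
  command: 'npx',
  args: ['hanzi-browse'],
};

// Return a new config with the hanzi-browse entry added or replaced,
// leaving every other server and top-level setting untouched.
function mergeHanziServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      'hanzi-browse': HANZI_ENTRY,
    },
  };
}

const existing: McpConfig = {
  theme: 'dark',
  mcpServers: { github: { command: 'npx', args: ['some-other-server'] } },
};
const merged = mergeHanziServer(existing);
console.log(Object.keys(merged.mcpServers ?? {})); // [ 'github', 'hanzi-browse' ]
```

Returning a new object instead of mutating the input keeps the merge safe to run more than once: a second run simply replaces the same key.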

Examples

"Go to Gmail and unsubscribe from all marketing emails from the last week"
"Apply for the senior engineer position on careers.acme.com"
"Log into my bank and download last month's statement"
"Find AI engineer jobs on LinkedIn in San Francisco"

Skills

The setup wizard installs browser skills into your agent automatically. Skills teach your agent when and how to use the browser for specific workflows:

| Skill | Description |
| --- | --- |
| hanzi-browse | Core skill: when and how to use browser automation |
| e2e-tester | Test your app in a real browser, report bugs with screenshots |
| social-poster | Draft per-platform posts, publish from your signed-in accounts |
| linkedin-prospector | Find prospects, send personalized connection requests |
| a11y-auditor | Run accessibility audits in a real browser |
| x-marketer | Twitter/X marketing workflows |

Open source — add your own.


Build with Hanzi Browse

Embed browser automation in your product. Your app calls the Hanzi API, a real browser executes the task, you get the result back.

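  1. Get an API key: sign in to your developer console, then create a key.
  2. Pair a browser: create a pairing token and send your user a pairing link (/pair/{token}). When they click it, their browser pairs automatically.
  3. Run a task: POST /v1/tasks with a task and a browser session ID.
  4. Get the result: poll GET /v1/tasks/:id until the task completes, or use runTask(), which blocks until it does.

import { HanziClient } from '@hanzi/browser-agent';

// The API key comes from the developer console (step 1).
const client = new HanziClient({ apiKey: process.env.HANZI_API_KEY });

// Step 2: send the user a /pair/{pairingToken} link, then list paired sessions.
const { pairingToken } = await client.createPairingToken();
const sessions = await client.listSessions();
if (sessions.length === 0) {
  throw new Error('No paired browser yet: ask the user to open the pairing link');
}

// Steps 3 and 4: runTask() blocks until the task completes.
const result = await client.runTask({
  browserSessionId: sessions[0].id,
  task: 'Read the patient chart on the current page',
});
console.log(result.answer);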
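
The SDK's runTask() blocks for you; over raw HTTP, steps 3 and 4 amount to one POST followed by polling. The sketch below shows that loop under stated assumptions: the request fields (`task`, `browserSessionId`), the response fields (`id`, `status`, `answer`), and the status values `running`, `complete`, and `error` are all hypothetical, so check the API reference for the real ones. The HTTP call is injected as a function so the loop can be exercised without a network.

```typescript
// Assumed response shape; field names and status values are hypothetical.
interface TaskState {
  id: string;
  status: 'running' | 'complete' | 'error';
  answer?: string;
}

// Minimal HTTP abstraction so the polling logic does not depend on a client.
type Http = (method: 'GET' | 'POST', path: string, body?: unknown) => Promise<TaskState>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// POST /v1/tasks, then poll GET /v1/tasks/:id until the task leaves "running".
async function runTaskOverHttp(
  http: Http,
  browserSessionId: string,
  task: string,
  intervalMs = 2000,
  maxPolls = 150,
): Promise<TaskState> {
  let state = await http('POST', '/v1/tasks', { task, browserSessionId });
  for (let i = 0; i < maxPolls && state.status === 'running'; i++) {
    await sleep(intervalMs);
    state = await http('GET', `/v1/tasks/${state.id}`);
  }
  if (state.status === 'running') {
    // Give up rather than poll forever; the caller can retry or stop the task.
    throw new Error(`Task ${state.id} still running after ${maxPolls} polls`);
  }
  return state;
}
```

In a real integration, `http` would wrap your HTTP client of choice and attach the API key. Capping the number of polls keeps a stuck task from hanging the caller indefinitely.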

API reference · Dashboard · Sample integration


Tools

| Tool | Description |
| --- | --- |
| browser_start | Run a task. Blocks until complete. |
| browser_message | Send follow-up to an existing session. |
| browser_status | Check progress. |
| browser_stop | Stop a task. |
| browser_screenshot | Capture current page as PNG. |
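
Agents reach these tools through MCP, where a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for browser_start. The envelope follows the MCP specification; the argument name `task` is an assumption, since the tools' input schemas are not listed here.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a tools/call request for a named tool with the given arguments.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// `task` is a hypothetical argument name for browser_start.
const request = toolCall('browser_start', {
  task: 'Find AI engineer jobs on LinkedIn in San Francisco',
});
console.log(JSON.stringify(request, null, 2));
```

In practice the agent's MCP client builds and sends this message for you; the sketch only shows what travels over the wire.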

Pricing

| | Managed | BYOM |
| --- | --- | --- |
| Price | $0.05/task (20 free/month) | Free forever |
| AI model | We handle it (Gemini) | Your own key |
| Data | Processed on Hanzi servers | Not sent to Hanzi servers |
| Billing | Only completed tasks. Errors are free. | N/A |

Building a product? Contact us for volume pricing.
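
For budgeting, the Managed pricing reduces to simple arithmetic: the first 20 tasks each month are free, every completed task after that costs $0.05, and errored tasks are not billed. A small sketch of that calculation follows. The function name is illustrative, and it assumes the free allowance is counted against completed tasks only.

```typescript
const FREE_TASKS_PER_MONTH = 20;
const PRICE_PER_TASK_CENTS = 5; // $0.05, kept in cents to avoid float rounding

// Estimate the Managed-mode bill for one month, in US cents.
// Only completed tasks count; errored tasks are free.
function managedMonthlyCostCents(completedTasks: number): number {
  if (!Number.isInteger(completedTasks) || completedTasks < 0) {
    throw new RangeError('completedTasks must be a non-negative integer');
  }
  const billable = Math.max(0, completedTasks - FREE_TASKS_PER_MONTH);
  return billable * PRICE_PER_TASK_CENTS;
}

const cents = managedMonthlyCostCents(120);
console.log(`$${(cents / 100).toFixed(2)}`); // $5.00
```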


Development

Prerequisites: Node.js 18+, Docker Desktop (must be running before make fresh).

First time (local setup)

git clone https://github.com/hanzili/hanzi-browse
cd hanzi-browse
make fresh

This runs the full setup: it installs dependencies, builds the server, dashboard, and extension, starts Postgres, runs migrations, and launches the dev server. Expect it to take about 90 seconds.

Run the project

make dev

Starts the backend services (Postgres + migrations + API server) and serves the dashboard UI.

Configuration

The defaults in .env.example are enough to run the server.

Optional services:

  • Google OAuth (dashboard sign-in) -- add GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET to .env
  • Stripe (credit purchases) -- add test keys to .env
  • Vertex AI (managed task execution) -- see .env.example for setup steps
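
Because each optional service is switched on by adding variables to .env, a quick check of which ones are configured can help when something fails to start. The sketch below is an illustration, not part of the repo: GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET come from the list above, while STRIPE_SECRET_KEY is an assumed name (see .env.example for the real variable names).

```typescript
type Env = Record<string, string | undefined>;

// Variables each optional service needs. The Stripe name is an assumption.
const OPTIONAL_SERVICES: Record<string, string[]> = {
  'Google OAuth': ['GOOGLE_CLIENT_ID', 'GOOGLE_CLIENT_SECRET'],
  Stripe: ['STRIPE_SECRET_KEY'],
};

// Return the services whose variables are all present and non-empty.
function enabledServices(env: Env): string[] {
  return Object.entries(OPTIONAL_SERVICES)
    .filter(([, vars]) => vars.every((name) => (env[name] ?? '').trim() !== ''))
    .map(([service]) => service);
}

console.log(enabledServices({ GOOGLE_CLIENT_ID: 'id', GOOGLE_CLIENT_SECRET: 'secret' }));
// [ 'Google OAuth' ]
```

In the server you would pass the loaded environment (for example `process.env`) instead of a literal object.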

Load the extension

Open chrome://extensions, enable Developer Mode, click "Load unpacked", select the dist/ folder.

Notes

  • Local vs CLI usage -- npx hanzi-browse setup is for packaged usage and may not work in a local clone
  • Port conflicts -- if you see EADDRINUSE on 3456, stop existing processes or run make stop

Commands

| Command | What it does |
| --- | --- |
| make fresh | Full first-time setup (deps + build + DB + start) |
| make dev | Start everything (DB + migrate + server) |
| make build | Rebuild server + dashboard + extension |
| make stop | Stop Postgres |
| make clean | Stop + delete database volume |
| make check-prereqs | Verify Node 18+ and Docker are available |
| make help | Show all commands |

Contributing

We welcome contributions! See CONTRIBUTING.md for setup instructions.

Good first contributions: new skills, landing pages, site-pattern files, platform testing, translations. Check the open issues.


Community

Discord · Documentation · Twitter


Privacy

Hanzi operates in different modes with different data handling. Read the privacy policy.

  • BYOM: No data sent to Hanzi servers. Screenshots go to your chosen AI provider only.
  • Managed / API: Task data processed on Hanzi servers via Google Vertex AI.

License

Polyform Noncommercial 1.0.0
