baremobile
Health Warn
- License — License: Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 8 GitHub stars
Code Fail
- exec() — Shell command execution in ios/aria-kba-old/ble-hid.test.js
- exec() — Shell command execution in ios/aria-kba-old/check-prerequisites.js
- exec() — Shell command execution in ios/aria-kba-old/integration.test.js
- process.env — Environment variable access in ios/aria-kba-old/integration.test.js
- process.env — Environment variable access in ios/aria-kba-old/ios-ble.js
- exec() — Shell command execution in ios/aria-kba-old/ios-live-test.js
- process.env — Environment variable access in ios/aria-kba-old/ios-live-test.js
Permissions Pass
- Permissions — No dangerous permissions requested
This tool provides an MCP server and CLI that allows AI agents to fully control real Android and iOS mobile devices. It translates on-screen interfaces into clean, token-friendly text and supports automated actions like tapping, typing, and launching apps.
Security Assessment
The overall risk is High. The tool inherently requires dangerous device permissions to function correctly, executing commands to control ADB, Termux, and WebDriverAgent. The automated scan flagged multiple instances of dynamic shell command execution, primarily located within iOS integration and testing files. While no hardcoded secrets were found, the codebase reads system environment variables. Because this tool is designed to give an AI agent complete, low-level control over a mobile device—including the ability to send SMS, make calls, and interact with system settings—it should be strictly limited to dedicated test devices and never used on a personal primary phone.
Quality Assessment
The project is very new and currently has low community visibility with only 8 GitHub stars. However, it is actively maintained, with repository activity as recent as today. The code is legally safe to use and build upon, as it is properly covered under the permissive Apache-2.0 license.
Verdict
Use with caution—while it is actively maintained and properly licensed, developers should restrict its use to isolated test devices due to the tool's deep system access and high inherent security risks.
Gives agents Android + iOS devices. Screen in, pruned snapshot out. Replaces Appium, Espresso, XCUITest. Zero deps, zero wasted tokens.
┌─────────────┐
│ ■ Settings │
│ ─────────── │
│ ◉ Wi-Fi │
│ ◉ Bluetooth │
│ ▸ Display │
└─────────────┘
baremobile
AI agents control your phone like you do -- same device, same apps, same screen.
Prunes the accessibility tree down to what matters. Clean YAML, zero wasted tokens.
What this is
baremobile gives AI agents full control of real mobile devices -- read the screen, tap, type, swipe, launch apps, send SMS, take photos. The screen comes back as a pruned accessibility snapshot with [ref=N] markers; the agent picks a ref and acts on it.
No Appium. No Java server. No Espresso. Zero runtime dependencies. Same patterns as barebrowse -- agents learn one API for both web and mobile.
Android -- full screen control via ADB, plus on-device APIs (SMS, calls, GPS, camera) via Termux. Use it for QA, as a personal AI assistant, or for remote device management.
iOS -- same snapshot() → tap(ref) pattern via WebDriverAgent. Shared prune pipeline, identical YAML output. No Mac, no Xcode. Designed for QA (USB required on Linux).
| Platform | Mode | Where it runs | What it does | Requires |
|---|---|---|---|---|
| Android | Host ADB | Your computer | Screen control -- snapshots, tap/type/swipe, screenshots, app lifecycle | adb + USB or WiFi |
| Android | Termux ADB | On the phone | Same screen control, no host machine | Termux + wireless debugging |
| Android | Termux:API | On the phone | Device APIs -- SMS, calls, GPS, camera, clipboard, contacts | Termux + Termux:API app |
| iOS | WDA | Your computer | Screen control -- snapshots, tap/type/scroll, screenshots | USB + WDA on device |
Host ADB is the default. Termux modes run on the device itself -- useful for a phone that acts as its own autonomous agent. Termux ADB and Termux:API combine for screen control plus device APIs, all from the phone.
Quick start
Prerequisites: Node.js >= 22. Android needs adb in PATH (platform-tools). iOS needs Python 3.12 for setup (runtime is pure HTTP).
npm install baremobile
Three flavors: CLI, MCP server, or library import. Pick one.
CLI
npx baremobile open # start daemon
npx baremobile launch com.android.settings
npx baremobile snapshot # -> .baremobile/screen-*.yml
npx baremobile tap 4 # tap ref 4
npx baremobile close # shut down
Full command set: open, close, status, snapshot, screenshot, tap, tap-xy, tap-grid, type, press, scroll, swipe, long-press, launch, intent, back, home, wait-text, wait-state, grid, logcat.
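The same session can be scripted from Node. The sketch below uses only subcommands listed above (open, launch, snapshot, tap, close). The helper names `cliSteps` and `runSteps` are hypothetical, and the runner takes the process-spawning function as a parameter so the sequence can be dry-run without a device.

```javascript
// Hypothetical helper: builds the npx argument list for each CLI step.
function cliSteps(appId, ref) {
  return [
    ['baremobile', 'open'],
    ['baremobile', 'launch', appId],
    ['baremobile', 'snapshot'],
    ['baremobile', 'tap', String(ref)],
    ['baremobile', 'close'],
  ];
}

// Passes each step to `run(command, args, options)` in order.
// If `run` throws (as execFileSync does on a non-zero exit), later steps are skipped.
function runSteps(steps, run) {
  const outputs = [];
  for (const args of steps) {
    outputs.push(String(run('npx', args, { encoding: 'utf8' })));
  }
  return outputs;
}

// Dry run: format the commands instead of executing them.
const printed = runSteps(
  cliSteps('com.android.settings', 4),
  (command, args) => `${command} ${args.join(' ')}`,
);
console.log(printed.join('\n'));
```

To execute for real, pass `require('node:child_process').execFileSync` as `run`. In practice the ref would be chosen after reading the snapshot file rather than hard-coded.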
MCP server
Claude Code:
claude mcp add baremobile -- npx baremobile mcp
Claude Desktop / Cursor -- add to config (claude_desktop_config.json, .cursor/mcp.json):
{
  "mcpServers": {
    "baremobile": {
      "command": "npx",
      "args": ["baremobile", "mcp"]
    }
  }
}
11 tools: snapshot, tap, type, press, scroll, swipe, long_press, launch, screenshot, back, find_by_text.
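If the config file already lists other MCP servers, the baremobile entry needs to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config is plain JSON with a top-level `mcpServers` object as in the snippet above; `addBaremobile` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns a copy of an MCP client config with the
// baremobile entry added, leaving any other configured servers intact.
function addBaremobile(config) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      baremobile: { command: 'npx', args: ['baremobile', 'mcp'] },
    },
  };
}

// Example input: a config that already has one (made-up) server entry.
const existing = { mcpServers: { other: { command: 'node', args: ['other.js'] } } };
console.log(JSON.stringify(addBaremobile(existing), null, 2));
```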
Library
import { connect } from 'baremobile';
const page = await connect(); // auto-detect device
const snapshot = await page.snapshot(); // pruned YAML with [ref=N] markers
await page.tap(5); // tap element
await page.type(3, 'hello'); // type into field
await page.scroll(1, 'down'); // scroll
await page.launch('com.android.chrome'); // open app
await page.back(); // navigate back
Works with any LLM orchestration library. Ships with an adapter for bareagent.
Full API, snapshot format, interaction patterns, and gotchas: baremobile.context.md.
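An agent step usually combines these calls: read the snapshot, find the ref that owns a label, tap it. The sketch below uses only `snapshot()` and `tap()` from the example above. It assumes `snapshot()` resolves to a string and that nesting is expressed by indentation; `refForText` and `tapByText` are hypothetical helpers, not part of the library.

```javascript
// Finds the ref that owns a label: the line itself or its nearest
// enclosing line (shallower indentation) that carries a [ref=N] marker.
function refForText(snapshot, label) {
  const stack = []; // one { indent, ref } entry per currently open ancestor line
  for (const line of snapshot.split('\n')) {
    if (!line.trim()) continue;
    const indent = line.length - line.trimStart().length;
    while (stack.length && stack[stack.length - 1].indent >= indent) stack.pop();
    const match = line.match(/\[ref=(\d+)\]/);
    stack.push({ indent, ref: match ? Number(match[1]) : null });
    if (line.includes(`"${label}"`)) {
      for (let i = stack.length - 1; i >= 0; i--) {
        if (stack[i].ref !== null) return stack[i].ref;
      }
      return null;
    }
  }
  return null;
}

// One observe-act step: snapshot, resolve the label to a ref, tap it.
async function tapByText(page, label) {
  const ref = refForText(await page.snapshot(), label);
  if (ref === null) throw new Error(`no tappable element for "${label}"`);
  await page.tap(ref);
  return ref;
}
```

With a connected device this would be called as `await tapByText(page, 'Network & internet')`, where `page` comes from `connect()`.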
What the agent sees
- ScrollView [ref=1]
  - Group
    - Text "Settings"
    - Group [ref=2]
      - Text "Search settings"
  - ScrollView [ref=3]
    - List
      - Group [ref=4]
        - Text "Network & internet"
        - Text "Mobile, Wi-Fi, hotspot"
      - Group [ref=5]
        - Text "Connected devices"
        - Text "Bluetooth, pairing"
Compact and token-efficient. Interactive elements get [ref=N] markers: the agent reads the snapshot, picks a ref, and acts on it. Bloated accessibility trees go through a 4-step pruning pass, and 200+ widget classes are mapped to semantic roles. Text input quirks, multi-device setups, element state tracking, and vision fallback are handled automatically.
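The pruning idea can be illustrated on a toy tree. This is not baremobile's pipeline (the four steps are not spelled out here); it is one plausible shape under stated assumptions: a hypothetical node type `{ cls, text, clickable, children }`, a three-entry role map standing in for the 200+ classes, and refs handed out to clickable nodes in document order.

```javascript
// Illustrative role map; the real table is said to cover 200+ widget classes.
const ROLES = {
  'android.widget.TextView': 'Text',
  'android.widget.ScrollView': 'ScrollView',
  'android.widget.Button': 'Button',
};

// Returns the pruned replacement(s) for a node:
// - wrappers (no known role, not clickable) are replaced by their children,
// - empty non-interactive leaves are dropped,
// - everything else is kept under its semantic role ('Group' if unmapped).
function prune(node) {
  const children = (node.children ?? []).flatMap(prune);
  const role = ROLES[node.cls];
  if (!role && !node.clickable) return children;
  if (!node.text && !node.clickable && children.length === 0) return [];
  return [{ role: role ?? 'Group', text: node.text, clickable: Boolean(node.clickable), children }];
}

// Renders pruned nodes as an indented list, numbering clickable nodes in order.
function render(nodes, depth = 0, counter = { next: 1 }) {
  return nodes.flatMap((n) => {
    const ref = n.clickable ? ` [ref=${counter.next++}]` : '';
    const text = n.text ? ` "${n.text}"` : '';
    const line = `${'  '.repeat(depth)}- ${n.role}${text}${ref}`;
    return [line, ...render(n.children, depth + 1, counter)];
  });
}

// Made-up raw tree: layout wrappers, an empty label, and one clickable row.
const raw = {
  cls: 'android.widget.FrameLayout',
  children: [{
    cls: 'android.widget.ScrollView',
    children: [
      {
        cls: 'android.widget.LinearLayout',
        clickable: true,
        children: [
          { cls: 'android.widget.TextView', text: 'Network & internet' },
          { cls: 'android.widget.TextView', text: '' },
        ],
      },
      { cls: 'android.view.View' },
    ],
  }],
};
console.log(render(prune(raw)).join('\n'));
```

Seven raw nodes collapse to three output lines here, and only the clickable row receives a ref.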
Device setup
The interactive wizard handles everything -- adb install, SDK setup, device connection:
npx baremobile setup # Android: emulator, USB, WiFi, or Termux
Manual setup (USB):
- Enable Developer Options -- Settings > About phone > tap "Build number" 7 times
- Enable USB debugging -- Settings > System > Developer options > toggle on
- Connect device via USB, tap "Allow" on the prompt
- Verify -- adb devices should show your device
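The verify step can be checked programmatically by parsing what `adb devices` prints. A minimal sketch: the function names are hypothetical and the sample serials are made up. A device listed as "unauthorized" usually means the "Allow" prompt from the previous step has not been accepted yet.

```javascript
// Parses the text printed by `adb devices` into { serial, state } entries.
// States reported by adb include "device", "offline" and "unauthorized".
function parseAdbDevices(output) {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    // Skip blanks, the header line, and "* daemon ..." startup notices.
    .filter((line) => line && !line.startsWith('*') && !line.startsWith('List of devices'))
    .map((line) => line.split(/\s+/))
    .filter((parts) => parts.length >= 2)
    .map(([serial, state]) => ({ serial, state }));
}

// A device is usable once its state is exactly "device".
function readyDevices(output) {
  return parseAdbDevices(output)
    .filter((d) => d.state === 'device')
    .map((d) => d.serial);
}

// Sample output with made-up serials.
const sample = 'List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n\n';
console.log(readyDevices(sample));
```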
Android 10+ required (2019 or newer). For WiFi, Termux, emulator, and iOS setup details, see docs/customer-guide.md.
WiFi auto-reconnect: After WiFi setup, the device IP is saved. If the connection drops (DHCP reassignment, ADB restart), connect() automatically reconnects -- no manual re-setup needed.
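The reconnect logic lives inside connect() and is not shown here. As an illustration only, a retry wrapper of the following shape would give similar behaviour for any single action. `withReconnect` and its `reconnect` callback are hypothetical; a real callback might run `adb connect <saved-ip>:<port>`.

```javascript
// Runs `action`; on failure calls `reconnect` and retries,
// up to `attempts` tries in total, then rethrows the last error.
async function withReconnect(action, reconnect, attempts = 3) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) await reconnect();
    }
  }
  throw lastError;
}

// Demo with a fake action that fails once, then succeeds.
let calls = 0;
withReconnect(
  async () => {
    if (++calls < 2) throw new Error('device offline');
    return 'ok';
  },
  async () => console.log('reconnecting...'),
).then((result) => console.log(result, 'after', calls, 'tries'));
```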
Tested against
Settings, Messages, Chrome, Gmail, Files, Camera, Calculator, Contacts, Play Store, YouTube -- on physical devices and emulators across API 33-35.
The bare ecosystem
Four vanilla JS modules. Zero deps where possible (bareguard has one). Same API patterns.
| | bareagent | barebrowse | baremobile | bareguard |
|---|---|---|---|---|
| Does | Gives agents a think→act loop | Gives agents a real browser | Gives agents a mobile device | Gates everything an agent does |
| How | Goal in → coordinated actions out | URL in → pruned snapshot out | Screen in → pruned snapshot out | Action in → allow / deny / human-asked out |
| Replaces | LangChain, CrewAI, AutoGen | Playwright, Selenium, Puppeteer | Appium, Espresso, UIAutomator2 | Hand-rolled allowlists, scattered policy code |
| Interfaces | Library · CLI · subprocess | Library · CLI · MCP | Library · CLI · MCP | Library |
| Solo or together | Orchestrates the others as tools | Works standalone | Works standalone | Embedded in bareagent's loop; usable by any runner |
Reach 50+ messengers with one Docker container via beeperbox, a headless Beeper Desktop that exposes WhatsApp, iMessage, Signal, Telegram, Slack, Discord, RCS, SMS and more as a single MCP server. Wire it through bareagent's MCP bridge; bareguard applies policy to the invocations like any other tool (per-chat allowlists, ask patterns on destructive sends, all the usual layered defense).
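To make "per-chat allowlists" and "ask patterns" concrete, here is a toy gate in the "action in, allow / deny / ask out" shape. It is not bareguard's API: the `policy` object, its field names, and `decide` are all hypothetical.

```javascript
// Toy policy: chats the agent may message, and text patterns that need a human's OK.
const policy = {
  allowedChats: new Set(['family', 'team-qa']),
  askPatterns: [/delete/i, /wire transfer/i],
};

// Action in -> 'allow' | 'deny' | 'ask' out.
// Unknown chats are denied first; risky text in an allowed chat is escalated.
function decide(action, { allowedChats, askPatterns }) {
  if (!allowedChats.has(action.chat)) return 'deny';
  if (askPatterns.some((pattern) => pattern.test(action.text))) return 'ask';
  return 'allow';
}

console.log(decide({ chat: 'family', text: 'Running late' }, policy));
console.log(decide({ chat: 'family', text: 'Please delete the album' }, policy));
console.log(decide({ chat: 'stranger', text: 'Hi' }, policy));
```

Checking the allowlist before the ask patterns means a human is never prompted about a chat the agent could not reach anyway.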
What you can build:
- Headless automation — scrape sites, fill forms, extract data, monitor pages on a schedule
- QA & testing — automated test suites for web and Android apps without heavyweight frameworks
- Personal AI assistants — chatbots that browse the web or control your phone on your behalf
- Remote device control — manage Android devices over WiFi, including on-device via Termux
- Agentic workflows — multi-step tasks where an AI plans, browses, and acts across web and mobile
Why this exists: Most automation stacks ship 200MB of opinions before you write a line of code. These don't. Install, import, go.
License
Apache-2.0 — see LICENSE.