baremobile

mcp
Guvenlik Denetimi
Basarisiz
Health Uyari
  • License — License: Apache-2.0
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 8 GitHub stars
Code Basarisiz
  • exec() — Shell command execution in ios/aria-kba-old/ble-hid.test.js
  • exec() — Shell command execution in ios/aria-kba-old/check-prerequisites.js
  • exec() — Shell command execution in ios/aria-kba-old/integration.test.js
  • process.env — Environment variable access in ios/aria-kba-old/integration.test.js
  • process.env — Environment variable access in ios/aria-kba-old/ios-ble.js
  • exec() — Shell command execution in ios/aria-kba-old/ios-live-test.js
  • process.env — Environment variable access in ios/aria-kba-old/ios-live-test.js
Permissions Gecti
  • Permissions — No dangerous permissions requested
Purpose
This tool provides an MCP server and CLI that allows AI agents to fully control real Android and iOS mobile devices. It translates on-screen interfaces into clean, token-friendly text and supports automated actions like tapping, typing, and launching apps.

Security Assessment
The overall risk is High. The tool inherently requires dangerous device permissions to function correctly, executing commands to control ADB, Termux, and WebDriverAgent. The automated scan flagged multiple instances of dynamic shell command execution, primarily located within iOS integration and testing files. While no hardcoded secrets were found, the codebase reads system environment variables. Because this tool is designed to give an AI agent complete, low-level control over a mobile device—including the ability to send SMS, make calls, and interact with system settings—it should be strictly limited to dedicated test devices and never used on a personal primary phone.

Quality Assessment
The project is very new and currently has low community visibility with only 8 GitHub stars. However, it is actively maintained, with repository activity as recent as today. The code is legally safe to use and build upon, as it is properly covered under the permissive Apache-2.0 license.

Verdict
Use with caution—while it is actively maintained and properly licensed, developers should restrict its use to isolated test devices due to the tool's deep system access and high inherent security risks.
SUMMARY

Gives agents Android + iOS devices. Screen in, pruned snapshot out. Replaces Appium, Espresso, XCUITest. Zero deps, zero wasted tokens.

README.md
  ┌─────────────┐
  │ ■  Settings │
  │ ─────────── │
  │ ◉ Wi-Fi     │
  │ ◉ Bluetooth │
  │ ▸ Display   │
  └─────────────┘

  baremobile

AI agents control your phone like you do -- same device, same apps, same screen.
Prunes the accessibility tree down to what matters. Clean YAML, zero wasted tokens.


What this is

baremobile gives AI agents full control of real mobile devices -- read the screen, tap, type, swipe, launch apps, send SMS, take photos. The screen comes back as a pruned accessibility snapshot with [ref=N] markers; the agent picks a ref and acts on it.

No Appium. No Java server. No Espresso. Zero runtime dependencies. Same patterns as barebrowse -- agents learn one API for both web and mobile.

Android -- full screen control via ADB, plus on-device APIs (SMS, calls, GPS, camera) via Termux. Use it for QA, as a personal AI assistant, or for remote device management.

iOS -- same snapshot()tap(ref) pattern via WebDriverAgent. Shared prune pipeline, identical YAML output. No Mac, no Xcode. Designed for QA (USB required on Linux).

Platform Mode Where it runs What it does Requires
Android Host ADB Your computer Screen control -- snapshots, tap/type/swipe, screenshots, app lifecycle adb + USB or WiFi
Android Termux ADB On the phone Same screen control, no host machine Termux + wireless debugging
Android Termux:API On the phone Device APIs -- SMS, calls, GPS, camera, clipboard, contacts Termux + Termux:API app
iOS WDA Your computer Screen control -- snapshots, tap/type/scroll, screenshots USB + WDA on device

Host ADB is the default. Termux modes run on the device itself -- useful for a phone that acts as its own autonomous agent. Termux ADB and Termux:API combine for screen control plus device APIs, all from the phone.

Quick start

Prerequisites: Node.js >= 22. Android needs adb in PATH (platform-tools). iOS needs Python 3.12 for setup (runtime is pure HTTP).

npm install baremobile

Three flavors: CLI, MCP server, or library import. Pick one.

CLI

npx baremobile open                       # start daemon
npx baremobile launch com.android.settings
npx baremobile snapshot                   # -> .baremobile/screen-*.yml
npx baremobile tap 4                      # tap ref 4
npx baremobile close                      # shut down

Full command set: open, close, status, snapshot, screenshot, tap, tap-xy, tap-grid, type, press, scroll, swipe, long-press, launch, intent, back, home, wait-text, wait-state, grid, logcat.

MCP server

Claude Code:

claude mcp add baremobile -- npx baremobile mcp

Claude Desktop / Cursor -- add to config (claude_desktop_config.json, .cursor/mcp.json):

{
  "mcpServers": {
    "baremobile": {
      "command": "npx",
      "args": ["baremobile", "mcp"]
    }
  }
}

11 tools: snapshot, tap, type, press, scroll, swipe, long_press, launch, screenshot, back, find_by_text.

Library

import { connect } from 'baremobile';

const page = await connect();              // auto-detect device
const snapshot = await page.snapshot();     // pruned YAML with [ref=N] markers

await page.tap(5);                         // tap element
await page.type(3, 'hello');               // type into field
await page.scroll(1, 'down');              // scroll
await page.launch('com.android.chrome');   // open app
await page.back();                         // navigate back

Works with any LLM orchestration library. Ships with an adapter for bareagent.

Full API, snapshot format, interaction patterns, and gotchas: baremobile.context.md.

What the agent sees

- ScrollView [ref=1]
  - Group
    - Text "Settings"
    - Group [ref=2]
      - Text "Search settings"
  - ScrollView [ref=3]
    - List
      - Group [ref=4]
        - Text "Network & internet"
        - Text "Mobile, Wi-Fi, hotspot"
      - Group [ref=5]
        - Text "Connected devices"
        - Text "Bluetooth, pairing"

Compact, token-efficient. Interactive elements get [ref=N] markers. The agent reads the snapshot, picks a ref, acts on it. Bloated accessibility trees get a 4-step pruning pass, 200+ widget classes mapped to semantic roles. Text input quirks, multi-device setups, element state tracking, and vision fallback are handled automatically.

Device setup

The interactive wizard handles everything -- adb install, SDK setup, device connection:

npx baremobile setup   # Android: emulator, USB, WiFi, or Termux

Manual setup (USB):

  1. Enable Developer Options -- Settings > About phone > tap "Build number" 7 times
  2. Enable USB debugging -- Settings > System > Developer options > toggle on
  3. Connect device via USB, tap "Allow" on the prompt
  4. Verify -- adb devices should show your device

Android 10+ required (2019 or newer). For WiFi, Termux, emulator, and iOS setup details, see docs/customer-guide.md.

WiFi auto-reconnect: After WiFi setup, the device IP is saved. If the connection drops (DHCP reassignment, ADB restart), connect() automatically reconnects -- no manual re-setup needed.

Tested against

Settings, Messages, Chrome, Gmail, Files, Camera, Calculator, Contacts, Play Store, YouTube -- on physical devices and emulators across API 33-35.

The bare ecosystem

Four vanilla JS modules. Zero deps where possible (bareguard has one). Same API patterns.

bareagent barebrowse baremobile bareguard
Does Gives agents a think→act loop Gives agents a real browser Gives agents a mobile device Gates everything an agent does
How Goal in → coordinated actions out URL in → pruned snapshot out Screen in → pruned snapshot out Action in → allow / deny / human-asked out
Replaces LangChain, CrewAI, AutoGen Playwright, Selenium, Puppeteer Appium, Espresso, UIAutomator2 Hand-rolled allowlists, scattered policy code
Interfaces Library · CLI · subprocess Library · CLI · MCP Library · CLI · MCP Library
Solo or together Orchestrates the others as tools Works standalone Works standalone Embedded in bareagent's loop; usable by any runner

Reach 50+ messengers with one Docker container via beeperbox — a headless Beeper Desktop that exposes WhatsApp, iMessage, Signal, Telegram, Slack, Discord, RCS, SMS and more as a single MCP server. Wire it through bareagent's MCP bridge; bareguard policies the invocations like any other tool (per-chat allowlists, ask patterns on destructive sends, all the usual layered defense).

What you can build:

  • Headless automation — scrape sites, fill forms, extract data, monitor pages on a schedule
  • QA & testing — automated test suites for web and Android apps without heavyweight frameworks
  • Personal AI assistants — chatbots that browse the web or control your phone on your behalf
  • Remote device control — manage Android devices over WiFi, including on-device via Termux
  • Agentic workflows — multi-step tasks where an AI plans, browses, and acts across web and mobile

Why this exists: Most automation stacks ship 200MB of opinions before you write a line of code. These don't. Install, import, go.

License

Apache-2.0 — see LICENSE.

Yorumlar (0)

Sonuc bulunamadi