🍎 macOS-MCP

macOS-MCP is a lightweight, open-source project that enables seamless integration between AI agents and the macOS operating system. Acting as an MCP server, it bridges the gap between LLMs and macOS, allowing agents to perform tasks such as file navigation, application control, UI interaction, browser automation, and more.

Supported Operating Systems

macOS 12 (Monterey)
macOS 13 (Ventura)
macOS 14 (Sonoma)
macOS 15 (Sequoia)

Key Features

Seamless macOS Integration
Interacts natively with macOS UI elements through the Accessibility API: opens apps, controls windows, and simulates user input.
Use Any LLM (Vision Optional)
Unlike many automation tools, macOS-MCP doesn't rely on traditional computer vision techniques or specific fine-tuned models. It works with any LLM, reducing complexity and setup time.
Rich Toolset for UI Automation
Includes tools for keyboard and mouse operations, capturing window/UI state, and extracting interactive elements from the accessibility tree.
Lightweight and Open-Source
Minimal dependencies and easy setup, with full source code available under the MIT license.
Customizable and Extendable
Easily adapt or extend tools to suit your unique automation or AI integration needs.
Support for Launchpad and System UI
Automatically detects when Launchpad is open and adjusts scanning behavior accordingly. Scans Control Center, Spotlight, and menu bar elements.

Installation

Prerequisites

Python 3.11+
uv (package manager from Astral): install with pip install uv or curl -LsSf https://astral.sh/uv/install.sh | sh
Accessibility permissions granted to the terminal or application running the MCP server

Quick Start

Run the server directly with:

uvx macos-mcp

Grant Accessibility Permissions

macOS-MCP requires Accessibility permissions to interact with UI elements:

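Open System Settings > Privacy & Security > Accessibility (on macOS 12: System Preferences > Security & Privacy > Privacy > Accessibility)
Authenticate when prompted (on macOS 12, click the lock icon first)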
Add your terminal application (Terminal, iTerm2, VS Code, etc.)
Restart the terminal after granting permissions
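
To confirm that the permission took effect, the check below can be run from the same terminal. It is a minimal sketch, assuming PyObjC's ApplicationServices bindings are installed (the project lists PyObjC as a dependency); the helper name accessibility_trusted is hypothetical and not part of macOS-MCP.

```python
def accessibility_trusted():
    """Return True or False for the current process, or None if PyObjC is unavailable."""
    try:
        # macOS only; provided by the pyobjc-framework-ApplicationServices package.
        from ApplicationServices import AXIsProcessTrusted
    except ImportError:
        return None
    return bool(AXIsProcessTrusted())


if __name__ == "__main__":
    status = accessibility_trusted()
    if status is None:
        print("PyObjC (ApplicationServices) is not importable here.")
    elif status:
        print("Accessibility access is granted to this process.")
    else:
        print("Accessibility access is NOT granted; add this app under Accessibility.")
```

The result applies to the process that runs the script, so run it from the terminal or application that will launch the MCP server.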

Install in Claude Desktop

Install Claude Desktop
Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "macos-mcp": {
      "command": "uvx",
      "args": ["macos-mcp"]
    }
  }
}

Restart Claude Desktop

Install in Gemini CLI

Install Gemini CLI:

npm install -g @google/gemini-cli

Navigate to ~/.gemini and open settings.json
Add the macos-mcp config:

{
  "theme": "Default",
  "mcpServers": {
    "macos-mcp": {
      "command": "uvx",
      "args": ["macos-mcp"]
    }
  }
}

Restart Gemini CLI
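
Both client configurations above add the same mcpServers entry, so the edit can be scripted instead of done by hand. The following is a minimal sketch, not part of macOS-MCP: the function name add_macos_mcp is hypothetical, and the Claude Desktop path in the usage comment is an assumed default location that may differ on your machine.

```python
import json
from pathlib import Path


def add_macos_mcp(config_path: Path) -> dict:
    """Merge the macos-mcp server entry into a JSON config file, keeping other keys."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create "mcpServers" if missing, then add or overwrite only the macos-mcp entry.
    servers = config.setdefault("mcpServers", {})
    servers["macos-mcp"] = {"command": "uvx", "args": ["macos-mcp"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Usage (uncomment the line for your client):
# add_macos_mcp(Path.home() / ".gemini" / "settings.json")
# Assumed default location for Claude Desktop on macOS:
# add_macos_mcp(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")
```

Existing keys such as "theme" and any other configured servers are left in place; restart the client afterwards as described above.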

MCP Tools

An MCP client can access the following tools to interact with macOS:

Tool	Description
`Click`	Click on the screen at the given coordinates. Supports left, right, and double-click.
`Type`	Type text at the current cursor position. Optionally clears existing text first.
`Scroll`	Scroll vertically or horizontally on the focused window or specific regions.
`Move`	Move the mouse pointer to the given coordinates, or drag to them when `drag=True`.
`Shortcut`	Press keyboard shortcuts (Cmd+C, Cmd+Tab, etc.).
`Wait`	Pause execution for a defined duration.
`Snapshot`	Capture desktop state including active window, open applications, interactive elements with coordinates, and scrollable areas. Set `use_vision=True` to include annotated screenshots.
`App`	Launch an application, resize/move windows, or switch between apps. Supports bundle IDs and app names.
`Shell`	Execute shell commands or AppleScript. Use `mode='osascript'` for AppleScript execution.
`Scrape`	Extract and convert webpage content to Markdown format.
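
For orientation, MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such request payloads and does not talk to a server. The tool names and the `use_vision` and `mode` arguments come from the table; the `command` argument name and the helper tool_call are assumptions for illustration, so check the server's tool schema for the real parameter names.

```python
import itertools
import json

_ids = itertools.count(1)  # simple incrementing request ids


def tool_call(name: str, **arguments) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request body for the named tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# `use_vision` is documented for Snapshot.
print(tool_call("Snapshot", use_vision=True))
# `mode` is documented for Shell; `command` is a hypothetical argument name.
print(tool_call("Shell", mode="osascript", command='display notification "hi"'))
```

In practice an MCP client library handles this framing; the payload is shown only to make clear what a tool invocation carries.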

Architecture

macos-mcp/
├── src/
│   └── macos_mcp/
│       ├── __init__.py
│       ├── __main__.py          # MCP server entry point and tool definitions
│       ├── desktop/
│       │   ├── __init__.py
│       │   ├── service.py       # Desktop automation service
│       │   ├── views.py         # Data classes for desktop state
│       │   └── config.py        # Configuration constants
│       └── tree/
│           ├── __init__.py
│           ├── service.py       # Accessibility tree traversal
│           ├── views.py         # Data classes for tree elements
│           └── config.py        # Interactive roles and actions
├── pyproject.toml
└── README.md

How It Works

Accessibility Tree Traversal: Uses the macOS Accessibility API (ApplicationServices) to traverse UI elements and extract interactive components.
Parallel Scanning: Scans multiple sources concurrently:
- Focused application window
- Dock
- Menu bar
- Control Center
- SystemUIServer
- Spotlight
- Desktop icons (when visible)
Smart Context Awareness:
- Detects Launchpad state and adjusts scanning
- Only shows desktop icons when no window is focused
- Filters out background services to show only user-facing apps
Screenshot Annotations: When use_vision=True, generates screenshots with numbered bounding boxes on interactive elements for visual reference.
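
The traversal and parallel scanning steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: plain dictionaries stand in for Accessibility API elements, the role set is an assumed subset (the project keeps its own list in tree/config.py), and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed subset of interactive accessibility roles, for illustration only.
INTERACTIVE_ROLES = {"AXButton", "AXTextField", "AXCheckBox", "AXMenuItem", "AXLink", "AXDockItem"}


def collect_interactive(node, found=None):
    """Depth-first walk returning (role, title, position) for interactive nodes."""
    if found is None:
        found = []
    if node.get("role") in INTERACTIVE_ROLES:
        found.append((node["role"], node.get("title", ""), node.get("position")))
    for child in node.get("children", []):
        collect_interactive(child, found)
    return found


def scan_sources(sources):
    """Scan each named source tree concurrently; return {name: interactive elements}."""
    with ThreadPoolExecutor(max_workers=len(sources) or 1) as pool:
        futures = {name: pool.submit(collect_interactive, tree) for name, tree in sources.items()}
        return {name: future.result() for name, future in futures.items()}


# Mock trees standing in for the focused window and the Dock.
sources = {
    "window": {"role": "AXWindow", "children": [
        {"role": "AXButton", "title": "Save", "position": (120, 40)},
        {"role": "AXStaticText", "title": "Untitled"},
    ]},
    "dock": {"role": "AXList", "children": [
        {"role": "AXDockItem", "title": "Finder", "position": (30, 880)},
    ]},
}
print(scan_sources(sources))
```

Each source gets its own worker, so a slow source (for example an application with a very deep tree) does not hold up the others; non-interactive nodes such as static text are walked but not reported.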

Limitations

Requires Accessibility permissions to be granted manually
Some applications may have limited accessibility support
Performance may vary based on the complexity of the UI and number of open applications

Security

Important: macOS-MCP operates with accessibility access and can perform system-level operations. Please review the following before deployment:

Grant Accessibility permissions only to trusted applications
Be cautious when using the Shell tool, especially with elevated commands
Review and understand the actions AI agents perform on your behalf

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

macOS-MCP makes use of several excellent open-source projects and macOS frameworks:

PyObjC - Python to Objective-C bridge
Pillow - Python Imaging Library
macOS Accessibility API (ApplicationServices)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Citation

@software{macos_mcp,
  author       = {Jeomon},
  title        = {macOS-MCP: Lightweight open-source project for integrating LLM agents with macOS},
  year         = {2025},
  publisher    = {GitHub},
  url          = {https://github.com/Jeomon/macos-mcp}
}