winscript-mcp
Health Warn
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Fail
- rm -rf — Recursive force deletion command in build-extension.sh
Permissions Pass
- Permissions — No dangerous permissions requested
This tool is an MCP server that provides AI agents with system-level desktop control on Windows 10/11. It wraps multiple native Windows automation APIs—like UI Automation, COM, and Win32—into a single interface, acting as a state-aware alternative to AppleScript.
Security Assessment
Overall risk: High. This server is explicitly designed to give AI agents deep, system-level control over a Windows desktop. While the automated code scan did not find dangerous hardcoded permissions, the core function of the software requires the execution of sensitive actions, shell interactions, and UI manipulation. Additionally, a recursive force deletion command (`rm -rf`) was flagged in the `build-extension.sh` script. Developers must be aware that connecting an AI agent to this server grants it extensive control over the host machine.
Quality Assessment
The project is actively maintained, with its most recent push happening today. It is distributed under the standard, permissive MIT license. However, community trust and visibility are currently very low. The repository has only 5 GitHub stars, indicating that the codebase has not been widely reviewed or battle-tested by a large audience.
Verdict
Use with caution: While actively maintained and licensed, granting any AI agent system-level desktop control via a low-visibility, unaudited project carries significant inherent risks and should be restricted to isolated testing environments.
A Windows-native automation API, packaged as an MCP server, that gives AI agents the same system-level desktop control that AppleScript gives on macOS.
██╗ ██╗██╗███╗ ██╗███████╗ ██████╗██████╗ ██╗██████╗ ████████╗
██║ ██║██║████╗ ██║██╔════╝██╔════╝██╔══██╗██║██╔══██╗╚══██╔══╝
██║ █╗ ██║██║██╔██╗ ██║███████╗██║ ██████╔╝██║██████╔╝ ██║
██║███╗██║██║██║╚██╗██║╚════██║██║ ██╔══██╗██║██╔═══╝ ██║
╚███╔███╔╝██║██║ ╚████║███████║╚██████╗██║ ██║██║██║ ██║
╚══╝╚══╝ ╚═╝╚═╝ ╚═══╝╚══════╝ ╚═════╝╚═╝ ╚═╝╚═╝╚═╝ ╚═╝
AppleScript for Windows. Built for AI agents.
Windows 10/11 · Python 3.10+ · MCP Protocol
macOS has AppleScript.
Windows had nothing clean for AI agents.
Until now.
WinScript is a state-aware, replayable, audited Windows automation server for AI agents. It wraps 4 fragmented Windows automation primitives — UI Automation, COM, Win32, and OCR — into a single MCP server that any agent can call.
Not a wrapper. Not a toy. Infrastructure.
Quick Start — Get WinScript Running in Claude Desktop
Option 1: Claude Desktop Extension (Easiest — Coming Soon)
Once approved in Claude's Extensions directory:
- Open Claude Desktop
- Go to Settings → Extensions
- Search for "WinScript"
- Click Install
- 59 tools appear — done!
Until then: Use Option 2 or 3 below.
Option 2: One-Click Installer
Step 1: Clone this repo:
git clone https://github.com/RavaniRoshan/winscript-mcp.git
cd winscript-mcp
Then double-click install.bat (or run python install.py)
Step 2: Restart Claude Desktop
Step 3: WinScript appears in Claude's Extensions panel with 59 tools.
Option 3: PyPI (One Command)
pip install winscript
winscript
Then configure Claude Desktop manually (see below).
Option 4: Docker (Isolated)
docker run -v %USERPROFILE%/.winscript:~/.winscript ghcr.io/roshandamm/winscript-mcp:latest
Option 5: Direct from Source
git clone https://github.com/RavaniRoshan/winscript-mcp.git
cd winscript-mcp
pip install -r requirements.txt
python winscript-server.py
All options start an MCP server. The Claude Desktop Extension (Option 1) will be the easiest once approved.
The difference
Every other Windows automation tool gives you actions.
WinScript gives you actions + state.
# What others give you:
click("Submit")
→ "Clicked Submit"
# What WinScript gives you:
click("Submit")
→ "Clicked 'Submit' via uia_name [confidence 1.0] |
Active window: 'Form' → 'Confirmation' |
New windows: ['Success Dialog'] |
Duration: 312ms"
You don't just know what you did. You know what changed.
Detailed Installation
Option 1: Install from PyPI
pip install winscript
Then run: winscript or python -m winscript.server
Option 2: Run with Docker
# Pull and run
docker run -d --name winscript \
-v %USERPROFILE%/.winscript:~/.winscript \
ghcr.io/roshandamm/winscript-mcp:latest
# Or build locally
docker build -t winscript:latest .
docker run -d --name winscript -v %USERPROFILE%/.winscript:~/.winscript winscript:latest
Option 3: Run from Source (No Install)
git clone https://github.com/roshandamm/winscript-mcp.git
cd winscript-mcp
pip install -r requirements.txt
python winscript-server.py
Optional: OCR Fallback (Layer 4)
For better element detection in broken UI trees:
# Install Tesseract: https://github.com/tesseract-ocr/tesseract
pip install pytesseract
How WinScript Appears in Claude Desktop
After running the installer and restarting Claude Desktop, WinScript appears in Claude's Extensions panel just like Desktop Commander:
┌─────────────────────────────────────────────┐
│ WinScript │
│ AppleScript for Windows. Built for AI │
│ agents. Control any Windows app from Claude │
│ Enabled │
│ │
│ Developed by Roshan Ravani │
│ │
│ Tools 59 │
│ open_app │
│ close_app │
│ click │
│ type_text │
│ excel_read_cell │
│ outlook_send_email │
│ take_screenshot │
│ +53 more │
│ │
│ Requirements │
│ All requirements met │
│ │
│ Details │
│ Version 0.1.0 │
│ License MIT │
│ Author Roshan Ravani │
└─────────────────────────────────────────────┘
Claude can now:
- Open and control any Windows app
- Click buttons and type in UIs
- Read/write Excel files via COM
- Send Outlook emails
- Take screenshots
- Manage files and folders
- Record and replay workflows
- And 50+ more actions
All through natural language — no human interaction needed.
Wire into Claude Desktop
The easy way: Run install.bat — it configures everything for you.
The manual way: Edit %APPDATA%\Claude\claude_desktop_config.json:
{
"mcpServers": {
"winscript": {
"command": "python",
"args": ["-m", "winscript.server"]
}
}
}
Restart Claude Desktop. 59 tools appear automatically.
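If you would rather script the manual edit, the following is a minimal sketch of merging the entry above into an existing config without dropping other servers. The helper name `add_winscript_server` is hypothetical and is not part of WinScript; only the JSON shape comes from the snippet above.

```python
import json
from pathlib import Path


def add_winscript_server(config_path: Path) -> dict:
    """Merge the WinScript entry into a Claude Desktop config, keeping other servers."""
    config = {}
    if config_path.exists():
        # An empty file is treated as an empty config.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["winscript"] = {"command": "python", "args": ["-m", "winscript.server"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

On Windows you would call it with `Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"`, then restart Claude Desktop.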
Five things that make WinScript different
1. Five-layer selector fallback chain
Other tools fail when the UI tree is bad (Electron apps, UWP, legacy Win32).
WinScript tries 5 strategies before giving up.
Layer 1 → UIA by element name (fast, exact)
Layer 2 → UIA by automation_id (for apps that label controls)
Layer 3 → UIA fuzzy role match (partial name, control type)
Layer 4 → OCR scan + bounding box (when UI tree is broken)
Layer 5 → Raw coordinates (click("x=412,y=308"))
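The chain above can be sketched as an ordered list of strategies where the first hit wins. This is an illustration only, not WinScript's actual code: the `Match` type, the `resolve` function, and the stub strategies are hypothetical, and the real layers would call UI Automation or OCR instead of lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Match:
    layer: str
    confidence: float


# A strategy takes a target string and returns a confidence, or None if it found nothing.
Strategy = Callable[[str], Optional[float]]


def resolve(target: str, strategies: Sequence[Tuple[str, Strategy]]) -> Optional[Match]:
    """Try each layer in order and return the first one that finds the target."""
    for layer, strategy in strategies:
        confidence = strategy(target)
        if confidence is not None:
            return Match(layer, confidence)
    return None


# Stand-in strategies: the UI tree lookups fail, the OCR stub "finds" the text.
chain = [
    ("uia_name", lambda t: None),
    ("uia_automation_id", lambda t: None),
    ("uia_fuzzy", lambda t: None),
    ("ocr", lambda t: 0.91 if t == "Login" else None),
    ("coordinates", lambda t: 1.0 if t.startswith("x=") else None),
]

print(resolve("Login", chain))        # Match(layer='ocr', confidence=0.91)
print(resolve("x=412,y=308", chain))  # Match(layer='coordinates', confidence=1.0)
```

Returning the layer name alongside the result is what lets the caller report how an element was found.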
Every tool call tells you which layer succeeded:
"Clicked 'Login' in 'Slack' [via ocr, confidence 0.91]"
2. State diffing after every action
Before you act, WinScript snapshots the desktop.
After you act, it snapshots again and diffs.
# The agent knows what actually happened:
open_app("excel")
→ "Opened Excel | Active window: '' → 'Book1 - Excel' |
New windows: ['Microsoft Excel - Book1'] | Duration: 2140ms"
type_text("Notepad", "hello")
→ "Typed 5 chars | No window change detected | Duration: 89ms"
No more "did it work?" loops.
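A rough sketch of the snapshot-and-diff idea follows, assuming a snapshot is just the active window title plus the set of open window titles. The `Snapshot` type and `diff_snapshots` function are hypothetical names, and a real snapshot would be read from the desktop rather than built by hand.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Snapshot:
    active_window: str
    windows: FrozenSet[str]


def diff_snapshots(before: Snapshot, after: Snapshot) -> str:
    """Describe what changed between two desktop snapshots."""
    parts = []
    if before.active_window != after.active_window:
        parts.append(f"Active window: '{before.active_window}' -> '{after.active_window}'")
    new_windows = sorted(after.windows - before.windows)
    if new_windows:
        parts.append(f"New windows: {new_windows}")
    return " | ".join(parts) if parts else "No window change detected"


before = Snapshot("Form", frozenset({"Form"}))
after = Snapshot("Confirmation", frozenset({"Form", "Success Dialog"}))
print(diff_snapshots(before, after))
# Active window: 'Form' -> 'Confirmation' | New windows: ['Success Dialog']
```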
3. Workflow recorder and replay
Record any successful multi-step sequence. Replay it on demand.
No human-written macros. No brittle scripts.
# Record:
workflow_record_start("daily_report", "Opens report and emails it")
open_latest_file("C:/reports", "xlsx")
read_active_document()
send_email_with_content("[email protected]", "Daily Report", "clipboard")
workflow_record_stop()
→ "Workflow 'daily_report' saved: 3 steps"
# Replay any time:
workflow_replay("daily_report")
→ "Step 1 ✓ open_latest_file → Opened q1_2026.xlsx [2100ms]
Step 2 ✓ read_active_document → [clipboard content] [340ms]
Step 3 ✓ send_email_with_content → Email sent [890ms]"
# Preview before running:
workflow_replay("daily_report", dry_run=True)
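To make the record/replay flow concrete, here is a small sketch that stores each successful tool call as a JSON step and replays the list later. It assumes one JSON file per workflow; the `WorkflowRecorder` class and the stub tools are hypothetical and do not reflect WinScript's internal format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional


class WorkflowRecorder:
    def __init__(self, folder: Path, tools: Dict[str, Callable[..., Any]]):
        self.folder = folder
        self.tools = tools
        self.name: Optional[str] = None
        self.steps: List[dict] = []

    def start(self, name: str) -> None:
        self.name, self.steps = name, []

    def call(self, tool: str, **kwargs: Any) -> Any:
        result = self.tools[tool](**kwargs)
        # Only reached if the tool did not raise, so failed calls are not recorded.
        if self.name is not None:
            self.steps.append({"tool": tool, "args": kwargs})
        return result

    def stop(self) -> str:
        self.folder.mkdir(parents=True, exist_ok=True)
        (self.folder / f"{self.name}.json").write_text(json.dumps(self.steps, indent=2))
        message = f"Workflow '{self.name}' saved: {len(self.steps)} steps"
        self.name = None
        return message

    def replay(self, name: str, dry_run: bool = False) -> List[str]:
        steps = json.loads((self.folder / f"{name}.json").read_text())
        lines = []
        for number, step in enumerate(steps, start=1):
            if dry_run:
                # Preview only: describe the step without running the tool.
                lines.append(f"Step {number} (dry run) {step['tool']} {step['args']}")
            else:
                result = self.tools[step["tool"]](**step["args"])
                lines.append(f"Step {number} ok {step['tool']} -> {result}")
        return lines


# Stub tools stand in for the real MCP tools.
tools = {
    "open_latest_file": lambda folder, extension: f"Opened newest .{extension} in {folder}",
    "read_active_document": lambda: "[clipboard content]",
}
with tempfile.TemporaryDirectory() as tmp:
    recorder = WorkflowRecorder(Path(tmp), tools)
    recorder.start("daily_report")
    recorder.call("open_latest_file", folder="C:/reports", extension="xlsx")
    recorder.call("read_active_document")
    print(recorder.stop())  # Workflow 'daily_report' saved: 2 steps
    for line in recorder.replay("daily_report", dry_run=True):
        print(line)
```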
4. Semantic intent layer
Five high-level intents so agents don't have to think in clicks.
open_latest_file("C:/reports", "xlsx") # Find + open newest xlsx
send_email_with_content("[email protected]", "Re", "clipboard") # Clipboard → email
find_in_folder("C:/docs", "invoice", "pdf") # Find matching files
read_active_document() # Select-all copy current doc
summarize_screen() # Screenshot → agent vision
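As an illustration of what the "find" half of `open_latest_file` might involve, the sketch below picks the newest file by modification time. The helper name `find_latest_file` is hypothetical, and treating "newest" as "most recently modified" is an assumption; the real tool may use a different rule.

```python
from pathlib import Path
from typing import Optional


def find_latest_file(folder: str, extension: str) -> Optional[Path]:
    """Return the most recently modified file with the given extension, or None."""
    pattern = f"*.{extension.lstrip('.')}"  # accept both "xlsx" and ".xlsx"
    candidates = [p for p in Path(folder).glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Returning `None` for an empty folder lets the caller turn the miss into an error string instead of raising.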
5. Full audit log + local memory
Every action, input, output, state delta, selector layer, and failure logged to ~/.winscript/audit.db.
get_audit_log(10)
→ "[14:23:01] ✓ open_app({'name':'notepad'}) → Opened notepad [2100ms]
[14:23:03] ✓ type_text({'text':'hello'}) → Typed 5 chars [89ms]
[14:23:11] ✗ click({'element':'Submit'}) → ERROR: No element found [412ms]"
get_failure_report()
→ "click: 3/12 failures (25%) | avg 380ms
open_app: 0/8 failures (0%) | avg 2100ms"
And memory persists across sessions:
what_files_have_i_opened(5, "xlsx")
→ "C:/reports/q1_2026.xlsx — opened 4x | last: 14:23 08/04"
what_did_i_do(5)
→ "[14:23] open_app → Opened notepad
[14:22] excel_read_cell → 47230.5
[14:21] outlook_send_email → Email sent to [email protected]"
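The audit log and failure report can be approximated with a single SQLite table. This is a sketch under assumptions: the table name, column names, and the helpers `open_audit_db`, `log_action`, and `failure_report` are invented for illustration and are not WinScript's real schema.

```python
import sqlite3
import time


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit ("
        "ts REAL, tool TEXT, args TEXT, result TEXT, ok INTEGER, duration_ms INTEGER)"
    )
    return conn


def log_action(conn, tool, args, result, ok, duration_ms):
    """Append one tool call to the audit table."""
    conn.execute(
        "INSERT INTO audit VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), tool, repr(args), result, int(ok), duration_ms),
    )
    conn.commit()


def failure_report(conn):
    """Summarize failures and average duration per tool."""
    rows = conn.execute(
        "SELECT tool, SUM(1 - ok), COUNT(*), AVG(duration_ms) "
        "FROM audit GROUP BY tool ORDER BY tool"
    ).fetchall()
    return [
        f"{tool}: {failures}/{total} failures ({100 * failures // total}%) | avg {avg:.0f}ms"
        for tool, failures, total, avg in rows
    ]


conn = open_audit_db()
for ok in (True, True, True, False):
    log_action(conn, "click", {"element": "Submit"}, "result text", ok, 400)
print(failure_report(conn))  # ['click: 1/4 failures (25%) | avg 400ms']
```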
All 59 tools
App Control (4)
| Tool | What it does |
|---|---|
| open_app(name) | Open any app by name or alias |
| close_app(title_hint) | Close by partial window title |
| focus_app(title_hint) | Bring to foreground |
| get_running_apps() | List all open windows + PIDs |

| Tool | What it does |
|---|---|
| click(app_title, element_name) | Click element (5-layer fallback) |
| type_text(app_title, text) | Type into focused element |
| read_text(app_title, element_name) | Read text from element |
| press_key(key, app_title) | Keyboard shortcuts |
| get_ui_tree(app_title, depth) | Discover all UI elements |

| Tool | What it does |
|---|---|
| excel_read_cell(filepath, sheet, cell) | Read one cell |
| excel_write_cell(filepath, sheet, cell, value) | Write one cell + save |
| excel_read_range(filepath, sheet, start, end) | Read range as CSV |
| outlook_send_email(to, subject, body) | Send email |
| outlook_read_inbox(count) | Read N recent emails |

read_file_text · write_file_text · list_dir · move_file · copy_file · delete_file · file_exists

| Tool | What it does |
|---|---|
| take_screenshot(region) | Base64 PNG, so the agent sees your screen |
| get_active_window() | Current focused window title |
| get_clipboard() | Read clipboard |
| set_clipboard(text) | Write clipboard |
Typed semantic APIs for specific apps. No more clicking blind.
# Excel
excel_open(filepath) · excel_save() · excel_close(save)
# Chrome
chrome_open(url) · chrome_navigate(url) · chrome_get_url()
chrome_get_title() · chrome_new_tab() · chrome_close_tab()
chrome_find_on_page(text)
# Notepad
notepad_open(filepath) · notepad_type(text)
notepad_save() · notepad_close(save)
# Explorer
explorer_open(path) · explorer_navigate(path)
# Outlook
outlook_open()
Workflow Recorder + Replay (6)
workflow_record_start(name, description)
workflow_record_stop()
workflow_record_discard()
workflow_replay(name, dry_run)
workflow_list()
workflow_delete(name)
Semantic Intents (5)
open_latest_file(folder, extension)
send_email_with_content(to, subject, content_source)
find_in_folder(folder, search_term, extension)
read_active_document()
summarize_screen()
Audit + Memory + State (10)
# Audit
get_audit_log(limit, tool_filter)
get_failure_report()
# Memory
what_windows_have_i_seen(limit)
what_files_have_i_opened(limit, extension)
what_did_i_do(limit)
# State
get_state_snapshot()
# Modes
set_execution_mode(mode) # "safe" | "standard"
get_execution_mode()
App aliases
open_app("notepad") # notepad.exe
open_app("chrome") # chrome.exe
open_app("firefox") # firefox.exe
open_app("edge") # msedge.exe
open_app("excel") # EXCEL.EXE
open_app("word") # WINWORD.EXE
open_app("outlook") # OUTLOOK.EXE
open_app("explorer") # explorer.exe
open_app("terminal") # wt.exe
open_app("vscode") # Code.exe
open_app("cursor") # Cursor.exe
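Alias lookup of this kind is usually a plain dictionary. The sketch below copies a few of the aliases listed above; the `resolve_app` helper and its fallback of returning unknown names unchanged are assumptions, not documented WinScript behavior.

```python
# A subset of the aliases listed above.
APP_ALIASES = {
    "notepad": "notepad.exe",
    "chrome": "chrome.exe",
    "edge": "msedge.exe",
    "excel": "EXCEL.EXE",
    "terminal": "wt.exe",
    "vscode": "Code.exe",
}


def resolve_app(name: str) -> str:
    """Map an alias to its executable; unknown names pass through unchanged."""
    return APP_ALIASES.get(name.strip().lower(), name)


print(resolve_app("Excel"))     # EXCEL.EXE
print(resolve_app("calc.exe"))  # calc.exe
```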
Error handling
Tools return "ERROR: ..." strings to the agent on failure — never crash your agent.
After 5 consecutive identical failures on the same tool + args, WinScriptMaxRetriesError is raised. Hard stop. Change your args and try again.
get_failure_report()
# See which tools are failing and why before they hit the limit
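One way to picture the retry limit is a guard that counts consecutive failures per tool + args pair. This is a sketch only: the `RetryGuard` class is hypothetical, the exception class here is a local stand-in for the one named above, and raising on the call after the fifth failure is one possible reading of the rule.

```python
import json
from typing import Any, Callable, Optional, Tuple


class WinScriptMaxRetriesError(RuntimeError):
    """Local stand-in for the error named above; the real class may differ."""


class RetryGuard:
    def __init__(self, limit: int = 5):
        self.limit = limit
        self.last_key: Optional[Tuple[str, str]] = None
        self.failures = 0

    def call(self, tool: str, func: Callable[..., Any], **kwargs: Any) -> str:
        key = (tool, json.dumps(kwargs, sort_keys=True, default=str))
        if key != self.last_key:
            # A different tool or different args resets the counter.
            self.last_key, self.failures = key, 0
        if self.failures >= self.limit:
            raise WinScriptMaxRetriesError(
                f"{tool} failed {self.limit} times with the same args"
            )
        try:
            result = str(func(**kwargs))
        except Exception as exc:
            self.failures += 1
            return f"ERROR: {exc}"  # failures come back as strings, not exceptions
        self.failures = 0
        return result


def always_fails(element: str) -> str:
    raise LookupError("No element found")


guard = RetryGuard()
for _ in range(5):
    print(guard.call("click", always_fails, element="Submit"))  # ERROR: No element found
try:
    guard.call("click", always_fails, element="Submit")
except WinScriptMaxRetriesError as exc:
    print(exc)  # click failed 5 times with the same args
```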
Execution modes
set_execution_mode("safe")
# Read-only: screenshots, reads, audits only
# Blocks: write, delete, click, type, send email, open apps
set_execution_mode("standard")
# Full access (default)
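A mode gate like this can be sketched with a decorator on every tool that changes state. The decorator name `blocks_in_safe_mode`, the stub tools, and the wording of the blocked message are assumptions for illustration; only the mode names and the two mode functions come from the list above.

```python
import functools

_mode = "standard"  # full access is the default


def set_execution_mode(mode: str) -> None:
    global _mode
    if mode not in ("safe", "standard"):
        raise ValueError(f"unknown mode: {mode}")
    _mode = mode


def get_execution_mode() -> str:
    return _mode


def blocks_in_safe_mode(func):
    """Refuse to run a state-changing tool while safe mode is active."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if _mode == "safe":
            return f"ERROR: {func.__name__} is blocked in safe mode"
        return func(*args, **kwargs)
    return wrapper


@blocks_in_safe_mode
def delete_file(path: str) -> str:
    return f"Deleted {path}"  # stub: does not touch the disk


def get_active_window() -> str:
    return "Untitled - Notepad"  # stub: read-only tools are never gated


set_execution_mode("safe")
print(delete_file("C:/tmp/a.txt"))  # ERROR: delete_file is blocked in safe mode
print(get_active_window())          # Untitled - Notepad
set_execution_mode("standard")
print(delete_file("C:/tmp/a.txt"))  # Deleted C:/tmp/a.txt
```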
Where recordings live
~/.winscript/
├── audit.db # every action ever taken
├── memory.db # windows, files, action history
└── workflows/
├── daily_report.json
└── your_workflow.json
Auto-purge: audit logs older than 30 days are deleted on startup.
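The startup purge can be pictured as a single DELETE against a timestamp column. The sketch assumes an `audit` table with a Unix-time `ts` column; that schema and the `purge_old_entries` helper are hypothetical, not WinScript's actual layout.

```python
import sqlite3
import time

RETENTION_SECONDS = 30 * 24 * 60 * 60  # 30 days


def purge_old_entries(conn: sqlite3.Connection, now: float = None) -> int:
    """Delete audit rows older than the retention window and return how many went."""
    now = time.time() if now is None else now
    cursor = conn.execute("DELETE FROM audit WHERE ts < ?", (now - RETENTION_SECONDS,))
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit (ts REAL, tool TEXT)")
now = time.time()
conn.executemany(
    "INSERT INTO audit VALUES (?, ?)",
    [(now - 31 * 86400, "click"), (now - 86400, "open_app")],
)
print(purge_old_entries(conn, now))  # 1
```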
Limitations (honest)
- Windows only. By design. This is not a bug.
- Elevated (admin) apps cannot be automated from a non-admin process.
- UWP + Electron apps have broken accessibility trees. WinScript falls back to OCR then coordinates — but complex UIs still sometimes fail.
- Requires Tesseract for OCR fallback (Layer 4). Without it, WinScript skips to Layer 5.
- COM automation (Excel, Outlook) requires those apps installed and licensed.
Built on
| Layer | Library |
|---|---|
| MCP server | FastMCP |
| UI automation | pywinauto + uiautomation |
| COM automation | pywin32 |
| OCR fallback | pytesseract + Tesseract |
| Screenshots | mss + Pillow |
| State + memory | SQLite |
License
MIT
WinScript — 59 tools. State-aware. Replayable. Audited. Memory-backed.
Built by Roshan Ravani