OnionClaw

mcp
Guvenlik Denetimi
Uyari
Health Uyari
  • License — License: NOASSERTION
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Uyari
  • network request — Outbound network request in fetch.py
  • network request — Outbound network request in pipeline.py
  • network request — Outbound network request in sync_sicry.py
Permissions Gecti
  • Permissions — No dangerous permissions requested
Purpose
This MCP server and standalone tool provides AI agents with automated access to the Tor network and dark web .onion services. It is designed for OSINT, threat intelligence, and dark web data scraping.

Security Assessment
The overall risk is High. The tool explicitly makes outbound network requests (found in `fetch.py`, `pipeline.py`, and `sync_sicry.py`) to route traffic through Tor. While no hardcoded secrets or dangerous local shell permissions were detected, the tool's core function involves autonomous headless crawling and automated identity rotation. Because it handles highly sensitive dark web data, connects to untrusted remote networks, and grants an AI agent unrestricted anonymous web access, it introduces significant security and legal liability concerns.

Quality Assessment
The project was recently updated (last push was today) and includes a MIT license, though the automated scan flagged the license as "NOASSERTION." However, the repository suffers from extremely low community visibility, having only 5 GitHub stars. It relies on a highly niche stack, making it difficult to verify the safety or reliability of the underlying code. Community trust is currently minimal.

Verdict
Use with caution — the tool functions as intended but poses severe security, ethical, and legal risks due to its autonomous dark web access capabilities.
SUMMARY

Provide AI agents with full Tor network access and dark web data through a zero-config OpenClaw skill or standalone tool.

README.md

OnionClaw 🧅

License: MIT
CI
OpenClaw Skill

by JacobJandon

OnionClaw logo

OpenClaw skill + standalone tool — full Tor / dark web access for AI agents

OnionClaw gives AI agents full access to the Tor network and .onion hidden services. It runs as an OpenClaw skill (drop-in, zero config beyond a .env file) and also works standalone from any terminal.

Based on the SICRY engine — 18 dark web search engines, Robin OSINT pipeline, four LLM analysis modes.

# As an OpenClaw skill:
cp -r OnionClaw ~/.openclaw/skills/onionclaw
# → agent now has 7 dark web commands available in every session

# Standalone:
python3 check_tor.py        # verify Tor
python3 search.py --query "ransomware healthcare"
python3 pipeline.py --query "acme.com data leak" --mode corporate

⚠️ The Rabbit Hole

Autonomous agents paired with the Tor network will be one of the most dangerous automation stacks on the internet within the next five years. OnionClaw is living proof that the rabbit hole goes deeper than most people think.

This tool is built for legitimate OSINT, threat intelligence, and security research. But the same primitives — anonymous routing, bulk scraping, AI-driven synthesis, zero-attribution browsing, automated identity rotation — are precisely what make this combination genuinely dangerous in the wrong hands.

This is not a warning tucked in fine print. It is the whole point of writing it down openly.

What the stack enables — the full map

Use case What it looks like
Dark-web crawling Automated, headless spidering of .onion services at scale — forums, paste sites, markets, leak boards — with full identity rotation between every request. No human ever touches a keyboard.
Threat intelligence Continuous monitoring of ransomware group blogs, initial access broker ads, CVE exploit drops, and actor chatter long before it surfaces on clearnet feeds.
Marketplace monitoring Price tracking, stock alerts, vendor reputation scraping, and availability checks across darknet markets — the same logic a researcher uses to track fentanyl price trends is the same logic a supplier uses to undercut competitors.
Credential surveillance Watching paste boards, breach dumps, and forum leaks for specific email domains, API keys, SSH keys, or internal hostnames the moment they appear — at a scale no human analyst can match.
Deanonymisation research Cross-correlating .onion service metadata with clearnet traces, timing attacks, correlation of writing style and PGP keys — used both by law enforcement hunting criminals and by threat actors hunting journalists and dissidents.
Criminal automation Autonomous agents placing orders, posting ads, messaging vendors, managing mule accounts, draining wallets — an entire criminal operation running without a human ever in the loop.
Disinformation infrastructure Coordinated persona networks on hidden boards, fabricated document drops timed to bleed into legitimate OSINT pipelines, synthetic intelligence that reads real but originates from nowhere.
Zero-day brokerage Automated monitoring of exploit vendor channels, private CVE auction boards, and vulnerability markets — buy-side and sell-side intelligence gathered faster than any human analyst.

The ugly side

The 2026 internet is already at the edge of this. Within five years, AI agents that can:

  1. Browse anonymously through rotating Tor circuits with no persistent identity
  2. Understand context well enough to navigate dark web UIs, CAPTCHA logic, and forum culture without hardcoded selectors
  3. Act autonomously — search, buy, post, exfiltrate, rotate — in closed loops with no human confirmation step
  4. Self-orchestrate across dozens of simultaneous Tor identities on parallel threads

…represent a qualitative shift from human criminals using tools to autonomous criminal infrastructure operating at machine speed with no human in the loop. The bottleneck has always been human attention. Remove it and the scaling properties of dark web operations change completely.

OnionClaw demonstrates all four of those primitives working together today. The full pipeline.py step — query refinement → multi-engine search → result filtering → batch scrape → LLM synthesis → identity rotation — is a complete autonomous dark web intelligence loop. Remove the OSINT framing and it is equally a complete autonomous dark web operation loop. The code is the same either way.

Why this is written explicitly

Security tools that pretend the dual-use problem does not exist are more dangerous than ones that name it directly. If you are building on top of OnionClaw:

  • Know what you are building. The pipeline does not know if the query is "acme.com credential leak" for a pentest or "rival vendor SSH keys" for espionage.
  • Know your jurisdiction. Automated access to dark web content and .onion services may be illegal in your country regardless of intent or findings.
  • Tor is not legal protection. It is operational security. The two are different things with very different limits.
  • AI + Tor + autonomy is not a theoretical threat. It is a present capability. This repo is one of many signals that the tooling is ready.

OnionClaw is published for defensive research, red-team engagements, and threat intelligence work. The code does not know the difference between those uses and their inverse. You do. Build accordingly.


Contents

  1. ⚠️ The Rabbit Hole
  2. What OnionClaw does
  3. Requirements
  4. Install as OpenClaw skill
  5. Standalone install
  6. Configuration
  7. All seven commands
  8. Investigation flows
  9. Analysis modes
  10. Architecture
  11. Troubleshooting
  12. Credits

What OnionClaw does

Seven commands expose the complete Tor OSINT toolkit:

Command What it does
check_tor.py Verify Tor is active, show current exit IP
renew.py Rotate Tor circuit — new exit node, new identity
check_engines.py Ping all 18 dark web search engines, show latency
search.py Search up to 18 engines simultaneously, deduplicated results
fetch.py Fetch any .onion or clearnet URL through Tor
ask.py LLM OSINT analysis of scraped content (4 modes)
pipeline.py Full Robin pipeline: refine → search → filter → scrape → analyse

Requirements

  • Python 3.10+
  • Tor running locally (SOCKS proxy on 127.0.0.1:9050)
  • pip packages: requests[socks] beautifulsoup4 python-dotenv stem
  • LLM key (optional — only needed for ask.py and pipeline.py analysis step)

Install Tor

Linux (Debian/Ubuntu):

apt install tor && tor &

macOS:

brew install tor && tor &

With control port (needed for renew.py):

cat > /tmp/onionclaw_tor.conf << 'EOF'
SocksPort 9050
ControlPort 9051
CookieAuthentication 1
DataDirectory /tmp/tor_data
EOF
tor -f /tmp/onionclaw_tor.conf &

Then set TOR_DATA_DIR=/tmp/tor_data in .env.

Install Python packages

pip install requests[socks] beautifulsoup4 python-dotenv stem

Install as OpenClaw skill

  1. Clone or copy this repo into your OpenClaw skills directory:
# Option A — clone directly
git clone https://github.com/JacobJandon/OnionClaw ~/.openclaw/skills/onionclaw

# Option B — copy local folder
cp -r OnionClaw ~/.openclaw/skills/onionclaw
  1. Configure .env in the skill folder:
cp ~/.openclaw/skills/onionclaw/.env.example ~/.openclaw/skills/onionclaw/.env
nano ~/.openclaw/skills/onionclaw/.env   # add LLM key if desired
  1. Start a new OpenClaw session — the skill loads automatically on startup. OpenClaw includes onionclaw in the agent context whenever the user asks about dark web topics.

Verify OpenClaw can see the skill:

openclaw skills list
# → onionclaw  🧅  Search the Tor dark web...

OpenClaw trigger phrases:

  • "search the dark web for …"
  • "investigate this .onion site …"
  • "check if my data appeared on the dark web"
  • "find ransomware leaks related to …"
  • "fetch this .onion URL …"
  • "run a Tor OSINT investigation on …"

After install, start a new session — existing sessions will not pick up the new skill.


Standalone install

No OpenClaw required. Every script runs directly from a terminal:

git clone https://github.com/JacobJandon/OnionClaw
cd OnionClaw
pip install requests[socks] beautifulsoup4 python-dotenv stem
cp .env.example .env
# Edit .env — add LLM key if desired (optional for most commands)

Configuration

Copy .env.example to .env and fill in what you need:

# ── Tor ────────────────────────────────────────────────────────────────
TOR_SOCKS_HOST=127.0.0.1
TOR_SOCKS_PORT=9050
TOR_CONTROL_HOST=127.0.0.1
TOR_CONTROL_PORT=9051
# TOR_CONTROL_PASSWORD=your_password   # only if HashedControlPassword in torrc
# TOR_DATA_DIR=/tmp/tor_data           # DataDirectory path for cookie auth

# ── LLM (needed only for ask.py and pipeline.py analysis step) ──────────
LLM_PROVIDER=openai          # openai | anthropic | gemini | ollama | llamacpp
OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=AIza...
# OLLAMA_MODEL=llama3.2       # local model — no API key needed

No LLM key? Set LLM_PROVIDER=ollama with a running Ollama instance for fully local inference. search.py, fetch.py, check_tor.py, renew.py, and check_engines.py work with no LLM key at all.


All seven commands

check_tor.py — verify Tor

python3 check_tor.py
✓ Tor active
  Exit IP  : 185.220.101.20
  Error    : None
{"tor_active": true, "exit_ip": "185.220.101.20", "error": null}

Run this before anything else. Exits with code 1 if Tor is not running or not accessible on port 9050.


renew.py — rotate identity

python3 renew.py
Rotating Tor circuit...
✓ Identity renewed — new Tor circuit established
{"success": true, "error": null}

Auth order: password env var → cookie from TOR_DATA_DIR → common system paths → null auth. Works out of the box with the recommended torrc above or with a standard system Tor install.


check_engines.py — engine health check

# Live ping (15–30 s)
python3 check_engines.py

# --cached N: reuse last result if less than N minutes old (skips the slow ping)
python3 check_engines.py --cached 10

# JSON output only
python3 check_engines.py --json

# Show version
python3 check_engines.py --version
ALIVE  9/12
──────────────────────────────────────────
  ✓  Ahmia-clearnet       670ms  ███
  ✓  Tor66                749ms  ███
  ✓  Ahmia               1139ms  █████
  ✓  OSS                 1203ms  ██████
  ...
DOWN   3/12
  ✗  Kaizer     timeout
  ✗  Anima      timeout
  ✗  FindTor    timeout

Run before a large search session. Use the alive engine names as arguments to --engines in search.py.


search.py — search dark web

# All 18 engines (default)
python3 search.py --query "ransomware healthcare leak"

# Specific engines (faster — use alive engines from check_engines output)
python3 search.py \
  --query "credential dump" \
  --engines Ahmia Tor66 Ahmia-clearnet OSS

# Limit result count
python3 search.py --query "bitcoin mixer" --max 30

Returns a deduplicated {title, url, engine} list across all engines.

Tip: Use short keyword queries (3–5 words). Dark web search indexes respond much better to keywords than natural-language sentences.


fetch.py — fetch any .onion page

# Hidden service
python3 fetch.py \
  --url "http://juhanurmihxlp77nkq76byazcldy2hlmovfu2epvl5ankdibsot4csyd.onion"

# With links extracted
python3 fetch.py --url "http://example.onion/page" --links

# JSON output only
python3 fetch.py --url "http://example.onion" --json

Returns: title, text content (up to 8000 chars), extracted links, HTTP status code. A status of 0 means the hidden service is unreachable or offline.


ask.py — LLM OSINT analysis

# Inline content
python3 ask.py \
  --query "LockBit ransomware" \
  --mode ransomware \
  --content "page text from scraped pages"

# From file
python3 ask.py \
  --query "acme.com" \
  --mode corporate \
  --file /tmp/scraped_pages.txt

# Pipe from fetch.py
python3 fetch.py --url "http://some.onion" --json | \
  python3 -c "import sys,json; print(json.load(sys.stdin)['text'])" | \
  python3 ask.py --query "investigate" --mode threat_intel

# Custom analysis focus
python3 ask.py \
  --query "mixer services" \
  --mode threat_intel \
  --content "..." \
  --custom "Focus on cryptocurrency wallet addresses only"

Modes: threat_intel · ransomware · personal_identity · corporate — see Analysis modes.


pipeline.py — full investigation (one command)

Runs the complete Robin OSINT pipeline automatically:

Step What happens
1 Verify Tor is active
2 Check which engines are alive
3 Refine query to ≤5 keywords (LLM)
4 Search all alive engines
5 Filter top 20 most relevant results (LLM)
6 Batch-scrape pages concurrently
7 OSINT analysis and report (LLM)
# Basic
python3 pipeline.py --query "hospital ransomware 2026"

# With analysis mode and output file
python3 pipeline.py \
  --query "acme.com credentials leak" \
  --mode corporate \
  --out report.md

# Full options
python3 pipeline.py \
  --query "QUERY" \
  --mode ransomware \
  --max 50 \
  --scrape 12 \
  --custom "Focus on ransom amounts and victim industries" \
  --out report.md

Steps 1–6 work fully without an LLM key. Only steps 3, 5, and 7 use the LLM — they fall back gracefully when no key is set, printing what was collected so far.

# --clear-cache: discard cached fetch results before this run
python3 pipeline.py --query "fresh data" --clear-cache

# --version: print OnionClaw version
python3 pipeline.py --version

sync_sicry.py — update bundled SICRY engine

OnionClaw bundles a copy of sicry.py from the upstream SICRY™ repo.
sync_sicry.py lets you pull the latest version (or a specific release tag) without a full git pull.

# Fetch latest from main branch
python3 sync_sicry.py

# Fetch a specific release tag
python3 sync_sicry.py --tag v1.2.0

# Preview what would happen without writing
python3 sync_sicry.py --dry-run

# Show version
python3 sync_sicry.py --version

After syncing, commit the updated file:

git add sicry.py && git commit -m "chore: sync sicry.py to SICRY v1.2.0"

Update noticespipeline.py (and sicry.check_update()) checks the
GitHub Releases API (/releases/latest). Only published formal releases
trigger a notice — plain git tags and pre-releases are silently ignored.


Investigation flows

Scenario 1: Check for leaked credentials

python3 check_tor.py
python3 pipeline.py \
  --query "acme.com email passwords" \
  --mode corporate \
  --scrape 10 \
  --out acme_leak_report.md

Scenario 2: Ransomware intelligence

python3 check_tor.py
python3 search.py \
  --query "LockBit healthcare 2026" \
  --engines Ahmia Tor66 Ahmia-clearnet \
  --max 40

# Fetch the most relevant URL
python3 fetch.py --url "http://..." > /tmp/page.json

# Analyse
python3 ask.py \
  --query "LockBit healthcare 2026" \
  --mode ransomware \
  --file /tmp/page.json

Scenario 3: Personal data exposure check

python3 pipeline.py \
  --query "[email protected] personal data" \
  --mode personal_identity \
  --scrape 8

Scenario 4: Manual step-by-step investigation

# 1. Verify Tor
python3 check_tor.py

# 2. Find alive engines
python3 check_engines.py

# 3. Search with alive engines
python3 search.py \
  --query "ransomware hospital 2026" \
  --engines Ahmia Tor66 OSS \
  --max 40

# 4. Fetch top pages
python3 fetch.py --url "http://..." --links
python3 fetch.py --url "http://..."

# 5. Analyse
python3 ask.py \
  --query "hospital ransomware" \
  --mode ransomware \
  --content "combined text from pages"

# 6. Rotate identity when done
python3 renew.py

Analysis modes

threat_intel (default)

General dark web OSINT. Extracts: IoCs, infrastructure details, actor mentions, next investigation steps.

Output sections: Input Query · Source Links · Investigation Artifacts · Key Insights · Recommended Next Steps

ransomware

Malware/RaaS intelligence. Extracts C2 domains, file hashes, MITRE ATT&CK TTPs, victim sectors, ransom amounts.

Output sections: Input Query · Source Links · Malware/Ransomware Indicators · Threat Actor Profile · Key Insights · Recommended Next Steps

personal_identity

PII and breach exposure. Surfaces SSNs, emails, passwords, passport data, breach sources, risk severity ratings.

Output sections: Input Query · Source Links · Exposed PII Artifacts · Breach/Marketplace Sources · Exposure Risk Assessment · Key Insights · Recommended Next Steps

corporate

Corporate threat intelligence. Detects leaked credentials, source code, internal documents, initial access broker activity.

Output sections: Input Query · Source Links · Leaked Corporate Artifacts · Threat Actor/Broker Activity · Business Impact Assessment · Key Insights · Recommended Next Steps


Troubleshooting

✗ Tor NOT active

# Is Tor running?
pgrep tor || tor &
# Is port 9050 listening?
ss -tlnp | grep 9050

renew.pysuccess: false

# Is control port open?
ss -tlnp | grep 9051

# Enable it in torrc:
echo "ControlPort 9051" >> /etc/tor/torrc
echo "CookieAuthentication 1" >> /etc/tor/torrc
systemctl restart tor

# Set correct DataDirectory in .env:
TOR_DATA_DIR=/var/lib/tor      # system Tor (Debian/Ubuntu)
TOR_DATA_DIR=/tmp/tor_data     # custom torrc DataDirectory

fetch.pystatus: 0
The hidden service is offline. .onion sites go down frequently — try a different URL from search.py results. Confirm Tor is healthy with check_tor.py.

search.py returns 0 results
Dark web indexes fluctuate. Run check_engines.py to find alive engines, then use --engines Ahmia Ahmia-clearnet as reliable fallbacks.

ask.py / pipeline.py LLM error
Set LLM_PROVIDER and an API key in .env. For no-key local operation: LLM_PROVIDER=ollama with a running Ollama instance. All five non-LLM scripts (check_tor, renew, check_engines, search, fetch) work without any key.

ERROR: sicry.py not found
sicry.py must be in the OnionClaw root (same folder as SKILL.md). It is included in this repo — do not delete or move it. If you cloned and it is missing, re-clone fresh.


Architecture

How OnionClaw’s pieces fit together — from a command-line invocation all the way to a .onion response.

Layer diagram

┌──────────────────────────────────────────────────────────────┐
│             User / OpenClaw Agent / CI script                   │
│  python3 pipeline.py --query …  |  pipeline.py --daemon-poll  │
└───────────────────────────┬───────────────────────────────┘
                           │
  Python wrappers          ▼
┌──────────────────────────────────────────────────────────────┐
│            OnionClaw command layer                              │
│                                                                  │
│  check_tor.py  renew.py  fetch.py  search.py  ask.py            │
│  check_engines.py  pipeline.py  sync_sicry.py                   │
│                                                                  │
│  All call into ↓ one file                                        │
└────────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│              SICRY™  (sicry.py — bundled)                        │
│                                                                  │
│  check_tor()           search(query, engines, …)               │
│  renew_identity()      scrape_all(urls)                         │
│  fetch(url)            ask(content, mode, …)                   │
│  check_search_engines()  refine_query()  filter_results()       │
│                                                                  │
│  State: SQLite DB (watch jobs · engine stats · result cache)    │
└─────┬──────────────────────────┬──────────────────────────────┘
        │  SOCKS5              │  stem control
        │  127.0.0.1:9050      │  127.0.0.1:9051
        ▼                      ▼
┌───────────────────┐  ┌───────────────────┐
│  Tor  (tor / TorPool) │  │  Tor Control Port    │
│  SOCKS5 proxy         │◄─│  renew_identity()    │
└─────┬─────────────┘  └───────────────────┘
        │  onion routing (3 hops)
        ▼
┌──────────────────────────────────────────────────────────────┐
│                  Dark Web / Tor Network                         │
│                                                                  │
│   18 search engines (.onion)    Clearnet via Tor exit nodes      │
│   Ahmia · DuckDuckGo-Tor        Any HTTPS/HTTP URL               │
│   Tor66 · + 15 more…                                             │
└──────────────────────────────────────────────────────────────┘

pipeline.py flow

  --query / watch job
     │
     ▼
[1] check_tor()              → abort if not active
[2] check_search_engines()   → collect live engine list
[3] refine_query()           → LLM: natural language → ≤5 keywords
[4] search(engines=live)     → parallel queries over Tor
[5] filter_results()         → LLM: keep top 20 relevant
[6] scrape_all(best[:N])     → concurrent batch-fetch over Tor
[7] ask(content, mode=…)    → LLM OSINT report
     │
     ▼
  report  [→ --out file.md]  [→  --watch-check --output-dir]

TorPool mode (optional)

Set SICRY_POOL_SIZE=N (recommended 2–4) to run N independent Tor processes
for higher search throughput:

pipeline.py  ─→  TorPool
                  ├── tor[0]  :9050
                  ├── tor[1]  :9052
                  └── tor[N-1]:...

Update policy

pipeline.py --check-update calls sicry.check_update(), which queries the
GitHub Releases API (/releases/latest). Only published formal releases
trigger a notice — plain git tags and pre-releases are silently ignored.


File structure

OnionClaw/
├── SKILL.md              ← OpenClaw skill descriptor (YAML frontmatter)
├── sicry.py              ← SICRY engine (bundled — no separate install needed)
├── .env.example          ← Copy to .env and configure
├── README.md             ← This file
├── check_tor.py          ← Verify Tor / show exit IP
├── renew.py              ← Rotate Tor circuit
├── check_engines.py      ← Ping all 18 engines
├── search.py             ← Search dark web
├── fetch.py              ← Fetch .onion pages
├── ask.py                ← OSINT analysis via LLM
├── pipeline.py           ← Full Robin investigation pipeline
└── sync_sicry.py         ← Sync SICRY engine from upstream

Credits


License

MIT License — Copyright (c) 2026 JacobJandon

"OnionClaw" and the OnionClaw logo are owned by JacobJandon.

See LICENSE for full text.


⚠️ Use responsibly and lawfully. Built for OSINT, security research, and threat intelligence. Read The Rabbit Hole section before deploying autonomously.

Yorumlar (0)

Sonuc bulunamadi