pwnkit
Health Pass
- License — License: NOASSERTION
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 28 GitHub stars
Code Fail
- rm -rf — Recursive force deletion command in .github/workflows/binary-build.yml
Permissions Pass
- Permissions — No dangerous permissions requested
This is an autonomous penetration testing framework that uses AI agents to scan web applications, codebases, and packages for security vulnerabilities. It is designed to proactively attack systems so developers can find and fix flaws before malicious actors do.
Security Assessment
Overall Risk: Medium. The tool inherently executes shell commands and makes outbound network requests to perform its automated attacks. It also relies on external LLM APIs (such as OpenRouter or OpenAI), so prompt data and target details are sent to third-party AI providers. No hardcoded secrets were detected, and no dangerous system permissions are requested during installation. However, because the tool is explicitly designed to attack targets aggressively, point it only at environments you own or are authorized to test. A misconfigured target could lead to unauthorized attacks on third-party systems or to damage in your own infrastructure.
Quality Assessment
The project appears to be actively maintained, with very recent repository activity. However, it currently has very low community visibility, with fewer than 30 GitHub stars. While the documentation mentions an Apache 2.0 license, the automated license check returns a "NOASSERTION" warning, so the license may need manual verification. Given the tool's niche, highly specialized nature, the small community footprint means fewer peer reviews for potential bugs or safety oversights.
Verdict
Use with caution: ensure strict isolation and clear authorization for any target environments, keeping in mind that early-stage autonomous offensive tools carry inherent operational risks.
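One way to act on the authorization advice above is to gate every scan behind an explicit allow-list. The sketch below is not part of pwnkit: the helper names `url_host` and `target_allowed`, and the one-hostname-per-line file format, are assumptions made for illustration.

```shell
# Hypothetical guard (not a pwnkit feature): only scan hosts listed in an
# allow-list file that contains one authorized hostname per line.

# Extract the host from a URL such as https://user@example.com:8443/path.
# Note: this simple version does not handle bracketed IPv6 literals.
url_host() {
  _rest=${1#*://}      # drop the scheme, if present
  _rest=${_rest%%/*}   # drop the path
  _rest=${_rest##*@}   # drop userinfo, if present
  printf '%s\n' "${_rest%%:*}"   # drop the port
}

# Succeed only if the URL's host appears as a whole line in the allow-list.
target_allowed() {
  _host=$(url_host "$1")
  [ -n "$_host" ] && grep -Fxq -- "$_host" "$2"
}

# Usage sketch (file name is a placeholder):
#   target_allowed "https://staging.example.com" ./authorized-hosts.txt \
#     && pwnkit scan --target "https://staging.example.com"
```

Matching whole lines with `grep -Fx` means `example.com` in the URL does not match an allow-list entry for `staging.example.com`, and vice versa.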
The research-backed autonomous pentesting engine for all software.
pwnkit
Let autonomous AI agents hack you before attackers do.
Fully autonomous agentic pentesting for web apps, AI/LLM apps, package ecosystems, and source code.
This README is the fast path. The detailed command reference, configuration, architecture notes, recipes, and benchmark breakdowns live in the docs site.
Quick Start
Standalone binary (zero deps)
curl -fsSL https://raw.githubusercontent.com/PwnKit-Labs/pwnkit/main/install.sh | bash
Downloads a self-contained pwnkit binary (~74 MB) for your platform from the latest GitHub Release — no Node, no Bun, no npm, no node_modules. Installs to ~/.pwnkit/bin/pwnkit. Set PWNKIT_INSTALL_DIR=/usr/local/bin to change the location, PWNKIT_VERSION=vX.Y.Z to pin a version.
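After installing, the binary's directory still has to be on your PATH. The following is a minimal sketch, assuming PWNKIT_INSTALL_DIR names the directory that holds the binary (as the default ~/.pwnkit/bin/pwnkit suggests); the helper names `pwnkit_install_dir` and `path_prepend` are illustrative and do not come from install.sh.

```shell
# Directory the installer is described as using; PWNKIT_INSTALL_DIR
# overrides the documented default of ~/.pwnkit/bin.
pwnkit_install_dir() {
  printf '%s\n' "${PWNKIT_INSTALL_DIR:-$HOME/.pwnkit/bin}"
}

# Prepend a directory to PATH unless it is already listed.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already on PATH, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

path_prepend "$(pwnkit_install_dir)"
export PATH
```

Placing these lines in a shell profile keeps the lookup consistent whether or not the install location was overridden.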
Binaries ship for linux-x64, linux-arm64, darwin-arm64, and windows-x64. The interactive Bun-based TUI is baked into the binary — no extra install step. Intel Mac users: install Bun and compile from source.
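To check up front whether a prebuilt binary exists for a machine, the platform list above can be turned into a small lookup. This is a sketch only: the function name `pwnkit_platform` is hypothetical, the mapping from `uname` output to target names is an assumption, and install.sh may detect platforms differently.

```shell
# Map an OS / architecture pair to one of the four documented targets.
# Arguments default to `uname -s` and `uname -m`; they are accepted as
# parameters so the mapping can be exercised without changing machines.
pwnkit_platform() {
  _os=${1:-$(uname -s)}
  _arch=${2:-$(uname -m)}
  case "$_os" in
    Linux) _os=linux ;;
    Darwin) _os=darwin ;;
    MINGW*|MSYS*|CYGWIN*) _os=windows ;;   # assumed Windows shell environments
    *) return 1 ;;
  esac
  case "$_arch" in
    x86_64|amd64) _arch=x64 ;;
    aarch64|arm64) _arch=arm64 ;;
    *) return 1 ;;
  esac
  # Only the combinations listed in the README count as supported.
  case "$_os-$_arch" in
    linux-x64|linux-arm64|darwin-arm64|windows-x64)
      printf '%s\n' "$_os-$_arch" ;;
    *) return 1 ;;
  esac
}

# Usage sketch:
#   pwnkit_platform || echo "no prebuilt binary; compile from source" >&2
```

An Intel Mac (Darwin, x86_64) falls through to the failure branch, matching the note that those users compile from source.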
Docker
docker run --rm -e OPENROUTER_API_KEY="$KEY" \
  ghcr.io/pwnkit-labs/pwnkit:latest scan --target https://example.com
If you use Azure OpenAI instead, also pass AZURE_OPENAI_BASE_URL and AZURE_OPENAI_MODEL. For the Responses API, the Azure base URL should include /openai/v1.
The image ships with Node 20, Playwright/Chromium, and the standard pentest toolbox (sqlmap, nmap, nikto, gobuster, ffuf, hydra, john, …) preinstalled.
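For the Azure OpenAI case, a quick pre-flight check can catch a base URL that is missing the /openai/v1 segment before the container starts. This is a minimal sketch; `azure_base_url_ok` is a hypothetical helper, and the hostname in the usage comment is a placeholder.

```shell
# Succeed if the URL contains the /openai/v1 path segment that the
# Responses API is said to require.
azure_base_url_ok() {
  case "$1" in
    */openai/v1|*/openai/v1/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage sketch:
#   AZURE_OPENAI_BASE_URL="https://my-resource.openai.azure.com/openai/v1"
#   azure_base_url_ok "$AZURE_OPENAI_BASE_URL" \
#     || echo "AZURE_OPENAI_BASE_URL should include /openai/v1" >&2
```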
Once installed
# Scan an AI / LLM endpoint
pwnkit scan --target https://example.com/api/chat
# Pentest a web app
pwnkit scan --target https://example.com --mode web
# White-box scan with source code access
pwnkit scan --target https://example.com --repo ./source
# Audit a package
pwnkit audit lodash
# Review source code
pwnkit review ./my-app
# Import and verify kernel crash reports
pwnkit ingest ./kernel-crashes --verify --output json
# Hunt for kernel advisory variants with foxguard rules
pwnkit kernel variant-hunt --tree ./linux --rules ./foxguard/rules/kernel/dirty-frag-class
# Auto-detect — just give it a target
pwnkit https://example.com
The binary is named pwnkit when installed via install.sh and pwnkit-cli when installed via npm. Substitute whichever your install route used; everything else is identical. From v0.10.0 the npm package is a smart launcher that downloads the platform-specific binary on first run and caches it under ~/.pwnkit/cache/<version>/, so npx pwnkit-cli and npm i -g pwnkit-cli work without a separate install step.
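Scripts that need to run under either install route can resolve the command name at runtime. The sketch below is one possible approach; `pwnkit_cmd` is a hypothetical helper name, and preferring `pwnkit` over `pwnkit-cli` when both exist is an arbitrary choice.

```shell
# Print whichever command name is available on PATH, preferring the
# standalone binary name over the npm one. Fails if neither is found.
pwnkit_cmd() {
  for _name in pwnkit pwnkit-cli; do
    if command -v "$_name" >/dev/null 2>&1; then
      printf '%s\n' "$_name"
      return 0
    fi
  done
  return 1
}

# Usage sketch:
#   "$(pwnkit_cmd)" scan --target https://example.com
```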
What It Does
- scan targets AI / LLM apps, web apps, REST / OpenAPI APIs, and MCP servers.
- audit installs and inspects packages across npm, pypi, cargo, and oci with ecosystem-specific prep, static analysis, and AI review.
- review performs deep source-code security review on a local repo or Git URL.
- kernel variant-hunt runs foxguard-backed kernel advisory variant hunting and maps SARIF hits into normal pwnkit findings.
- ingest parses kernel crash reports and can validate them against reproducers, including a real QEMU kernel VM path that compiles and runs reproducers inside the guest when configured.
- h1 ships read-only HackerOne hacker-API helpers: verify creds, list programs, and export a program's scope into the pwnkit scope-file format consumed by pwnkit scan --scope <path>.
- dashboard, history, findings, and triage provide local persistence and review workflows.
- Internal tooling: pnpm --filter @pwnkit/benchmark triage-data turns benchmark runs and verified findings into labeled JSONL for triage-model training; the cloud-sink relay streams findings and final reports to an orchestrator when PWNKIT_CLOUD_SINK + PWNKIT_CLOUD_SCAN_ID are set.
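The cloud-sink condition described above (both variables set) can be mirrored in a wrapper script to confirm the environment before a run. This is only a sketch of that rule, not pwnkit's own check; `cloud_sink_enabled` is a hypothetical name, and the expected format of either variable's value is not documented here.

```shell
# Succeed only when both cloud-sink variables are set and non-empty.
cloud_sink_enabled() {
  [ -n "${PWNKIT_CLOUD_SINK:-}" ] && [ -n "${PWNKIT_CLOUD_SCAN_ID:-}" ]
}

# Usage sketch:
#   cloud_sink_enabled && echo "findings will be relayed to the orchestrator"
```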
Why It’s Different
- Shell-first web pentesting. The agent uses bash, writes scripts, and chains tools like a human pentester instead of being trapped in a small HTTP-tool DSL.
- Blind verification. Findings are independently re-exploited before they are reported.
- Docs-backed benchmark transparency. The current benchmark details live in the docs and raw artifacts under packages/benchmark/results.
Docs
- Getting Started
- Adversarial evals
- Commands
- Configuration
- Recipes
- Architecture
- Triage Pipeline
- Benchmark
Snapshot
XBOW 99.0% aggregate (103/104) · 97.9% gpt-5.4 cohort (93/95) · $5.20/flag. Cybench 90.0% (36/40). AI/LLM regression 10/10.
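The percentages in the snapshot follow directly from the solved/total counts given in parentheses. As a quick arithmetic check (the `pct` helper is just for illustration):

```shell
# Print solved/total as a percentage rounded to one decimal place.
pct() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

pct 103 104   # XBOW aggregate      -> 99.0
pct 93 95     # gpt-5.4 cohort      -> 97.9
pct 36 40     # Cybench             -> 90.0
```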
The benchmark page is the canonical surface — it separates the stable model-specific cohort from the rotation-volatile retained aggregate, lists the historical mixed publication line, and notes the remaining challenge-set mismatches.
GitHub Action
- uses: PwnKit-Labs/pwnkit@main
  with:
    mode: review
    path: .
    format: sarif
  env:
    OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
Development
git clone https://github.com/PwnKit-Labs/pwnkit.git
cd pwnkit
pnpm install
pnpm lint
pnpm test
See CONTRIBUTING.md.
Part of PwnKit Labs
Open-source adversarial security for the agentic AI era. pwnkit is one piece of the open-source PwnKit Labs stack:
- pwnkit — AI agent pentester (detect)
- foxguard — Rust security scanner (prevent)
- opensoar — Python-native SOAR platform (respond)
License