vuln-scout

skill
Guvenlik Denetimi
Gecti
Health Gecti
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 14 GitHub stars
Code Gecti
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Gecti
  • Permissions — No dangerous permissions requested
Purpose
This is an AI-powered plugin designed to integrate with Claude Code, acting as an autonomous white-box penetration tester. It analyzes local codebases to identify vulnerabilities, map data flows, and perform STRIDE threat modeling across multiple programming languages.

Security Assessment
The tool inherently requires deep read access to your local files to perform its security audits, meaning it will process sensitive data, proprietary code, and hardcoded secrets within your repositories. The automated scan pipeline executes local shell commands to run external CLI tools like Semgrep and Joern. The light code audit found no dangerous malicious patterns, backdoors, or hardcoded secrets within the tool's own 12 scanned files. No dangerous system permissions are requested. Overall risk rating: Medium. While the tool itself is safe, any security tool that reads proprietary codebase paths and executes system binaries always carries an inherent risk that requires user vetting.

Quality Assessment
The project is in active development, with its most recent push occurring today. It is properly licensed under the permissive and standard MIT license. Community trust is currently low but growing; it has 14 GitHub stars, which is typical for a new but promising open-source developer tool.

Verdict
Safe to use, provided you understand it will deeply scan your proprietary codebase and execute local security tools.
SUMMARY

AI-powered whitebox penetration testing plugin for Claude Code. 9 languages, 22 skills, 7 autonomous agents. STRIDE threat modeling, OWASP 2025 coverage, polyglot monorepo support.

README.md

VulnScout

VulnScout

AI-powered whitebox penetration testing for Claude Code.

One command. Full audit. Any codebase.

/whitebox-pentest:full-audit /path/to/code

VulnScout is a Claude Code plugin that turns Claude into an autonomous security reviewer. It brings battle-tested pentesting methodology (HTB Academy, OffSec AWAE/OSWE) into your terminal with STRIDE threat modeling, evidence-first findings, and support for 9 languages including Solidity smart contracts.

Tested end-to-end on OWASP Juice Shop v17.1.1 -- 62 findings across SQL injection, XSS, path traversal, SSTI, SSRF, hardcoded secrets, and more.

Why VulnScout?

Traditional SAST tools find patterns. VulnScout understands your application.

  • Automated scan pipeline -- Semgrep + Joern CPG + secret scanning in one command, with SARIF and Markdown output
  • Threat models first, then hunts -- STRIDE analysis identifies what matters before scanning
  • Traces data flow, not just patterns -- follows user input from source to sink across files and services
  • 15 CPG verification scripts -- proves exploitability through Code Property Graph analysis, not just pattern matching
  • CVSS 3.1 auto-scoring -- every finding gets a CVSS vector and numeric score
  • Handles massive codebases -- language-aware compression (Go: 97% reduction, Python: 90%) lets it audit million-token monorepos
  • Chains vulnerabilities -- finds SSRF-to-SSTI-to-RCE attack chains that single-pattern scanners miss
  • Polyglot-native -- audits Go + Python + TypeScript microservices as one interconnected system

Quick Start

As a Claude Code plugin

# Option 1: Symlink into your project's plugin directory
mkdir -p .claude/plugins
ln -s /path/to/vuln-scout/whitebox-pentest .claude/plugins/whitebox-pentest

# Option 2: Copy into your project
cp -r /path/to/vuln-scout/whitebox-pentest .claude/plugins/whitebox-pentest

# Run a full audit
/whitebox-pentest:full-audit .

# Or start with threat modeling
/whitebox-pentest:threats

Note: .claude/plugins/ is relative to your project root. Claude Code automatically discovers plugins in this directory.

As a Kuzushi task

VulnScout is available as a native task in Kuzushi, the AI security scanner. When installed as a dependency, Kuzushi loads the vuln-scout plugin into its task DAG alongside Semgrep, CodeQL, and 15+ other detection tasks.

# Run vuln-scout as part of a Kuzushi scan
npx kuzushi /path/to/repo --tasks vuln-scout

# Combine with other tasks
npx kuzushi /path/to/repo --tasks semgrep,vuln-scout,threat-hunt

# Use a specific model for vuln-scout
npx kuzushi /path/to/repo --tasks vuln-scout --task-model vuln-scout=anthropic:claude-opus-4-6

# Configure via .kuzushi/config.json
# { "tasks": ["semgrep", "vuln-scout"], "taskConfig": { "vuln-scout": { "model": "anthropic:claude-opus-4-6", "maxFindings": 30 } } }

Kuzushi handles triage, verification, PoC generation, and reporting on top of vuln-scout's findings.

Standalone Scan Pipeline

VulnScout includes Python scripts that run independently of Claude Code:

# Scan with Semgrep + secret scanning
python3 scripts/scan_orchestrator.py /path/to/code --tools semgrep --secrets --format sarif

# Create a Joern CPG (cached by content hash)
python3 scripts/create_cpg.py /path/to/code

# Batch-verify findings with Joern CPG analysis
python3 scripts/batch_verify.py --findings .claude/findings.json --cpg .joern/*.cpg

# Render HTML or Markdown from an existing findings artifact
python3 scripts/report.py .claude/findings.json --format html --output security-report.html

# CI gate: fail on high-severity findings
python3 scripts/scan_orchestrator.py . --tools semgrep --fail-on high --format sarif --output findings.sarif

# Validate and run prompt/skill eval suites
python3 whitebox-pentest/scripts/validate_evals.py
python3 whitebox-pentest/scripts/run_prompt_evals.py

What You Get

13 Commands

Command What it does
/whitebox-pentest:full-audit One command does everything -- scopes, threat models, audits, reports
/whitebox-pentest:threats STRIDE threat modeling with data flow diagrams
/whitebox-pentest:sinks Find dangerous functions across 9 languages
/whitebox-pentest:trace Follow data from source to sink
/whitebox-pentest:scan Run Semgrep, CodeQL, and Joern into a shared findings artifact
/whitebox-pentest:scope Handle large codebases with smart compression
/whitebox-pentest:propagate Found one bug? Find every instance of the pattern
/whitebox-pentest:verify CPG-based false positive elimination
/whitebox-pentest:report Render Markdown, JSON, SARIF, or HTML from the shared findings artifact
/whitebox-pentest:diff Compare security posture between git refs and highlight regressions
/whitebox-pentest:auto-fix Auto-remediate verified findings with generated patches
/whitebox-pentest:create-rule Generate a custom Semgrep rule from a confirmed vulnerability pattern
/whitebox-pentest:mutate Mutation-test security controls to find detection gaps

8 Autonomous Agents

Agents run independently and return detailed analysis:

  • app-mapper -- Maps architecture and trust boundaries
  • threat-modeler -- STRIDE analysis and data flow diagrams (consumes app-mapper output)
  • code-reviewer -- Proactive vulnerability identification
  • local-tester -- Dynamic testing guidance (hands off to poc-developer)
  • poc-developer -- Proof of concept development
  • patch-advisor -- Specific remediation with code patches
  • false-positive-verifier -- Evidence-based verification with NEEDS_REVIEW resolution path
  • attack-researcher -- Autonomous attack vector exploration beyond pattern matching

15 Joern CPG Verification Scripts

Each script proves or disproves a vulnerability through Code Property Graph data flow analysis:

Script What it verifies
verify-sqli SQL injection (parameterization, binding APIs)
verify-cmdi Command injection (shell vs array execution)
verify-xss Cross-site scripting (encoding, Content-Type)
verify-path Path traversal (strong vs weak normalization)
verify-ssrf Server-side request forgery (URL validation, allowlists)
verify-xxe XML external entity injection (entity disabling)
verify-ssti Server-side template injection (filesystem vs user templates)
verify-deser Unsafe deserialization (SafeLoader, ObjectInputFilter)
verify-ldap LDAP injection (filter escaping)
verify-randomness Insecure randomness (crypto alternatives)
verify-reentrancy Solidity reentrancy (CEI pattern)
verify-overflow Solidity integer overflow (SafeMath, Solidity >=0.8)
verify-access-control Solidity missing access control (onlyOwner, tx.origin)
verify-delegatecall Solidity delegatecall risks (proxy patterns, EIP-1967)
verify-generic Fallback for types without a dedicated script

27 Auto-Activated Skills

Skills activate automatically when relevant -- no configuration needed:

Core Analysis: dangerous-functions, vuln-patterns, data-flow-tracing, cpg-analysis, exploit-techniques

OWASP Mapping: security-misconfiguration, cryptographic-failures, logging-failures, exception-handling, sensitive-data-leakage, business-logic

Advanced: threat-modeling, vulnerability-chains, cross-component, cache-poisoning, postmessage-xss, sandbox-escapes, framework-patterns, nextjs-react

Infrastructure: workspace-discovery, mixed-language-monorepos, owasp-2025, secret-scanning

Extended Coverage: ai-ml-attacks, owasp-api-top10, cloud-native, compliance-mapping

Framework Security Patterns

Dedicated detection patterns for:

  • Django -- ORM bypass, template injection, CSRF exemptions, settings exposure
  • Rails -- Mass assignment, SQL interpolation, ERB injection, Marshal.load
  • Spring Security -- SpEL injection, CORS/CSRF misconfiguration, actuator exposure
  • GraphQL -- Introspection, depth/complexity limits, batching, nested auth bypass
  • Next.js/React -- Server Actions SSRF, middleware bypass, Server Component data exposure
  • Flask/Twig/Blade -- SSTI, filter callbacks, sandbox escapes

Supported Languages

Language Token Reduction Static Analysis CPG Verification
Go 95-97% fewer tokens Semgrep, Joern Yes
TypeScript/JS ~80% fewer tokens Semgrep, CodeQL Yes
Python 85-90% fewer tokens Semgrep, Joern Yes
Java 80-85% fewer tokens Semgrep, CodeQL Yes
PHP 80-85% fewer tokens Semgrep Yes
Ruby 85-90% fewer tokens Semgrep Yes
Rust 85-90% fewer tokens Semgrep --
C#/.NET 80-85% fewer tokens Semgrep, CodeQL --
Solidity 70-80% fewer tokens Semgrep, Slither Yes (4 scripts)

OWASP Top 10 Mapping

# OWASP Top 10 Coverage Primary Skills
A01 Broken Access Control Covered business-logic, owasp-api-top10
A02 Cryptographic Failures Covered cryptographic-failures
A03 Injection Covered vuln-patterns, dangerous-functions
A04 Insecure Design Covered business-logic, threat-modeling
A05 Security Misconfiguration Covered security-misconfiguration, cloud-native
A06 Vulnerable and Outdated Components Out of scope --
A07 Identification and Authentication Failures Covered vuln-patterns
A08 Software and Data Integrity Failures Covered vuln-patterns, ai-ml-attacks
A09 Security Logging and Monitoring Failures Covered logging-failures, sensitive-data-leakage
A10 Server-Side Request Forgery Covered vuln-patterns, framework-patterns, cloud-native

9/10 categories covered. A06 is intentionally out of scope -- VulnScout focuses on source review and exploitability, not dependency inventory.

Findings Artifact and CI Workflow

/scan, /verify, and /full-audit share one contract: .claude/findings.json.

  • schema_version identifies the artifact version.
  • kind separates reportable finding entries from audit hotspot pivots.
  • stable_key gives each entry a suppression-safe identifier.
  • source_tool and evidence are required on every entry.
  • cvss_vector and cvss_score provide CVSS 3.1 scoring.
  • Severity summaries count only unsuppressed entries where kind == "finding".

Prompt-first orchestration adds two persisted companion artifacts:

  • .claude/audit-plan.md captures module priority, attack surfaces, and verification strategy before deep-dive auditing.
  • .claude/review-ledger.json records adversarial review rounds for audit plans, threat models, and finding verification.

CI-focused flags:

--since-commit <sha>     # Scope to recent changes
--suppressions <path>    # Apply .vuln-scout-ignore suppressions
--fail-on <severity>     # Exit code 2 when blocking findings remain
--format sarif|json|md   # Machine-readable or human-readable output
--secrets                # Enable gitleaks/trufflehog secret scanning

How It Works

/full-audit automatically:

1. Measures codebase    -->  Too big? Compresses with language-aware strategy
2. Detects frameworks   -->  Next.js, Flask, Spring, Rails, Django, Solidity...
3. Threat models        -->  STRIDE analysis, DFDs, trust boundaries
4. Plans the audit      -->  Writes `.claude/audit-plan.md` before deep dives
5. Adversarial review   -->  Writes `.claude/review-ledger.json` for threat/finding review loops
6. Scans (Semgrep)      -->  Pattern matching + taint analysis
7. Verifies (Joern)     -->  CPG data flow proof per finding
8. Chains findings      -->  Connects SSRF + SSTI + RCE across services
9. Reports              -->  Markdown, JSON, or SARIF with CVSS scores

Polyglot Monorepos

Got a Go gateway, Python ML service, and TypeScript frontend? VulnScout handles it:

/whitebox-pentest:full-audit ~/code/platform

Polyglot detected: Go (450 files) + Python (380) + TypeScript (420)

Findings by Service:
  auth-service (Go):        2 CRITICAL, 1 HIGH
  api-gateway (Go):         1 HIGH, 2 MEDIUM
  ml-pipeline (Python):     1 CRITICAL, 2 HIGH
  web-frontend (TypeScript): 3 MEDIUM

Cross-Service Findings:
  Auth token not validated in ml-pipeline (CRITICAL)
  Error messages leak from Python to Gateway (MEDIUM)

Vulnerability Coverage

  • Injection: SQL, Command, LDAP, Template (SSTI), XXE
  • Authentication: Bypass, Session attacks, JWT flaws
  • Access Control: IDOR, Privilege escalation, BOLA (API)
  • Business Logic: Workflow bypass, state manipulation, trust boundary violations
  • Cryptography: Weak algorithms, hardcoded secrets, insecure randomness
  • Deserialization: Java, PHP, Python, .NET, ML pipeline (joblib/torch.load)
  • API Security: GraphQL depth attacks, mass assignment, gRPC reflection
  • Cloud Native: IMDS endpoints, S3 misconfiguration, K8s RBAC, serverless env leakage
  • Race Conditions: TOCTOU, double-spend attacks
  • Data Leakage: Credentials in logs, error exposure, secret scanning (git history)
  • Smart Contracts: Reentrancy, integer overflow, access control, delegatecall, flash loans
  • Compliance: PCI-DSS, HIPAA, SOC 2, NIST CSF mapping

Prerequisites

Required:

npm install -g repomix    # Codebase compression for large repos

Recommended (enhances scanning):

pip install semgrep       # Pattern matching + taint analysis

Optional (deepens analysis):

# Joern CPG analysis (data flow verification)
curl -L "https://github.com/joernio/joern/releases/latest/download/joern-install.sh" | bash

# Secret scanning (git history)
brew install gitleaks     # or: pip install trufflehog

# Solidity analysis
pip install slither-analyzer

Methodology

VulnScout implements methodologies from:

  • HTB Academy -- Whitebox Pentesting Process (4-phase)
  • OffSec AWAE -- Advanced Web Attacks and Exploitation (WEB-300)
  • NahamSec -- Deep application understanding and business logic focus

"Understanding the application deeply will always beat automation."

The plugin supports two complementary approaches:

  1. Sink-First -- Find dangerous functions, trace data flow backward
  2. Understanding-First -- Map the application, then hunt with context

Both work together. Understanding reveals business logic bugs that sink scanning misses.

Diff-Aware Scanning

Scope audits to recent changes or PR diffs for fast CI feedback:

# Scan only files changed since a known base
/whitebox-pentest:full-audit . --since-commit origin/main

# Prioritize modules with recent changes
/whitebox-pentest:full-audit . --recent 7

# Headless PR gate: diff scan, JSON output, no prompts
/whitebox-pentest:full-audit . --since-commit origin/main --quick --json --no-interactive

# Incremental Semgrep scan of changed files
/whitebox-pentest:scan . --since-commit origin/main --format sarif --fail-on high

--diff-base remains as a backward-compatible alias for older automation.

Dynamic Verification

Optionally execute generated PoC scripts to confirm exploitability:

# Audit with dynamic PoC verification (requires explicit approval per PoC)
/whitebox-pentest:full-audit . --verify-dynamic

Safety-first: PoCs run in --dry-run mode by default, require user confirmation, have a 30s timeout, and must include cleanup functions.

Project Structure

whitebox-pentest/
  .claude-plugin/plugin.json   # Plugin manifest
  agents/                       # 8 autonomous security analysts
  commands/                     # 13 slash commands
  hooks/                        # 4 background automation hooks
  skills/                       # 27 auto-activated knowledge modules
  evals/                        # Prompt/skill trigger and workflow eval definitions
  scripts/
    scan_orchestrator.py        # Main scan pipeline
    run_semgrep.py              # Semgrep wrapper + normalizer
    run_secrets.py              # Secret scanner (gitleaks/trufflehog)
    create_cpg.py               # Joern CPG creation + caching
    batch_verify.py             # Batch CPG verification
    bundle_joern.py             # Script bundler for Joern compatibility
    markdown_report.py          # Report generator
    artifact_utils.py           # Findings schema, SARIF, CVSS, dedup
    prompt_artifacts.py         # Audit plan, review ledger, and state validators
    validate_evals.py           # Prompt eval definition validator
    run_prompt_evals.py         # Prompt/skill benchmark runner
    tool_runners/               # Modular tool runner package
    joern/                      # 15 CPG verification scripts

License

MIT

Yorumlar (0)

Sonuc bulunamadi