numasec

agent
Security Audit
Warning
Health Passed
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 26 GitHub stars
Code Warning
  • process.env — Environment variable access in agent/github/index.ts
  • network request — Outbound network request in agent/github/index.ts
Permissions Passed
  • Permissions — No dangerous permissions requested
Purpose
This tool is an AI-driven penetration testing agent that runs in your terminal. It uses a team of specialized AI models to automate vulnerability scanning, chain attacks together, and generate security reports for a given target.

Security Assessment
The overall risk is Medium. As an offensive security tool, it is explicitly designed to execute shell commands and make extensive outbound network requests against user-provided targets. The automated scan also flagged environment variable access in agent/github/index.ts, most likely used to read API keys for the Large Language Models (Claude, GPT-4, etc.) it integrates with. There are no hardcoded secrets, and the tool does not request inherently dangerous host system permissions. However, because it executes attacks autonomously, misuse or misconfiguration could easily result in attacks on, or damage to, systems you are not authorized to test.

Quality Assessment
The project is in excellent health and actively maintained, with repository updates pushed as recently as today. It is fully open source under the standard MIT license. Community trust is currently low-to-moderate, boasting 26 GitHub stars, but this is typical for a specialized cybersecurity utility. It provides flexible installation options and includes automated CI workflows for building and testing.

Verdict
Use with caution. The application itself is safe to install, but users must direct its automated offensive capabilities only at systems they have explicit authorization to test.
SUMMARY

Your AI Cyber Security companion. Open source. Runs in the terminal.

README.md

numasec

AI pentester that actually finds vulnerabilities. Open source. Runs in your terminal.

[Screenshot: numasec running a pentest against OWASP Juice Shop]

Badges: MIT License · Python 3.11+ · Build · Release · PyPI

96% vulnerability recall on OWASP Juice Shop · 10 specialized agents · 21 security tools · PTES methodology


Quickstart

pip install numasec
numasec

Or with Docker:

docker run -it francescosta/numasec

Or from source:

curl -fsSL https://numasec.dev/install | bash

Type /target https://yourapp.com and watch it work. The AI scans, finds vulnerabilities, chains attacks together, and writes the report. You watch, approve, and steer.

Works with Claude, GPT-4, Gemini, DeepSeek, Mistral, or any OpenAI-compatible model.


Why numasec

Most "AI security tools" wrap a single scanner and call it AI. numasec is different — it's a team of 10 specialized agents running 21 offensive security tools through an actual penetration testing methodology.

It doesn't just find vulnerabilities. It chains them: a leaked API key in JavaScript → SSRF → cloud metadata → account takeover. Then it writes a professional report with CVSS scores, CWE IDs, OWASP categories, and remediation guidance.

Benchmarked against real targets:

| Target | Vulnerabilities Found | Coverage |
|---|---|---|
| OWASP Juice Shop v17 | 25/26 ground-truth vulns | 96% recall |
| DVWA | 7/7 vulnerability categories | 100% |
| WebGoat | 20+ vulnerabilities across all modules | Full coverage |
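
For readers who want to check the headline number: recall is the share of known (ground-truth) vulnerabilities that were actually found. The small sketch below only reproduces the arithmetic from the table; the function name is illustrative and is not part of numasec.

```python
def recall(found: int, ground_truth_total: int) -> float:
    """Fraction of ground-truth vulnerabilities that were detected."""
    if ground_truth_total <= 0:
        raise ValueError("ground_truth_total must be positive")
    return found / ground_truth_total


juice_shop = recall(25, 26)  # figures from the Juice Shop row
dvwa = recall(7, 7)          # figures from the DVWA row

print(f"Juice Shop recall: {juice_shop:.1%}")  # 96.2%
print(f"DVWA recall: {dvwa:.0%}")              # 100%
```

Recall says nothing about false positives; that would be measured separately as precision.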

What it finds

Injection

  • SQL injection (blind, time-based, union, error-based)
  • NoSQL injection
  • OS command injection
  • Server-Side Template Injection
  • XXE injection
  • GraphQL introspection & injection

Authentication & Access

  • JWT attacks (alg:none, weak HS256, kid traversal)
  • OAuth misconfiguration
  • Default credentials & password spray
  • IDOR
  • CSRF
  • Privilege escalation
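
To make the "alg:none" item concrete: a JWT header is just base64url-encoded JSON, and a token whose header declares `"alg": "none"` carries no signature at all. The sketch below uses only the Python standard library to build such a token and to flag it. It is an illustration of the idea, not numasec's scanner code, and all function names are made up for this example.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding before decoding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_header(token: str) -> dict:
    """Decode the first dot-separated segment of a JWT."""
    return json.loads(b64url_decode(token.split(".")[0]))


def declares_alg_none(token: str) -> bool:
    """True if the header asks for no signature (case-insensitive)."""
    return str(jwt_header(token).get("alg", "")).lower() == "none"


# Build an unsigned sample token: header.payload. (empty signature)
header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "demo"}).encode())
unsigned_token = f"{header}.{payload}."

print(declares_alg_none(unsigned_token))  # True
```

A server is only vulnerable if it accepts such a token; this check inspects the token, not the server's behaviour.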

Client & Server Side

  • XSS (reflected, stored, DOM)
  • SSRF with cloud metadata detection
  • CORS misconfiguration
  • Path traversal / LFI
  • Open redirect
  • HTTP request smuggling
  • Race conditions
  • File upload bypass

Every finding is auto-enriched with CWE ID, CVSS 3.1 score, OWASP Top 10 category, MITRE ATT&CK technique, and actionable remediation guidance.
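
The enrichment step can be pictured as a lookup keyed on the CWE ID. The sketch below is a guess at the shape of that step, not numasec's actual data model: the `Finding` fields, the `ENRICHMENT` table, and the choice of ATT&CK technique are assumptions made for illustration. The CWE to OWASP Top 10 (2021) pairings shown are the standard ones.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; the real finding schema is not shown in this README.
    title: str
    cwe_id: str
    cvss_score: float
    owasp_category: Optional[str] = None
    attack_technique: Optional[str] = None


# Illustrative table: CWE ID -> (OWASP Top 10 2021 category, MITRE ATT&CK technique).
ENRICHMENT = {
    "CWE-89": ("A03:2021 Injection", "T1190"),
    "CWE-22": ("A01:2021 Broken Access Control", "T1190"),
    "CWE-918": ("A10:2021 Server-Side Request Forgery", "T1190"),
}


def enrich(finding: Finding) -> Finding:
    """Return a copy with OWASP and ATT&CK fields filled in when the CWE is known."""
    mapping = ENRICHMENT.get(finding.cwe_id)
    if mapping is None:
        return finding  # unknown CWE: leave the finding untouched
    owasp, technique = mapping
    return replace(finding, owasp_category=owasp, attack_technique=technique)


sqli = enrich(Finding(title="SQL injection in login form", cwe_id="CWE-89", cvss_score=9.8))
print(sqli.owasp_category, sqli.attack_technique)
```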


Multi-Agent Architecture

numasec isn't a single bot — it's a coordinated team of specialized agents, each with distinct roles and permissions:

Primary Agents

| Agent | Role | What it does |
|---|---|---|
| 🔴 pentest | Full PTES methodology | Recon → Discovery → Vuln Assessment → Exploitation → Reporting |
| 🔵 recon | Intelligence gathering | Port scanning, fingerprinting, subdomain enum, service probing — no exploitation |
| 🟠 hunt | OWASP Top 10 hunter | Systematic, aggressive testing across all 10 OWASP categories |
| 🟡 review | Secure code review | Static analysis of source code, diffs, commits, PRs |
| 🟢 report | Report & findings | Finding management, severity validation, report generation |

Subagents

| Agent | Role |
|---|---|
| scanner | Executes automated vulnerability scans (passive → semi-active → active) |
| analyst | Validates results, eliminates false positives, correlates attack chains |
| reporter | Generates SARIF / Markdown / HTML / JSON reports |
| explore | CVE research, exploit documentation, knowledge base queries |

Each agent has tailored permissions — the recon agent can't run exploits, the review agent can't launch scanners. The analyst agent filters false positives using strict evidence criteria before any finding enters the report.
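
One simple way to picture per-agent permissions is an allow-list of capabilities per agent, checked before any tool runs. The sketch below assumes that model; the `Capability` names and the exact sets are hypothetical. Only two rules come from the text above: recon cannot exploit, and review cannot launch scanners.

```python
from enum import Enum, auto


class Capability(Enum):
    # Hypothetical capability names, chosen for this example only.
    RECON = auto()
    SCAN = auto()
    EXPLOIT = auto()
    READ_CODE = auto()


# Assumed allow-lists; the real permission sets are not documented here.
AGENT_CAPABILITIES = {
    "pentest": {Capability.RECON, Capability.SCAN, Capability.EXPLOIT},
    "recon": {Capability.RECON},
    "review": {Capability.READ_CODE},
}


def is_allowed(agent: str, capability: Capability) -> bool:
    """Deny by default: unknown agents get no capabilities."""
    return capability in AGENT_CAPABILITIES.get(agent, set())


print(is_allowed("recon", Capability.EXPLOIT))   # False
print(is_allowed("review", Capability.SCAN))     # False
print(is_allowed("pentest", Capability.EXPLOIT)) # True
```

Denying by default means a newly added agent can do nothing until it is granted capabilities explicitly.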


Security Tooling

21 purpose-built security tools and 38 async scanners under the hood — covering reconnaissance, injection testing, authentication attacks, access control, file upload bypass, race conditions, request smuggling, out-of-band detection, and more. The AI selects and orchestrates them automatically based on what it discovers about your target.

A built-in knowledge base of 34 templates covers detection patterns, exploitation techniques, payloads, and remediation — so the AI doesn't hallucinate attack methodology, it looks it up. Extensible with your own templates and plugins.


Reports

Four output formats, all auto-generated:

| Format | Use case |
|---|---|
| SARIF | Drop into GitHub Code Scanning, GitLab SAST, or any SARIF viewer |
| HTML | Self-contained report to share with your team |
| Markdown | Paste into tickets, docs, or wikis |
| JSON | Feed into your pipeline or dashboard |
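
For orientation, a SARIF 2.1.0 file is a JSON document with a `version` and a list of `runs`, each holding a `tool` description and its `results`. The sketch below builds a minimal document of that shape from plain dictionaries. The input keys (`cwe_id`, `level`, `title`, `url`) are assumptions for this example, and the output numasec actually emits is likely richer.

```python
import json


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 document from simple finding dicts."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["cwe_id"],
            # SARIF result levels are "none", "note", "warning", or "error".
            "level": finding["level"],
            "message": {"text": finding["title"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["url"]}
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "numasec"}},
            "results": results,
        }],
    }


sample = to_sarif([{
    "cwe_id": "CWE-89",
    "level": "error",
    "title": "SQL injection in login form",
    "url": "https://example.test/login",  # placeholder target
}])
print(json.dumps(sample, indent=2))
```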

Every report includes an executive summary with risk score (0-100), severity breakdown, OWASP coverage matrix, attack chain documentation, and per-finding remediation.
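
The severity breakdown can be derived from CVSS 3.1 base scores using the standard qualitative scale (None 0.0, Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0). The sketch below shows that bucketing. How the 0-100 risk score is computed is not described in this README, so it is deliberately left out.

```python
from collections import Counter


def cvss_severity(score: float) -> str:
    """Map a CVSS 3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def severity_breakdown(scores: list[float]) -> Counter:
    """Count findings per severity bucket."""
    return Counter(cvss_severity(score) for score in scores)


breakdown = severity_breakdown([9.8, 7.5, 5.3, 5.3, 3.1])
for level in ("Critical", "High", "Medium", "Low"):
    print(level, breakdown[level])
```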


OWASP Top 10 Coverage

The TUI header tracks real-time testing coverage across all 10 OWASP categories as the pentest progresses. Each category is automatically mapped to the relevant tools — so you always know what's been tested and what's left.
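
A coverage tracker of this kind can be sketched as a mapping from each category to the tools that exercise it. The version below assumes a category counts as tested once at least one of its mapped tools has run. The tool names are invented, only three categories are shown, and the category labels follow the OWASP Top 10 2021 edition, which is an assumption since the README does not name an edition.

```python
# Hypothetical category -> tool mapping (tool names are made up for this example).
CATEGORY_TOOLS = {
    "A01 Broken Access Control": {"idor_test", "path_traversal_test"},
    "A03 Injection": {"sqli_test", "xss_test"},
    "A10 Server-Side Request Forgery": {"ssrf_test"},
}


def coverage(executed_tools: set[str]) -> dict[str, bool]:
    """Mark a category as covered if any of its mapped tools has run."""
    return {
        category: bool(tools & executed_tools)
        for category, tools in CATEGORY_TOOLS.items()
    }


def coverage_percent(executed_tools: set[str]) -> float:
    """Share of categories with at least one executed tool, in percent."""
    status = coverage(executed_tools)
    return 100.0 * sum(status.values()) / len(status)


ran = {"sqli_test", "ssrf_test"}
for category, covered in coverage(ran).items():
    print(f"{'[x]' if covered else '[ ]'} {category}")
print(f"{coverage_percent(ran):.0f}% of categories touched")
```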


Installation

pip (recommended)

pip install numasec
numasec

Downloads the TUI binary automatically on first run. No Bun, Node, or other runtime needed.

Docker

docker run -it francescosta/numasec

Full TUI + all 21 security tools. Multi-arch (amd64, arm64).

From source

curl -fsSL https://numasec.dev/install | bash

Or manually:

git clone https://github.com/FrancescoStabile/numasec.git
cd numasec
pip install -e ".[all]"    # Python backend
cd agent && bun install && bun run build  # TUI

Usage

numasec                  # Start interactive TUI

Slash Commands

| Command | Description |
|---|---|
| /target <url> | Set target and begin reconnaissance |
| /findings | List all discovered vulnerabilities |
| /report <format> | Generate report (markdown, html, sarif, json) |
| /coverage | Show OWASP Top 10 coverage matrix |
| /creds | List discovered credentials |
| /evidence <id> | Show evidence for a specific finding |
| /review | Security review of code changes |
| /init | Analyze app and create security profile |

Agent Modes

Switch between agents for different tasks:

  • pentest — full methodology, default
  • recon — reconnaissance only, no exploitation
  • hunt — aggressive OWASP Top 10 testing
  • review — secure code review (no network scanning)
  • report — finding management and deliverables

LLM Providers

| Provider | Models |
|---|---|
| Anthropic | Claude Opus, Sonnet, Haiku |
| OpenAI | GPT-4o, GPT-4, o1 |
| Google | Gemini Pro, Flash |
| AWS Bedrock | Claude, Llama |
| Azure OpenAI | GPT-4, GPT-4o |
| Mistral | Large, Medium |
| DeepSeek | V2, Coder |
| OpenRouter | Any model via aggregation |
| GitHub Copilot | Copilot models |
| Google Vertex | Gemini via Vertex |
| GitLab | GitLab models |

Development

pip install -e ".[all]"

# Tests (1273 unit + 3 benchmark suites)
pytest tests/ -v
pytest tests/ -m "not slow and not benchmark"   # fast run

# Lint & type check
ruff check numasec/
ruff format numasec/
mypy numasec/

# TypeScript TUI
cd agent && bun install
cd packages/numasec && bun run typecheck
cd packages/numasec && bun test

Benchmarks

# Juice Shop (96% recall)
JUICE_SHOP_URL=http://localhost:3000 pytest tests/benchmarks/test_juice_shop.py -v

# DVWA (100% coverage)
DVWA_TARGET=http://localhost:8080 pytest tests/benchmarks/test_dvwa.py -v

# WebGoat
WEBGOAT_TARGET=http://localhost:8081/WebGoat pytest tests/benchmarks/test_webgoat.py -v

Extend with plugins

Drop a Python file with a register(registry) function into ~/.numasec/plugins/ or a YAML scanner template into ~/.numasec/templates/.
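
As a rough idea of what such a plugin file might look like: the only part confirmed above is that the file exposes a `register(registry)` function. The registry's interface is not documented in this README, so `add_scanner` and the `DemoRegistry` stand-in below are hypothetical and exist only to make the sketch runnable on its own. Check the project source for the real method names.

```python
class DemoRegistry:
    """Stand-in for whatever object numasec passes to register(). Hypothetical."""

    def __init__(self) -> None:
        self.scanners = {}

    def add_scanner(self, name, func) -> None:  # assumed method name
        self.scanners[name] = func


def check_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the recommended response headers that are missing."""
    recommended = (
        "content-security-policy",
        "strict-transport-security",
        "x-content-type-options",
    )
    present = {name.lower() for name in headers}
    return [name for name in recommended if name not in present]


def register(registry) -> None:
    """Entry point the plugin loader is expected to call."""
    registry.add_scanner("missing-security-headers", check_security_headers)


# Local smoke test using the stand-in registry.
registry = DemoRegistry()
register(registry)
scanner = registry.scanners["missing-security-headers"]
print(scanner({"Content-Security-Policy": "default-src 'self'"}))
```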


How it works

┌─────────────────────────────────────────────────────────────┐
│                        Terminal TUI                         │
│  (TypeScript/Bun • SolidJS reactive UI • 5 agent modes)     │
└────────────────────────────┬────────────────────────────────┘
                             │ 
┌────────────────────────────▼────────────────────────────────┐
│                    Security Engine                          │
│  ┌─────────────┐  ┌───────────────┐  ┌───────────────────┐  │
│  │ 21 Security │  │ 34 Knowledge  │  │  Session Store    │  │
│  │ Tools       │  │ Base Templates│  │                   │  │
│  └──────┬──────┘  └───────────────┘  └───────────────────┘  │
│         │                                                   │
│  ┌──────▼──────────────────────────────────────────────┐    │
│  │            38 Skills                                │    │
│  │  Injection · Auth · Access · Recon · Fuzzing        │    │
│  │  Client-side · Server-side · Out-of-band · ...      │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

The TUI drives the AI conversation. The AI calls security tools. Each tool orchestrates one or more async scanners. Findings are auto-enriched (CWE → CVSS → OWASP → MITRE ATT&CK), deduplicated, and grouped into attack chains. Reports are generated from the session store.
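
The deduplication step in that pipeline can be sketched as keying each finding on the fields that identify it and keeping the first occurrence. The key used here (URL, parameter, CWE ID) and the `Finding` fields are assumptions; numasec's own criteria may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields for illustration.
    url: str
    parameter: str
    cwe_id: str
    evidence: str


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep the first finding per (url, parameter, cwe_id), preserving order."""
    unique: dict[tuple[str, str, str], Finding] = {}
    for finding in findings:
        key = (finding.url, finding.parameter, finding.cwe_id)
        unique.setdefault(key, finding)  # later duplicates are ignored
    return list(unique.values())


raw = [
    Finding("https://example.test/search", "q", "CWE-79", "reflected payload A"),
    Finding("https://example.test/search", "q", "CWE-79", "reflected payload B"),
    Finding("https://example.test/login", "user", "CWE-89", "time delay observed"),
]
print(len(deduplicate(raw)))  # 2
```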

No hallucinated methodology. The knowledge base provides real detection patterns, exploitation techniques, and payloads. The deterministic planner (based on the CHECKMATE paper) selects tests based on detected technologies — no LLM involved in test selection.
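
To show what "deterministic" means here, the sketch below picks tests from a fixed rule table keyed on detected technologies, so the same input always yields the same plan. It is a plain lookup, not the CHECKMATE-based planner itself, and every technology and test name in it is invented for the example.

```python
# Hypothetical rule table: detected technology -> tests worth running.
TESTS_BY_TECHNOLOGY = {
    "graphql": ["graphql_introspection"],
    "jwt": ["jwt_alg_none", "jwt_weak_secret"],
    "mongodb": ["nosql_injection"],
    "mysql": ["sql_injection"],
}

# Tests assumed to apply to any web target.
BASELINE_TESTS = ["xss", "cors_misconfiguration"]


def plan_tests(detected_technologies) -> list[str]:
    """Return an ordered, duplicate-free test plan for the given technologies."""
    plan = list(BASELINE_TESTS)
    # Sorting the normalised names makes the output independent of input order.
    for tech in sorted({name.lower() for name in detected_technologies}):
        for test in TESTS_BY_TECHNOLOGY.get(tech, []):
            if test not in plan:
                plan.append(test)
    return plan


print(plan_tests(["JWT", "GraphQL"]))
```

Because no model is consulted, the plan can be unit-tested and diffed between runs.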


Built by Francesco Stabile.


MIT License
