🧨 AI Pentest Playbook

The field manual for pentesting AI chatbots & LLM-powered apps

🚀 Jump straight to the Payload Library →

🎯 Working on an AI/LLM Security Engagement?

Use this handbook as a practical field guide across the whole assessment lifecycle: from initial reconnaissance against chatbots and AI-powered applications through testing of prompt injection, jailbreaks, tool abuse, agent manipulation, and data exfiltration in systems built on MCP, RAG, and computer-use agents.

Each chapter combines battle-tested attack payloads with clear guidance on expected success indicators, severity considerations, detection opportunities, and remediation recommendations. This enables both offensive and defensive teams to understand not only how an attack works, but also how to identify, mitigate, and prevent it.

The handbook covers all ten categories of the OWASP Top 10 for Large Language Model Applications (the chapter list uses the 2023 numbering, LLM01 through LLM10) and also explores emerging attack surfaces that current industry frameworks do not yet cover.

Content is curated from public research, vendor disclosures, CVEs, academic papers, real-world incidents, and active red-team engagements.

⚠️ For authorized testing only.

👉 Start Here — Open the Payload Library Master Index → `PAYLOADS.md`

Every payload on one page, grouped by attack class — copy-paste ready, full sets one click away. No digging through folders; it's all reachable from the master index.

Attack class	Attack class
Prompt Injection	Insecure Output Handling
Jailbreaks	Code-Interpreter RCE
System-Prompt Extraction	Training-Data & Memory Extraction
Encoding / Obfuscation Bypass	Model Denial-of-Service
Indirect & Multi-Modal Injection	Agent / Tool Abuse

For technique-level coverage with detection and mitigation, browse the chapters below.

Chapters

Foundations

#	Chapter
01	Recon & Fingerprinting — model ID, system-prompt detection, tool & architecture inference

OWASP LLM Top 10

#	Chapter	OWASP
02	Prompt Injection	LLM01
03	Insecure Output Handling	LLM02
04	Training Data Poisoning	LLM03
05	Model Denial of Service	LLM04
06	Supply Chain Vulnerabilities	LLM05
07	Sensitive Information Disclosure	LLM06
08	Insecure Plugin / Tool Design	LLM07
09	Excessive Agency	LLM08
10	Overreliance / Hallucination Exploitation	LLM09
11	Model Theft / Extraction	LLM10

Beyond OWASP

#	Chapter
12	Jailbreaking Techniques — DAN, STAN, AIM, crescendo, many-shot, token smuggling
13	MCP / Agentic Attack Surface — tool poisoning, MCPoison, rug-pull MCP
14	RAG & Vector Store Attacks — retrieval poisoning, embedding inversion
15	Indirect Prompt Injection — web, document, email injection
16	Tools & Automation — Garak, PyRIT, Burp, MCP inspector

Frontier Vectors

#	Chapter
17	A2A Protocol Attacks — agent-to-agent spoofing, capability inflation
18	Computer-Use Agent Attacks — browser-use, Operator, screen-injection
19	Sycophancy Exploitation — confidence flips, reward-model gaming
20	Memory Poisoning — persistent context attacks, cross-session bleed
21	Function-Calling Abuse — schema injection, parallel-tool races
22	Voice / Audio Assistant Attacks — cloning, replay, ultrasonic injection

Payload Library — by goal

Full one-page index with inline top picks: `PAYLOADS.md`

Goal	Chapter	Payload set
Extract system instructions	07	`system_prompt_extraction.md`
Inject / override instructions	02	`prompt_injection.md`
Bypass safety policy	12	`jailbreaks.md`
Encoding / obfuscation bypass	12	`encoding_bypass.md`
Attack via planted content (web / docs / RAG / media)	15	`indirect_injection.md`
Exploit downstream renderer (XSS / SSTI / SQLi / RCE)	03	`insecure_output_handling.md`
Escalate a code / Python tool to RCE	08	`code_interpreter_rce.md`
Steal training data / memory / PII	07	`data_extraction.md`
Exhaust resources / drain budget	05	`model_dos.md`
Abuse agent actions / SSRF / tools	09	`agent_tool_abuse.md`
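
If you prefer to resolve a payload set from a script rather than by eye, the table above can be expressed as a small lookup. This is a minimal sketch, not part of the repository: `PAYLOAD_SETS` and `find_payload_sets` are hypothetical names, and the code assumes each payload file sits directly under `payloads/`, which the repository layout suggests but does not confirm.

```python
from pathlib import Path

# Goal -> (chapter number, payload file), copied from the table above.
PAYLOAD_SETS = {
    "Extract system instructions": ("07", "system_prompt_extraction.md"),
    "Inject / override instructions": ("02", "prompt_injection.md"),
    "Bypass safety policy": ("12", "jailbreaks.md"),
    "Encoding / obfuscation bypass": ("12", "encoding_bypass.md"),
    "Attack via planted content (web / docs / RAG / media)": ("15", "indirect_injection.md"),
    "Exploit downstream renderer (XSS / SSTI / SQLi / RCE)": ("03", "insecure_output_handling.md"),
    "Escalate a code / Python tool to RCE": ("08", "code_interpreter_rce.md"),
    "Steal training data / memory / PII": ("07", "data_extraction.md"),
    "Exhaust resources / drain budget": ("05", "model_dos.md"),
    "Abuse agent actions / SSRF / tools": ("09", "agent_tool_abuse.md"),
}


def find_payload_sets(keyword, root=Path("payloads")):
    """Return (goal, chapter, path) for every goal containing the keyword.

    Matching is a case-insensitive substring test on the goal text.
    The root directory is an assumption; adjust it to the real file location.
    """
    needle = keyword.lower()
    return [
        (goal, chapter, root / filename)
        for goal, (chapter, filename) in PAYLOAD_SETS.items()
        if needle in goal.lower()
    ]


if __name__ == "__main__":
    for goal, chapter, path in find_payload_sets("bypass"):
        print(f"chapter {chapter}: {goal} -> {path.as_posix()}")
```

Substring matching is deliberately loose, so a short keyword can match more goals than intended; for example, `rce` also matches "resources". Prefer a full word from the goal column.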

Repository Layout

AI-Pentest-Playbook/
├── PAYLOADS.md     ⭐ one-page payload index — start here
├── payloads/       the payloads, grouped by attack class
├── docs/           technique chapters (detection + mitigation)
├── scripts/        runner · burp-export · master-csv
├── README.md
└── CONTRIBUTING.md

💜 Found This Useful? Give Back.

This handbook gets sharper every time a real-world finding is distilled back into reusable knowledge.

🛠 Got a new payload, defence pattern, or attack surface that should be a chapter?

Most payload PRs merge within a day — new chapters and defensive notes are very welcome. A ⭐ helps other AppSec / red-team folks find the handbook.

🐛 Spotted something wrong, missing, or outdated? → Open an issue. Accuracy beats completeness.

Citation

@misc{aihackershandbook,
  author       = {4vanish and contributors},
  title        = {AI Hacker's Handbook: A playbook for pentesting AI chatbots and LLM-powered applications},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/4vanish/AI-Pentest-Playbook}}
}

Author

Built and maintained by Avanish Pathak.