adversarial-review

Security Audit: Warn

Health — Warn
  • License — Apache-2.0
  • Description — Repository has a description
  • Active repo — Last push today
  • Low visibility — Only 5 GitHub stars

Code — Warn
  • Code scan incomplete — No supported source files were scanned during the light audit

Permissions — Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is a Claude Code skill that performs adversarial AI code and plan reviews. It routes your code or architecture plans to a secondary AI (OpenAI Codex) tasked with finding flaws, iterating until the reviewer approves.

Security Assessment
Overall risk: Medium. The tool bridges two AI systems: your code and surrounding context are sent to both Anthropic and OpenAI, which means transmitting codebase diffs over the network to external APIs. Installation links a local repository folder into your Claude environment. No hardcoded secrets were found, but the tool requires you to supply an OpenAI API key via an environment variable. The light code scan could not complete because the repository contains no supported source files, so the automated audit cannot rule out hidden vulnerabilities; deeper vetting requires manual review. No dangerous local permissions are requested.

Quality Assessment
The repository is brand new and has very low visibility, indicated by only 5 GitHub stars. However, it is fully active, with the most recent push happening today. It benefits from a standard Apache-2.0 license, making it safe for commercial and private use. The project is lightweight by design, acting as a set of Markdown instructions rather than a complex application, which minimizes the attack surface.

Verdict
Use with caution — while the project is active and safely licensed, the low community trust, lack of automated code scanning, and requirement to transmit local code to external AI APIs warrant a closer look before integrating into sensitive workflows.
SUMMARY

Claude Code skill for adversarial AI code & plan review. One AI writes, another tears it apart. Iterative fix loop until approved.

README.md

Adversarial Review

Claude Code skill for adversarial AI code and plan review.

One AI writes the code. Another tears it apart. Iterate until approved.

What is this

Most AI code review tools validate your changes — "looks good, maybe add tests."
Adversarial review does the opposite: the reviewer defaults to skepticism
and tries to break confidence in the change. It looks for what will fail
in production, not what might be nice to improve.

This is a Claude Code skill
— a SKILL.md file plus a small references/runner.md that together
teach Claude how to run adversarial reviews through an external AI model
(currently OpenAI Codex).

Key features

  • Plan review — review the plan BEFORE writing code. Catch architecture
    mistakes, missing steps, and risks early
  • Code review — review the implementation. Bugs, security, data loss
  • Code-vs-plan — verify the implementation matches the plan
  • Iterative — Claude fixes issues based on reviewer feedback and resubmits
    for re-review. Up to 5 rounds until approved
  • Lightweight — SKILL.md plus one references/runner.md (thin runner
    subagent spec); no server, no broker, no external runtime deps. Compare
    with codex-plugin-cc:
    ~15 JS modules, App Server, JSON-RPC broker, lifecycle hooks

How it works

┌──────────┐     ┌───────────┐     ┌──────────┐
│  Claude  │────>│ Reviewer  │────>│  Claude  │
│  (code)  │     │  (Codex)  │     │  (fix)   │
└──────────┘     └───────────┘     └──────────┘
     ^                                 │
     │         ┌───────────┐           │
     └─────────│ Reviewer  │<──────────┘
               │(re-review)│
               └───────────┘
                     │
              VERDICT: APPROVED

Three modes

Mode          What it reviews                           When to use
plan          Implementation plan                       Before writing code
code          Git diff (unstaged, staged, or branch)    After writing code
code-vs-plan  Code changes against the plan             Verify implementation matches the plan

Mode is auto-detected from context, or you can force it with an argument.

Quick start

1. Prerequisites

Claude Code and
OpenAI Codex CLI must be installed.

Verify both are available:

claude --version   # Claude Code CLI
codex --version    # OpenAI Codex CLI (>= 0.115.0)

If Codex is missing: npm install -g @openai/codex

Authentication. Codex needs an OpenAI account. Either:

  • Sign in interactively: codex (opens browser)
  • Or set CODEX_API_KEY env var for non-interactive use
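
For scripted setups, a quick pre-flight check can fail fast when neither auth path is configured. A minimal sketch (the ~/.codex/auth.json location is an assumption about where interactive login stores credentials, and the function name is illustrative):

```shell
# Pre-flight: report whether Codex credentials are available.
# The auth.json path is an assumption; adjust to your Codex CLI version.
check_codex_auth() {
  if [ -n "${CODEX_API_KEY:-}" ] || [ -f "$HOME/.codex/auth.json" ]; then
    echo ok
  else
    echo missing
  fi
}
```

Run it before a headless review and abort on "missing" rather than letting codex fail mid-run.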

2. Install the skill

git clone https://github.com/dementev-dev/adversarial-review.git
mkdir -p ~/.claude/skills
ln -sfn "$(pwd)/adversarial-review" ~/.claude/skills/adversarial-review

Verify both the skill entry-point AND the runner subagent spec are in place:

ls -la ~/.claude/skills/adversarial-review/SKILL.md
ls -la ~/.claude/skills/adversarial-review/references/runner.md

Migrating from a previous install at ~/.agents/skills/: delete
the old symlink (rm ~/.agents/skills/adversarial-review) and
install at the new path above. Claude Code ≥ 2.x uses ~/.claude/skills/.

3. Add permissions

The skill runs git, codex exec, and writes temp files to /tmp.
Without pre-approved permissions, Claude Code will prompt for each action.

Where to add. Since the skill is installed globally
(~/.claude/skills/), permissions should go into the global config
so they work in any project:

Install scope          Config file
Global (recommended)   ~/.claude/settings.json
Single project         <project>/.claude/settings.local.json

Merge the following rules into the permissions.allow array of the
chosen config file:

// --- adversarial-review permissions ---
// Git: diff, status, branch detection, repo root, submodule check
"Bash(git diff*)",
"Bash(git status*)",
"Bash(git symbolic-ref*)",
"Bash(git rev-parse*)",
// Codex: initial launch (uses -C; prompt fed via cat | pipe for env portability)
"Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
// Codex: resume (cd prefix because resume has no -C flag; prompt via cat | pipe)
"Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
// Session-id filesystem fallback (POSIX: find -newer + grep -l for content-match)
"Bash(find ~/.codex/sessions*)",
// Diagnostic aid when filesystem fallback finds nothing
"Bash(ls -t ~/.codex/sessions*)",
// Temp files: prompts (initial + resume), plans, review output, JSONL stdout, stderr
"Write(/tmp/codex-plan-*)",
"Write(/tmp/codex-prompt-*)",
"Write(/tmp/codex-resume-prompt-*)",
"Read(/tmp/codex-review-*)",
"Read(/tmp/codex-stdout-*)",
"Read(/tmp/codex-stderr-*)",
// Archive failed-resume diagnostics before fresh exec overwrites them
"Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
"Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
// Cleanup
"Bash(rm -f /tmp/codex-*)",
// Main thread: write prompt body for the runner subagent (NEW in refactor)
"Write(/tmp/codex-body-*)",
// Main thread: read the structured JSON result returned by the runner (NEW)
"Read(/tmp/codex-runner-result-*)",
// Runner subagent (inherited): read the prompt body main wrote (NEW)
"Read(/tmp/codex-body-*)",
// Runner subagent (inherited): write the result JSON main reads (NEW)
"Write(/tmp/codex-runner-result-*)",
// Runner-spec discovery: tier 1 (user-scoped install)
"Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
// Runner-spec discovery: tier 2 (plugin-marketplace cache)
"Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
Full example (if the config file is empty or does not exist)
{
  "permissions": {
    "allow": [
      // adversarial-review
      "Bash(git diff*)",
      "Bash(git status*)",
      "Bash(git symbolic-ref*)",
      "Bash(git rev-parse*)",
      "Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
      "Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
      "Bash(find ~/.codex/sessions*)",
      "Bash(ls -t ~/.codex/sessions*)",
      "Write(/tmp/codex-plan-*)",
      "Write(/tmp/codex-prompt-*)",
      "Write(/tmp/codex-resume-prompt-*)",
      "Read(/tmp/codex-review-*)",
      "Read(/tmp/codex-stdout-*)",
      "Read(/tmp/codex-stderr-*)",
      "Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
      "Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
      "Bash(rm -f /tmp/codex-*)",
      "Write(/tmp/codex-body-*)",
      "Read(/tmp/codex-runner-result-*)",
      "Read(/tmp/codex-body-*)",
      "Write(/tmp/codex-runner-result-*)",
      "Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
      "Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
    ]
  }
}

Security note: The codex exec rule allows any codex exec invocation
wrapped in timeout 600. The skill only uses read-only mode (-s read-only),
but Claude Code's permission patterns are prefix-based and cannot enforce flag
constraints. If you prefer tighter control, omit the codex exec rule and
approve each invocation manually.
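
If you prefer to script the merge instead of hand-editing JSON, here is a sketch using jq (assumes jq is installed; the function name is illustrative, not part of the skill):

```shell
# Append a rule to permissions.allow, creating the array if absent and
# de-duplicating. Prints the merged JSON to stdout; redirect to apply.
merge_rule() {
  config=$1 rule=$2
  jq --arg rule "$rule" \
     '.permissions.allow = ((.permissions.allow // []) + [$rule] | unique)' \
     "$config"
}

# Example: merge_rule ~/.claude/settings.json 'Bash(git diff*)' > merged.json
```

Note that JSON does not allow comments, so strip the `// ...` comment lines from the rule list above before merging into a real settings file.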

4. Use

/adversarial-review                       # auto-detect mode
/adversarial-review plan                  # force plan review
/adversarial-review code                  # force code review
/adversarial-review path/to/f             # review a specific file
/adversarial-review xhigh                 # higher reasoning effort
/adversarial-review model:gpt-5.3-codex   # use a different model

Prompt architecture

The skill uses XML-structured prompts with adversarial stance:

  • <role> — adversarial reviewer, defaults to skepticism
  • <operating_stance> — break confidence, not validate
  • <attack_surface> — concrete checklist: auth, data integrity,
    race conditions, rollback safety, schema drift, error handling, observability
  • <finding_bar> — every finding must answer 4 questions:
    what can go wrong, why vulnerable, impact, recommendation
  • <scope_exclusions> — no style, naming, or speculative comments
  • <calibration> — one strong finding > five weak ones
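
Put together, the sections above imply a prompt skeleton along these lines (an illustrative sketch only; the skill's actual prompt text lives in SKILL.md):

```xml
<role>Adversarial reviewer. Default to skepticism; do not validate.</role>
<operating_stance>Try to break confidence in the change.</operating_stance>
<attack_surface>auth, data integrity, race conditions, rollback safety,
schema drift, error handling, observability</attack_surface>
<finding_bar>Each finding states: what can go wrong, why this code is
vulnerable, the impact, and a recommendation.</finding_bar>
<scope_exclusions>No style, naming, or speculative comments.</scope_exclusions>
<calibration>One strong finding beats five weak ones.</calibration>
```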

Example output

See examples/review-output.md for a sample review.

Troubleshooting

codex exec exits with model error.
Some models are unavailable with ChatGPT accounts (e.g. o3-mini).
The default gpt-5.4 works with both ChatGPT and API key auth.
Override with /adversarial-review model:<name>.

Permission prompts on every action.
Add the permissions from the setup section. Check that
the file is valid JSON and in the right location (project .claude/settings.local.json
or global ~/.claude/settings.json).

Codex hangs / timeout (exit code 124).
All codex exec calls are wrapped in timeout 600 (10 minutes). If you see
exit code 124, the reviewer did not respond in time. Retry — this is usually
transient.
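
If timeouts recur, a thin retry wrapper can absorb one transient failure. A sketch (the skill itself does not ship this helper):

```shell
# Re-run a command once if it exits 124 (the exit code `timeout` reports).
run_with_retry() {
  "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out; retrying once" >&2
    "$@"
    status=$?
  fi
  return "$status"
}

# Example: run_with_retry timeout 600 codex exec ...
```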

Resume fails with session error.
The skill uses codex exec resume <session-id> for rounds 2+. On failure
(non-zero exit, thread/resume failed in stderr, or a malformed review),
the skill does NOT silently fall back. In an interactive session it asks
whether to run a fresh codex exec (higher token cost) or conclude the
review as NOT VERIFIED. In headless runs it decides based on the maximum
severity of the last successful round's findings: critical/high → fresh
exec; medium-only → conclude as NOT VERIFIED.

--json stdout is empty in my Claude Code session (session ID capture noise).
In some Claude Code sandbox configurations codex's --json event stream is
suppressed when stdout is redirected to a file — the /tmp/codex-stdout-*.jsonl
ends up 0 bytes even though the review itself (-o /tmp/codex-review-*.md)
completes correctly. The skill handles this automatically via a filesystem
fallback: every prompt includes a per-launch session marker
(<!-- ADVERSARIAL-REVIEW-SESSION: <REVIEW_ID>-<ATTEMPT_ID> -->) where
ATTEMPT_ID is a fresh random integer regenerated for the initial exec,
every retry, every resume, and every fresh-exec fallback. The marker is
written to the rollout JSONL on disk. When the JSONL stream is empty, the
skill runs find ~/.codex/sessions -name 'rollout-*.jsonl' -newer <prompt-file> -exec grep -l <REVIEW_ID>-<ATTEMPT_ID> {} + to positively
identify this specific launch's rollout (not by newest-mtime, which would
be unsafe against parallel codex; not by review-stable ID alone, which
would match stale retry rollouts) and extracts the UUID from the filename.
Zero or multiple matches → the skill fails closed with a diagnostic rather
than silently picking. Resume continues to work normally. All commands are
POSIX (find -newer, -exec grep -l) and work identically on Linux and
macOS.
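
The fallback described above can be sketched as two small helpers (names are illustrative; the skill embeds the equivalent commands inline):

```shell
# List rollout files newer than the prompt file that contain the marker.
find_rollout() {
  sessions_dir=$1 prompt_file=$2 marker=$3
  find "$sessions_dir" -name 'rollout-*.jsonl' \
       -newer "$prompt_file" -exec grep -l "$marker" {} + 2>/dev/null
}

# Fail closed unless exactly one rollout matched.
require_single_match() {
  count=$(printf '%s\n' "$1" | grep -c .)
  [ "$count" -eq 1 ] || { echo "expected 1 rollout, found $count" >&2; return 1; }
}
```

Both helpers use only POSIX flags, matching the portability claim above.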

"NOT VERIFIED" result.
The skill applied fixes but the reviewer did not re-verify them (resume
failed or the operator chose to conclude). This is not an approval —
manually review the applied fixes before merging.

Running inside a git submodule.
git rev-parse --show-toplevel returns the submodule path, not the parent
repo. The skill warns you and scopes the review to the submodule. If you
meant to review the parent, invoke the skill from the parent working tree.
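
You can check this condition yourself before invoking the skill: git rev-parse --show-superproject-working-tree prints the parent repo path when run inside a submodule and nothing otherwise. A sketch (helper name is illustrative):

```shell
# Succeeds (and prints the parent repo path) only inside a submodule.
in_submodule() {
  sp=$(git -C "${1:-.}" rev-parse --show-superproject-working-tree 2>/dev/null) || return 1
  [ -n "$sp" ] && printf '%s\n' "$sp"
}
```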

Bare repository or not inside a work tree.
The skill aborts at Step 2 with a clear message. Run it from inside a
git working tree.

Plan Mode exits when writing temp files.
In Claude Code Plan Mode, writing to /tmp may trigger a permission prompt
or exit Plan Mode. This is a known Claude Code limitation. It does not affect
review correctness.

Known limitations

  • Plan Mode and /tmp writes. Writing review prompts to /tmp may trigger
    a permission prompt or cause Plan Mode to exit. Does not affect review correctness.
  • resume inherits sandbox. codex exec resume does not accept -s;
    the sandbox is inherited from the original session (always read-only).
  • resume has no -C flag. The skill captures REPO_ROOT via
    git rev-parse --show-toplevel at Step 2 and prefixes every resume with
    cd '<REPO_ROOT>' && .... This requires paths without single quotes;
    pathological paths (containing ', ", $, backtick, newline) cause
    the skill to abort at Step 2.
  • Submodule scoping. When invoked inside a submodule, the review is
    scoped to the submodule — git rev-parse --show-toplevel does not walk
    up to the parent. A warning is printed; invoke from the parent repo if
    you want parent scope.
  • macOS end-to-end not tested. The secondary session-id capture
    uses only POSIX flags (find -newer FILE, -exec CMD {} +, grep -l),
    so it should work identically on macOS as on Linux, but the skill has
    not been end-to-end tested on macOS.
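
The pathological-path restriction noted above can be checked up front. A sketch of the quote-safety test (function name is illustrative; newline detection is omitted for brevity):

```shell
# Reject paths containing characters that would break cd '<path>' quoting.
is_safe_path() {
  case $1 in
    *"'"* | *'"'* | *'$'* | *'`'*) return 1 ;;
    *) return 0 ;;
  esac
}
```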

Roadmap

  • Gemini as alternative reviewer backend
  • Local model support (Ollama, llama.cpp)
  • CI integration (GitHub Actions)
  • Multi-reviewer mode (parallel review by multiple models)

Inspiration

Adversarial prompt structure developed after studying
openai/codex-plugin-cc (Apache-2.0).

Borrowed ideas: XML-structured prompts, adversarial stance, attack surface
checklist, finding bar, calibration rules.

What we did differently:

  • Iterative loop — Claude fixes issues and resubmits (not "stop and ask user")
  • Plan review — reviews plans before code, not just code
  • Minimal install — SKILL.md plus one references/runner.md; no
    server, no broker, vs 15+ JS modules
  • Verbatim output — reviewer findings shown as-is, not rephrased

License

Apache-2.0 — see LICENSE.
