adversarial-review
Health Warn
- License — Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code Warn
- Code scan incomplete — No supported source files were scanned during light audit
Permissions Pass
- Permissions — No dangerous permissions requested
This tool is a Claude Code skill that performs adversarial AI code and plan reviews. It routes your code or architecture plans to a secondary AI (OpenAI Codex) tasked with finding flaws, iterating until the reviewer approves.
Security Assessment
Overall risk: Medium. The tool functions as a bridge between two AI systems, meaning your code and context are sent to both Anthropic and OpenAI for processing. This inherently involves transmitting your codebase diffs over the network to external APIs. The setup relies on linking a local repository folder to your Claude environment. While no hardcoded secrets were found, the tool requires you to supply OpenAI API keys via environment variables. A light code scan could not be completed due to unsupported source files, meaning a deeper manual check for hidden vulnerabilities is not possible from the automated audit. No dangerous local permissions are requested.
Quality Assessment
The repository is brand new and has very low visibility, indicated by only 5 GitHub stars. However, it is fully active, with the most recent push happening today. It benefits from a standard Apache-2.0 license, making it safe for commercial and private use. The project is lightweight by design, acting as a set of Markdown instructions rather than a complex application, which minimizes the attack surface.
Verdict
Use with caution — while the project is active and safely licensed, the low community trust, lack of automated code scanning, and requirement to transmit local code to external AI APIs warrant a closer look before integrating into sensitive workflows.
Adversarial Review
Claude Code skill for adversarial AI code and plan review.
One AI writes the code. Another tears it apart. Iterate until approved.
What is this?
Most AI code review tools validate your changes — "looks good, maybe add tests."
Adversarial review does the opposite: the reviewer defaults to skepticism
and tries to break confidence in the change. It looks for what will fail
in production, not what might be nice to improve.
This is a Claude Code skill
— a SKILL.md file plus a small references/runner.md that together
teach Claude how to run adversarial reviews through an external AI model
(currently OpenAI Codex).
Key features
- Plan review — review the plan BEFORE writing code. Catch architecture mistakes, missing steps, and risks early
- Code review — review the implementation. Bugs, security, data loss
- Code-vs-plan — verify the implementation matches the plan
- Iterative — Claude fixes issues based on reviewer feedback and resubmits for re-review. Up to 5 rounds until approved
- Lightweight — `SKILL.md` + one `references/runner.md` (thin runner subagent spec), no server, no broker, no external runtime deps. Compare with codex-plugin-cc: ~15 JS modules, App Server, JSON-RPC broker, lifecycle hooks
How it works
```
┌─────────┐      ┌──────────┐      ┌─────────┐
│ Claude  │─────>│ Reviewer │─────>│ Claude  │
│ (code)  │      │ (Codex)  │      │  (fix)  │
└─────────┘      └──────────┘      └─────────┘
     ^                                  │
     │           ┌────────────┐         │
     └───────────│  Reviewer  │<────────┘
                 │ (re-review)│
                 └────────────┘
                       │
              VERDICT: APPROVED
```
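The loop above can be sketched in shell. `run_review` and `apply_fixes` below are stand-in stubs for illustration (here the stub reviewer approves on round 3), not the skill's real commands:

```shell
# Stub reviewer: requests changes on rounds 1-2, approves on round 3.
run_review() {
  [ "$1" -ge 3 ] && echo "APPROVED" || echo "CHANGES REQUESTED"
}
apply_fixes() { :; }   # stand-in: Claude would apply reviewer feedback here

verdict=""
for round in 1 2 3 4 5; do   # up to 5 rounds until approved
  verdict=$(run_review "$round")
  if [ "$verdict" = "APPROVED" ]; then
    break
  fi
  apply_fixes
done
echo "round=$round verdict=$verdict"
```

With this stub the loop exits after the third round; the real skill stops early on approval or gives up after round 5.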
Three modes
| Mode | What it reviews | When to use |
|---|---|---|
| `plan` | Implementation plan | Before writing code |
| `code` | Git diff (unstaged, staged, or branch) | After writing code |
| `code-vs-plan` | Code changes against the plan | Verify implementation matches plan |
Mode is auto-detected from context, or you can force it with an argument.
Quick start
1. Prerequisites
Claude Code and
OpenAI Codex CLI must be installed.
Verify both are available:

```shell
claude --version   # Claude Code CLI
codex --version    # OpenAI Codex CLI (>= 0.115.0)
```

If Codex is missing: `npm install -g @openai/codex`

Authentication. Codex needs an OpenAI account. Either:

- Sign in interactively: `codex` (opens browser)
- Or set the `CODEX_API_KEY` env var for non-interactive use
2. Install the skill
```shell
git clone https://github.com/dementev-dev/adversarial-review.git
mkdir -p ~/.claude/skills
ln -sfn "$(pwd)/adversarial-review" ~/.claude/skills/adversarial-review
```

Verify both the skill entry-point AND the runner subagent spec are in place:

```shell
ls -la ~/.claude/skills/adversarial-review/SKILL.md
ls -la ~/.claude/skills/adversarial-review/references/runner.md
```
Migrating from a previous install at `~/.agents/skills/`: delete
the old symlink (`rm ~/.agents/skills/adversarial-review`) and
install at the new path above. Claude Code ≥ 2.x uses `~/.claude/skills/`.
3. Add permissions
The skill runs `git`, `codex exec`, and writes temp files to `/tmp`.
Without pre-approved permissions, Claude Code will prompt for each action.
Where to add. Since the skill is installed globally
(`~/.claude/skills/`), permissions should go into the global config
so they work in any project:
| Install scope | Config file |
|---|---|
| Global (recommended) | ~/.claude/settings.json |
| Single project | <project>/.claude/settings.local.json |
Merge the following rules into the `permissions.allow` array of the
chosen config file:

```jsonc
// --- adversarial-review permissions ---
// Git: diff, status, branch detection, repo root, submodule check
"Bash(git diff*)",
"Bash(git status*)",
"Bash(git symbolic-ref*)",
"Bash(git rev-parse*)",
// Codex: initial launch (uses -C; prompt fed via cat | pipe for env portability)
"Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
// Codex: resume (cd prefix because resume has no -C flag; prompt via cat | pipe)
"Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
// Session-id filesystem fallback (POSIX: find -newer + grep -l for content-match)
"Bash(find ~/.codex/sessions*)",
// Diagnostic aid when filesystem fallback finds nothing
"Bash(ls -t ~/.codex/sessions*)",
// Temp files: prompts (initial + resume), plans, review output, JSONL stdout, stderr
"Write(/tmp/codex-plan-*)",
"Write(/tmp/codex-prompt-*)",
"Write(/tmp/codex-resume-prompt-*)",
"Read(/tmp/codex-review-*)",
"Read(/tmp/codex-stdout-*)",
"Read(/tmp/codex-stderr-*)",
// Archive failed-resume diagnostics before fresh exec overwrites them
"Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
"Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
// Cleanup
"Bash(rm -f /tmp/codex-*)",
// Main thread: write prompt body for the runner subagent (NEW in refactor)
"Write(/tmp/codex-body-*)",
// Main thread: read the structured JSON result returned by the runner (NEW)
"Read(/tmp/codex-runner-result-*)",
// Runner subagent (inherited): read the prompt body main wrote (NEW)
"Read(/tmp/codex-body-*)",
// Runner subagent (inherited): write the result JSON main reads (NEW)
"Write(/tmp/codex-runner-result-*)",
// Runner-spec discovery: tier 1 (user-scoped install)
"Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
// Runner-spec discovery: tier 2 (plugin-marketplace cache)
"Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
```
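If the config file already has other entries, the rules can be merged programmatically rather than by hand. A minimal sketch, assuming `jq` is installed (the temp file below stands in for `~/.claude/settings.json`, and only two of the rules are shown):

```shell
# Start from an existing config with one unrelated rule.
CFG=$(mktemp)
echo '{"permissions":{"allow":["Read(~/docs/**)"]}}' > "$CFG"

# Append two of the adversarial-review rules to permissions.allow.
jq '.permissions.allow += ["Bash(git diff*)", "Bash(git status*)"]' \
  "$CFG" > "$CFG.new" && mv "$CFG.new" "$CFG"

jq -r '.permissions.allow | length' "$CFG"   # prints 3
```

Note that `jq` emits strict JSON, so this approach drops any `//` comments already in the file.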
Full example (if the config file is empty or does not exist):

```jsonc
{
  "permissions": {
    "allow": [
      // adversarial-review
      "Bash(git diff*)",
      "Bash(git status*)",
      "Bash(git symbolic-ref*)",
      "Bash(git rev-parse*)",
      "Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
      "Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
      "Bash(find ~/.codex/sessions*)",
      "Bash(ls -t ~/.codex/sessions*)",
      "Write(/tmp/codex-plan-*)",
      "Write(/tmp/codex-prompt-*)",
      "Write(/tmp/codex-resume-prompt-*)",
      "Read(/tmp/codex-review-*)",
      "Read(/tmp/codex-stdout-*)",
      "Read(/tmp/codex-stderr-*)",
      "Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
      "Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
      "Bash(rm -f /tmp/codex-*)",
      "Write(/tmp/codex-body-*)",
      "Read(/tmp/codex-runner-result-*)",
      "Read(/tmp/codex-body-*)",
      "Write(/tmp/codex-runner-result-*)",
      "Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
      "Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
    ]
  }
}
```
Security note: The `codex exec` rule allows any `codex exec` invocation
wrapped in `timeout 600`. The skill only uses read-only mode (`-s read-only`),
but Claude Code's permission patterns are prefix-based and cannot enforce flag
constraints. If you prefer tighter control, omit the `codex exec` rule and
approve each invocation manually.
4. Use
```shell
/adversarial-review                       # auto-detect mode
/adversarial-review plan                  # force plan review
/adversarial-review code                  # force code review
/adversarial-review path/to/f             # review a specific file
/adversarial-review xhigh                 # higher reasoning effort
/adversarial-review model:gpt-5.3-codex   # use a different model
```
Prompt architecture
The skill uses XML-structured prompts with an adversarial stance:

- `<role>` — adversarial reviewer, defaults to skepticism
- `<operating_stance>` — break confidence, not validate
- `<attack_surface>` — concrete checklist: auth, data integrity, race conditions, rollback safety, schema drift, error handling, observability
- `<finding_bar>` — every finding must answer 4 questions: what can go wrong, why vulnerable, impact, recommendation
- `<scope_exclusions>` — no style, naming, or speculative comments
- `<calibration>` — one strong finding > five weak ones
Example output
See `examples/review-output.md` for a sample review.
Troubleshooting
`codex exec` exits with model error.
Some models are unavailable with ChatGPT accounts (e.g. `o3-mini`).
The default `gpt-5.4` works with both ChatGPT and API key auth.
Override with `/adversarial-review model:<name>`.
Permission prompts on every action.
Add the permissions from the setup section. Check that
the file is valid JSON and in the right location (project .claude/settings.local.json
or global ~/.claude/settings.json).
Codex hangs / timeout (exit code 124).
All codex exec calls are wrapped in timeout 600 (10 minutes). If you see
exit code 124, the reviewer did not respond in time. Retry — this is usually
transient.
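Exit code 124 is how `timeout` signals that it killed the command. A minimal illustration, with `sleep` standing in for the codex call (assumes GNU coreutils `timeout`):

```shell
# timeout kills the command after 1 second and exits with status 124;
# sleep stands in for the long-running codex exec invocation.
status=0
timeout 1 sleep 5 || status=$?
if [ "$status" -eq 124 ]; then
  echo "reviewer timed out (exit 124), retrying is usually enough"
fi
```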
Resume fails with session error.
The skill uses codex exec resume <session-id> for rounds 2+. On failure
(non-zero exit, thread/resume failed in stderr, or a malformed review),
the skill does NOT silently fall back. In an interactive session it asks
whether to run a fresh codex exec (higher token cost) or conclude the
review as NOT VERIFIED. In headless runs it decides based on the maximum
severity of the last successful round's findings: critical/high → fresh
exec; medium-only → conclude as NOT VERIFIED.
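The headless decision rule can be sketched as a small lookup; the function name and return strings below are illustrative assumptions, not the skill's actual code:

```shell
# Map the max severity of the last successful round to the fallback action:
# critical/high -> fresh exec; medium -> conclude as NOT VERIFIED.
decide_headless_fallback() {
  case "$1" in
    critical|high) echo "fresh-exec" ;;
    medium)        echo "not-verified" ;;
    *)             echo "not-verified" ;;  # conservative default (assumption)
  esac
}

decide_headless_fallback high     # prints fresh-exec
decide_headless_fallback medium   # prints not-verified
```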
--json stdout is empty in my Claude Code session (session ID capture noise).
In some Claude Code sandbox configurations codex's --json event stream is
suppressed when stdout is redirected to a file — the /tmp/codex-stdout-*.jsonl
ends up 0 bytes even though the review itself (-o /tmp/codex-review-*.md)
completes correctly. The skill handles this automatically via a filesystem
fallback: every prompt includes a per-launch session marker
(`<!-- ADVERSARIAL-REVIEW-SESSION: <REVIEW_ID>-<ATTEMPT_ID> -->`), where `ATTEMPT_ID` is a fresh random integer regenerated for the initial exec,
every retry, every resume, and every fresh-exec fallback. The marker is
written to the rollout JSONL on disk. When the JSONL stream is empty, the
skill runs `find ~/.codex/sessions -name 'rollout-*.jsonl' -newer <prompt-file> -exec grep -l <REVIEW_ID>-<ATTEMPT_ID> {} +` to positively
identify this specific launch's rollout (not by newest-mtime, which would
be unsafe against parallel codex; not by review-stable ID alone, which
would match stale retry rollouts) and extracts the UUID from the filename.
Zero or multiple matches → the skill fails closed with a diagnostic rather
than silently picking. Resume continues to work normally. All commands are
POSIX (find -newer, -exec grep -l) and work identically on Linux and
macOS.
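The fallback can be simulated end-to-end against a mock sessions directory. All paths, IDs, and the rollout filename layout below are fabricated for illustration (the real skill searches `~/.codex/sessions`):

```shell
# Simulate: one rollout file that postdates the prompt and contains the marker.
SESSIONS=$(mktemp -d)
MARKER="rev-1234-5678"                       # <REVIEW_ID>-<ATTEMPT_ID>
PROMPT="$SESSIONS/codex-prompt"; touch "$PROMPT"
sleep 1                                      # ensure the rollout is strictly newer
UUID="0f8fad5b-d9cb-469f-a165-70867728950e"  # fabricated session id
echo "<!-- ADVERSARIAL-REVIEW-SESSION: $MARKER -->" \
  > "$SESSIONS/rollout-2025-01-01T00-00-00-$UUID.jsonl"

# Content-match: files newer than the prompt AND containing this launch's marker.
MATCHES=$(find "$SESSIONS" -name 'rollout-*.jsonl' -newer "$PROMPT" \
  -exec grep -l "$MARKER" {} +)

# Fail closed unless exactly one rollout matched.
COUNT=$(printf '%s\n' "$MATCHES" | grep -c .)
[ "$COUNT" -eq 1 ] || { echo "fail closed: $COUNT matches" >&2; exit 1; }

# Extract the session UUID from the end of the filename (assumed layout).
SESSION_ID=$(basename "$MATCHES" .jsonl \
  | grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$')
echo "$SESSION_ID"
```

Every command here is POSIX (`find -newer`, `-exec ... {} +`, `grep -l`), matching the portability claim above.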
"NOT VERIFIED" result.
The skill applied fixes but the reviewer did not re-verify them (resume
failed or the operator chose to conclude). This is not an approval —
manually review the applied fixes before merging.
Running inside a git submodule.
`git rev-parse --show-toplevel` returns the submodule path, not the parent
repo. The skill warns you and scopes the review to the submodule. If you
meant to review the parent, invoke the skill from the parent working tree.
Bare repository or not inside a work tree.
The skill aborts at Step 2 with a clear message. Run it from inside a
git working tree.
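The guard can be sketched as follows; `require_work_tree` is a hypothetical helper, not the skill's literal Step 2 code (note that in a bare repo `--is-inside-work-tree` prints `false` rather than failing, hence the string comparison):

```shell
# Abort unless inside a git working tree; on success print the repo root
# (the value the skill captures as REPO_ROOT).
require_work_tree() {
  if [ "$(git rev-parse --is-inside-work-tree 2>/dev/null)" != "true" ]; then
    echo "adversarial-review: run this from inside a git working tree" >&2
    return 1
  fi
  git rev-parse --show-toplevel
}

# Demonstrate against a throwaway repository:
TMP=$(mktemp -d)
git -C "$TMP" init -q
ROOT=$(cd "$TMP" && require_work_tree)
echo "$ROOT"
```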
Plan Mode exits when writing temp files.
In Claude Code Plan Mode, writing to /tmp may trigger a permission prompt
or exit Plan Mode. This is a known Claude Code limitation. It does not affect
review correctness.
Known limitations
- Plan Mode and `/tmp` writes. Writing review prompts to `/tmp` may trigger a permission prompt or cause Plan Mode to exit. Does not affect review correctness.
- `resume` inherits sandbox. `codex exec resume` does not accept `-s` — sandbox is inherited from the original session (always `read-only`).
- `resume` has no `-C` flag. The skill captures `REPO_ROOT` via `git rev-parse --show-toplevel` at Step 2 and prefixes every resume with `cd '<REPO_ROOT>' && ...`. This requires paths without single quotes; pathological paths (containing `'`, `"`, `$`, backtick, newline) cause the skill to abort at Step 2.
- Submodule scoping. When invoked inside a submodule, the review is scoped to the submodule — `git rev-parse --show-toplevel` does not walk up to the parent. A warning is printed; invoke from the parent repo if you want parent scope.
- macOS end-to-end not tested. The secondary session-id capture uses only POSIX flags (`find -newer FILE`, `-exec CMD {} +`, `grep -l`), so it should work identically on macOS as on Linux, but the skill has not been end-to-end tested on macOS.
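The path restriction in the `resume` limitation can be expressed as a guard; the helper name is hypothetical, and the character list comes from that limitation:

```shell
# Reject repo roots containing characters that would break the single-quoted
# cd '<REPO_ROOT>' && ... resume prefix: ', ", $, backtick, newline.
is_safe_repo_root() {
  NL='
'
  case "$1" in
    *\'*|*\"*|*\$*|*\`*|*"$NL"*) return 1 ;;
    *) return 0 ;;
  esac
}

is_safe_repo_root "/home/user/project" && echo "safe"          # prints safe
is_safe_repo_root "/home/user/it's-broken" || echo "rejected"  # prints rejected
```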
Roadmap
- Gemini as alternative reviewer backend
- Local model support (Ollama, llama.cpp)
- CI integration (GitHub Actions)
- Multi-reviewer mode (parallel review by multiple models)
Inspiration
Adversarial prompt structure developed after studying
openai/codex-plugin-cc (Apache-2.0).
Borrowed ideas: XML-structured prompts, adversarial stance, attack surface
checklist, finding bar, calibration rules.
What we did differently:
- Iterative loop — Claude fixes issues and resubmits (not "stop and ask user")
- Plan review — reviews plans before code, not just code
- Minimal install — `SKILL.md` + one `references/runner.md`, no server, no broker, vs 15+ JS modules
- Verbatim output — reviewer findings shown as-is, not rephrased
License
Apache-2.0 — see LICENSE.