vibe-dev-plugin

agent
Security Audit
Warn
Health Warn
  • No license — Repository has no license file
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Pass
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
  • Permissions — No dangerous permissions requested

No AI report is available for this listing yet.

SUMMARY

Vibe Dev — harness-first pipeline для Claude Code: от бизнес-идеи до работающего продукта. 20 проверяемых механизмов (hooks из коробки) + онбординг под пользователя (/setup).

README.md

Vibe Dev v6

🌐 English: this file · Русский: README.ru.md

A harness-first plugin that turns a business idea into a shipped product — for founders who build with Codex and Claude Code.

Vibe Dev is built for entrepreneurs who don't write code but ship real products with AI
agents. You stay at the level of business and architecture; the agent makes the technical
decisions and does the work. The point of the plugin is to make the agent reliable — so
"done" means done, not "the code compiled."

"The harness is enforcement, not documentation."

Every principle is backed by a real mechanism (hook / gate / agent / self-check), not a
line in an instruction file that the agent can quietly ignore. Discipline is broken by
exactly the link that's supposed to keep it — the agent itself. So the rules are turned into
checkpoints that are actually enforced.

The number of mechanisms and their live status live in docs/traceability.md — the single
source of truth (42 tracked today; 2 of them — the screen-layer jargon catcher and the
secret-output mask — are honestly marked display-only/partial, i.e. not real enforcement).
Each mechanism carries three attributes: where it's defined / what enforces it / what happens
if you try to bypass it
. The plugin's self-check verifies completeness — a claim without a live
mechanism doesn't pass.
New in v6.2: hook activation became a provable fact (a guard that "didn't turn on" can no
longer stay silent), and clarity of the final message went from a wish to a blocking gate.
Every new guard was verified with live runs on the Claude Code 2.1.170 engine.


Who it's for

  • Founders and non-engineers who want to ship a working product, not learn to code.
  • People who already work with Codex and Claude Code and want the agent to behave like
    a disciplined senior engineer instead of an eager intern.
  • Anyone tired of agents that declare "done" on code that was never actually run.

You describe the business. The agent picks the stack, writes the code, tests it, and only
reports "done" when a verification command passed and the behavior matched expectations.


How it works (the harness in plain words)

An AI coding agent announces key moments: "about to save a file," "about to run a command,"
"showing a message to the human," "opening/closing a session." The plugin attaches small
inspectors to those moments
(hooks). Each inspector looks at the intent and returns one
verdict:

  • block — the action is cancelled (e.g. you can't mark a feature "done" without evidence);
  • warn / inject — the action proceeds, but a note appears for the agent or a flag for the human;
  • pass — all clear, stay silent.

The map of "on this event, call this inspector" lives in hooks/hooks.json and is loaded
automatically
on install (Claude Code v2.1+) — no manual wiring. Strictness is per-project:
minimal / standard / strict (existing projects aren't broken — they're migrated with
/upgrade-project).


What it catches (key mechanisms)

A. "Done" means verified, not claimed

Mechanism What it catches What it does
UI-evidence gate a UI feature is marked "done" on typecheck/tests, but a real click shows nothing block (a screenshot / live run is required)
Surface-aware evidence (v6.2) a "no-UI" feature (API / scheduled job / CLI) is closed with no trace of a real call; a UI feature hides as a "library" the surface is inferred from files and can only tighten: ui → block, others → warn with an acceptance recipe
Test-strategy before build a medium/large feature goes into work without a thought-through verification plan block (no docs/test-strategy.md → it can't enter active)
Data-model review gate a DB schema is written without a separate critical review (the model "freezes," reworks are expensive) block (no docs/data-model-review.md → it can't enter active)
State-machine transitions a feature jumps to an invalid state / a corrupted state file block (current project) / warn (legacy)

A2. Hook activation as a provable fact (new in v6.2)

Mechanism What it catches What it does
Heartbeat hooks "look installed" but don't physically run (silent strictness theater) every live event writes a stamp with the version; readers check freshness
Two-phase profile profile says "strict" but enforcement never turned on bootstrap writes pending-strict; only a live hook promotes it to real strict — the promotion is the proof
Git pre-commit backstop the plugin was removed/broken and nobody noticed an INDEPENDENT post in .git/hooks: a pending profile or stale heartbeat → block the commit
Fail-loud + crash artifacts a guard crashed and silently "allowed everything" (a real bug, 2026-06-06) crash → loud warning + crash log + a probe at session start
Real-shape fixture corpus a gate green on synthetic data, broken on real files self-check runs gates against 6 anonymized real feature_list files
/doctor "why are the guards silent?" self-diagnosis: profile / heartbeat / crashes / install + a fix table

B. Safety and money

Mechanism What it catches What it does
Bulk-API gate a mass external-API job with no limit check (real case: a project banned for 2 days + wasted money) block without a pre-launch checklist (the checklist now requires explicit volume × price)
Model-swap guard an edit introduces a model / setting that affects every answer (real case: 3 days of dropped client replies after "newer = drop-in") warn "this is a contract change, run a smoke test"
Vendor-lock research gate a specific provider is hard-wired into the architecture blindly, with no comparison block an integration feature without docs/research/*.md
Secret-in-prompt (v6.2) the user pasted a live key into a message warn: the key is compromised → rotate + move to .env
Secret-in-output (v6.2) a CLI printed a token — it lingers in the session context warn to the model: don't reuse the literal, suggest rotation (+ output masking on engines that support it)
Concurrent-write advisory two sessions write to one file (real case: data loss) warn (advisory)

C. Anti-stall

Mechanism What it catches What it does
User stop-signal the human writes "wrong way / stop / that's not it" and the agent keeps grinding tactically inject "change the level, not the method; launch a diagnostic subagent"
Interrupt-recovery (v6.2.1) a dropped connection (closed laptop lid) or an inbound message kills the running tool — the system falsely logs "user rejected," and the agent stalls for hours the next message without a stop-word → inject "that was a disconnect, not a veto — continue the plan"; a real "stop" keeps its force
Repeated-failure detector the same command is launched a 3rd time in a row with no success and no structural change warn before running: prompt for a diagnostic subagent (carrier verified against the live 2.1.170 event model)

D. Plain language (the non-engineer's biggest pain)

Mechanism What it catches What it does
Clarity gate on the final turn (v6.2) the turn ends with a person-days estimate or heavy jargon outside code blocks block: the agent must add a plain-words version (≤10 lines); precision is held by a labeled corpus from real sessions + append limits
Jargon catcher (screen layer) jargon / a fork with no "what you lose" / person-days in any message on-screen flag + a log metric (honestly display-only; on Desktop the event doesn't fire — the load-bearing layer is the gate above)
Onboarding (/setup) the system doesn't know how to talk to a new user a portrait at ~/.vibe-dev/portrait.md → gate strictness and fork format adapt (no portrait → a neutral default)

E. Process discipline

Mechanism What it catches What it does
WIP=1 / scope edits spill outside the declared feature block the commit (diff ⊆ affected_files)
Intent-without-action the agent ends a turn saying "I'll now do X" with no action taken block (continue the turn)
Unified Stop dispatcher (v6.2) several end-of-turn guards cascade blocks and loop the turn priorities + a shared cap of ≤3 blocks per turn; overflow → pass with a log entry
Architecture research gate (v6.2) architecture is written without studying best practices and existing solutions block writing ARCHITECTURE*.md without docs/research/*; the skip is allowed ONLY by an explicit user phrase
Closing mode (v6.2) "let's close the session" → the agent suddenly starts coding rights degrade: writes only to state files; new work → backlog; lifted by a normal next message
Lock pattern (v6.2) the agent fakes "user consent" markers (skip / closing) .harness/locks/* markers are written ONLY by hooks on an explicit phrase — an agent write is block
Config-protect (v6.2) the agent weakens its own gates (profile, heartbeat, disabling) block in all profiles; disabling enforcement is the user's manual action only
Handoff loop at session close the plan stays in the chat (the next session won't see it) inject a cold-start checklist + detect a missed handoff at startup
User rules (/hookify) "never do X again" is forgotten and repeated the human freezes a correction into a permanent block/warn rule, no code needed

F. Harness infrastructure

Mechanism What it does
Hooks out of the box hooks.json auto-loads on install; with no file you can't "forget to turn it on"
Warnings reach the model warnings travel on the correct channel (otherwise they'd be silently lost)
Profiles + version lifecycle minimal/standard/strict; legacy projects aren't forced, they migrate on command
Traceability table + self-check every mechanism is described by 3 attributes; a row without a live mechanism fails the self-check
Personal-data gate if anything personal slips into the public build (email / client project / private path) — block the self-check

Honest — what's still discipline, not a mechanism: checking cross-module wiring on the
real path, "the agent does it itself instead of sending you to the terminal," realistic test
data. A hook can't reliably force these. We keep them as discipline + catch them on real
projects. We don't pass them off as "bulletproof."

Built after auditing all ~20 real projects from earlier versions (12 retrospectives + ~150
memory notes + 6 bug journals); v6.2 followed an audit of 54 live sessions on v6.1 + harness
practice research + an independent critique of the plan.


7 subsystems

Instructions (CLAUDE.md routing + domain-rules.yaml) · State (feature_list.json +
SESSION.md + error-journal) · Verification (4 layers + dual critique + negative gate) ·
Scope (affected_files, WIP=1) · Lifecycle (init, cold-start, clean-exit, /upgrade) ·
Learning (feedback memory, retrospectives, anti-patterns) · Cost & Safety (bulk-gate,
concurrent-lock, secrets-scope).


Commands

Command What it does
/setup Onboarding: 6 simple questions → a portrait (how to talk to you)
/new-project Business interview + bootstrap the harness (4 files at start)
/resume <project> Cold-start test + diff against the previous session
/feature <id> WIP=1 + dual critique (test-researcher + user-perspective-critic)
/verify 4-layer verification (syntax + runtime + e2e + user)
/hookify "never do X again" → a permanent block/warn rule
/handoff · /end-session Clean exit + persist state into files
/audit External harness assessment + error rate
/stuck Stuck protocol + an LLM quorum
/ship Final validation ≥90% + retrospective
/research · /architecture · /dev-plan · /upgrade-project … (full list in skills/)

Pipeline

  • FAST (5 stages) — internal tools, simple MVPs, bots:
    interview → architecture + stack → design handoff (if UI) → /feature loop → /ship.
  • FULL (10 stages) — products going to market: ideas R1/R2 → validation → research →
    architecture + prototype → design → wave plan → /feature loop → /ship + marketing launch.

Install

Claude Code

# 1. Add the marketplace from GitHub
claude plugin marketplace add andrewcigan/vibe-dev-plugin
# 2. Install and enable the plugin
claude plugin install vibe-dev@vibe-dev

Or locally (for developing the plugin itself):

claude --plugin-dir "/path/to/vibe-dev-plugin"

In Claude Code you get the full harness: auto-loaded hooks, the slash commands above, and
profile-based strictness.

Codex

Codex reads AGENTS.md automatically. Point it at the harness:

git clone https://github.com/andrewcigan/vibe-dev-plugin
# then run Codex with the repo's AGENTS.md as your project rules

In Codex the harness drives the agent through AGENTS.md, the domain rules, the state files,
and the methodology — the same principles and workflow, applied as the agent's operating
instructions.

The plugin's technical id is vibe-dev (command names and install depend on it). Version: 6.2.1.


Version

v6.2.1 — Interrupt-recovery: a technical interruption (client disconnect / message delivery)
no longer paralyzes the agent into "waiting for instructions" — the next prompt without a
stop-word continues the plan automatically.

v6.2.0 — Enforcement as a provable fact (37 mechanisms): provable hook activation (heartbeat

  • two-phase profile + independent git pre-commit backstop + /doctor), fail-loud (a crashed
    guard can't stay silent), clarity gate on the final message, surface-aware evidence, mandatory
    research before architecture, closing mode, secret hygiene, config-protect. Every new guard was
    verified with live runs on the 2.1.170 engine. Built from an audit of 54 live v6.1 sessions.
    Full change list — CHANGELOG.md.

v6.1.0 — public release: enforcement from text into mechanism (20 mechanisms) + onboarding
(/setup) + personal-data gate, after an audit of ~20 real v5 projects.


Notes

  • The harness was built for, and currently converses in, Russian (its clarity gates and
    prompts are Russian-language). The methodology, mechanisms, and pipeline are
    language-agnostic; UI/interface localization is not done yet.
  • Author: Andrei Tsyhan.

Reviews (0)

No results found