Universal Planning Framework

universal-planning-framework

Plans fail because discovery happens too late. This framework catches gaps that only surface during execution - evolved from 117 real plans + 195 handoffs.

"Initial idea was a custom booking system (8 weeks). Stage 0 discovered Calendly + Stripe does 90% of it. Shipped in 3 weeks, saved 5 weeks of engineering."

from Business Launch Example

Quick Install

Full install (recommended - gives you commands, agent, and rule):

git clone https://github.com/primeline-ai/universal-planning-framework
cp -r universal-planning-framework/.claude/* your-project/.claude/

Minimal install (rule file only):

mkdir -p .claude/rules
curl -o .claude/rules/universal-planning.md \
  https://raw.githubusercontent.com/primeline-ai/universal-planning-framework/main/.claude/rules/universal-planning.md

What Makes This Different

Formal reasoning foundation (DSV). Most plans fail not because validation is missing, but because the wrong question gets validated. The framework is built on Decompose-Suspend-Validate - a principle that forces you to challenge your interpretation before committing to it. See Theoretical Foundation below.

FAILED conditions are mandatory. Every plan must define when to kill the project - not just when it succeeds. No FAILED condition = zombie project (anti-pattern #11).

21 anti-patterns with detection rules. Not guidelines - specific detection rules that catch vague gates ("looks good"), hallucinated estimates, assumed facts, discovery amnesia, and 17 more. Categorized as 12 Core + 5 AI-Specific + 4 Quality.

Autonomous hardening (Stage 1.5). Run /plan-refine and 6 adversarial perspectives stress-test your plan without asking you a single question. The Pedantic Lawyer catches vague gates. The Devil's Advocate challenges your core assumption. You get back a hardened plan with a change log.

Quality rubric with teeth. Grade C/B/A based on objective criteria. Numbers need sources. Gates must be observable. Delegated work needs input/output specs. FAILED conditions must be measurable.

Behavior specs + acceptance criteria. Optional Behavior Description (tech-agnostic, what the feature does) and Given/When/Then acceptance criteria give plans testable behavior coverage without forcing a separate spec document.

Reference Library. For coding domains, plans must link the official docs they consulted - so the person maintaining the output has the same sources.

8 domain detection. Software, AI/Agent, Business, Content, Infrastructure, Data & Analytics, Research, Multi-Domain. Each domain auto-includes relevant sections.

5-Minute Quick Start

1. Create a plan

/plan-new "Add OAuth login to Next.js app"

Claude runs through 3 stages:

Stage 0: Discovers existing work, checks feasibility, challenges your approach (AHA Effect)
Stage 1: Builds the plan with 5 CORE sections, domain-specific CONDITIONAL sections, and a confidence level
Stage 1.5: Autonomously hardens the plan from 6 adversarial perspectives
Stage 2: Meta-checks including Cold Start Test and Discovery Consolidation

2. Interview an existing plan

/interview-plan path/to/plan.md

Framework-aware questions across 3 tiers: critical gaps, domain-specific probes, quality strengthening. Includes DSV checks for premature commitment and assumption mutation. References anti-patterns by number.

3. Review plan quality

/plan-review path/to/plan.md

Objective assessment against the rubric. Returns grade, anti-patterns found, and top 3 improvements.

4. Harden a plan autonomously

/plan-refine path/to/plan.md

6 perspectives stress-test the plan. Fixes structural issues, flags strategic decisions for you.

Theoretical Foundation: DSV

The framework is built on Decompose-Suspend-Validate (DSV) - a reasoning principle that prevents premature commitment at every stage of planning.

Most planning failures share a root cause: the planner validates an assumption without first questioning whether it's the right assumption. You check "Can we build X in 4 weeks?" when the real question is "Should we build X at all?" DSV prevents this by structuring thought into three phases:

Phase	What it does	Framework mapping
Decompose	Break the problem into discrete, testable claims	Stage 0 checks 0.1-0.6 (Existing Work, Facts, Docs, Updates, Practices, Research)
Suspend	Challenge each claim - explore alternative interpretations before committing	Stage 0 checks 0.7-0.12 (Feasibility, ROI, AHA Effect, Competitive, Constraints, People Risk)
Validate	Test each claim independently with explicit methods and failure impacts	Stage 1 Assumptions section (`VALIDATE BY` + `IMPACT IF WRONG`)

The key insight is Suspend. Decomposing is natural. Validating is expected. But actively suspending your first interpretation - asking "what if this means something entirely different?" - is what most planners skip. That's why Stage 0's sparring checks (0.7-0.12) exist.

Quick DSV (3 questions, 30 seconds)

For time-pressured situations, DSV compresses to three questions:

"What are the 2-3 key claims?" (Decompose)
"What alternative interpretation haven't I considered?" (Suspend)
"Which claim am I least sure about?" (Validate that one first)

This works standalone - even without the full framework. Use it before any decision where you catch yourself feeling "obvious."

The Framework

Stage 0: Discovery (Before Planning)

12 checks in 3 priority tiers. The agent decides which to run based on context. Maps directly to the DSV phases above: checks 0.1-0.6 decompose, 0.7-0.12 suspend, Stage 1 Assumptions validates.

Tier	Checks
Always	Existing Work, Feasibility, Better Alternatives (AHA Effect)
Usually	Factual Verification, Official Docs, ROI
Context-dependent	Updates, Best Practices, Deep Research, Competitive, Constraints, People Risk

The AHA Effect (Check 0.9) is the single most valuable check:

Custom CMS planned? "Strapi does 90% of it."
50 blog posts? "5 pillar posts + derivatives might outperform."
Building from scratch? "This open-source project does 80%."

Stage 1: The Plan

5 CORE sections (always required): Context & Why (+ optional Behavior Description), Success Criteria (with FAILED conditions + optional GWT acceptance criteria), Assumptions (with VALIDATE BY + IMPACT IF WRONG), Phases (with binary gates + review checkpoints), Verification (Automated + Manual + Ongoing Observability).

Optional structured formats: Behavior Description captures what a feature DOES in tech-agnostic language (3-5 sentences). Given/When/Then acceptance criteria complement FAILED conditions — FAILED covers what must NOT happen, GWT covers what MUST happen.

18 CONDITIONAL sections (domain-detected): Rollback, Risk, Post-Completion, Budget, User Validation, Legal, Security, Resume Protocol, Incremental Delivery, Delegation, Dependencies, Related Work, Timeline, Stakeholders, Reference Library, Learning & Knowledge Capture, Feedback Architecture, Completion Gate.

Coding domains size phases by scope (files, features, tests), not hours. Non-coding domains use time estimates as rough guides.

Plan Confidence Level: High / Medium / Low - assigned at plan header. Low confidence = Phase 1 must be a validation sprint.

Stage 1.5: Autonomous Hardening (Optional)

6 adversarial perspectives stress-test the plan:

Outside Observer - Goal clarity, End State, ambiguous metrics
Pessimistic Risk Assessor - Single failure points, FAILED condition timeouts
Pedantic Lawyer - Vague gates, delegation contracts, deployment completeness
Skeptical Implementer - First blocker, unverified facts, cold start readiness
The Manager - Resume Protocol, scope realism, deadline acknowledgment
Devil's Advocate - Core assumption validity, 80/20 path, obsolescence risk

Structural fixes applied in place. Strategic decisions flagged as [Stage 1.5 Note:] for user review. Hardening Log appended as audit trail.

Stage 2: Meta (7 Checks)

Delegation Strategy, Research Needs, Review Gates, Anti-Pattern Check (21), Cold Start Test, Plan Hygiene Protocol, Discovery Consolidation.

Domain Detection

Domain	Key Sections	Phase Sizing	Review Frequency
Software Development	Rollback, Risk, Delegation, Reference Library	Scope-based	Every 2 phases
Multi-Agent / AI	Risk, Delegation, Security, Reference Library	Scope-based	Every 2 phases
Business / Strategy	Timeline, Budget, Stakeholders, Validation	Time-based	Per milestone
Content / Marketing	Timeline, Validation, Legal, Feedback Architecture	Time-based	Per draft
Infrastructure / DevOps	Rollback, Risk, Dependencies, Reference Library	Mixed	Every phase
Data & Analytics	Risk, Rollback, Legal, Security, Reference Library	Mixed	Every phase
Research / Exploration	Incremental, Budget, Related Work, Learning	Time-based	Per finding
Multi-Domain	Union of matched domains	Most conservative	Most conservative

Quality Rubric

Grade	Criteria
C (Viable)	All 5 CORE + 1 CONDITIONAL. No critical anti-patterns (#3, #9, #11, #20, #21).
B (Solid)	C + Stage 0 + FAILED conditions + Confidence Level + Cold Start Test. Zero anti-patterns.
A (Excellent)	B + sparring (0.7-0.9) + Review Checkpoints + Reference Library (coding) + Replanning triggers.
Red Flags	Vague criteria, no FAILED conditions, assumptions untested, numbers without sources, delegated work without specs. Fix before implementing.

When to Activate

Use for: New features, architecture changes (3+ files), multi-phase projects, anything with external dependencies.

Skip when ALL true: Single file, no external dependencies, <50 lines, no schema change, no user-facing change, rollback = git revert.

Philosophy

Traditional planning: Goal - Approach - Steps - Execute - "Oh crap, we didn't consider X."

This framework inverts it: Discovery - Constraints - Assumptions - THEN plan.

Stage 0 is where you find out your "simple feature" touches 6 systems, the existing code does 70% of what you need, your timeline was off by 3x, and there's a legal requirement you didn't know about.

The DSV principle explains why this works: most "overlooked" requirements weren't missing from the problem space - they were missing from the planner's interpretation of it. By forcing decomposition and suspension before validation, the framework catches gaps at the cheapest possible moment: before a single line of the plan is written.

The framework is domain-agnostic. Use it for software features, business launches, content creation, data pipelines, research projects - anything that needs a plan.

Examples

Example	Domain	Grade	Key Features
Business Launch	Business / Strategy	A	AHA Effect saved 6 weeks, 8 sub-phases, full CONDITIONAL suite
Content Creation	Content / Marketing	B	11 sub-phases, code examples, Resume Protocol
Software Feature	Software Development	B	OAuth implementation, Security section, Reference Library
Infrastructure CI/CD	Infrastructure / DevOps	A	Resume Protocol, Rollback, Reference Library, Review Checkpoints

Contributing

See CONTRIBUTING.md for guidelines. Key requirements:

Example plans must pass their own anti-pattern self-audit honestly
Use grades C/B/A only (no B+, A-, etc.)
Coding examples must include a Reference Library

Found a gap the framework doesn't catch? Open an issue.

The Ecosystem

UPF is one piece of a progression. Each tier works independently - no hard dependencies.

You're here          You want this            Install this
-----------          -------------            ------------
Raw Claude Code  ->  Session memory       ->  Starter System (free)
                 ->  Workflow skills      ->  + Skills Bundle (free)
                 ->  Deep planning        ->  + UPF (free) <- you are here
                 ->  Deep analysis        ->  + Quantum Lens (free)
                 ->  AI-powered system    ->  + Course (paid)

Component	What It Does	Link
Starter System	Session memory, handoffs, context awareness	GitHub
Skills Bundle	5 workflow skills: debugging, delegation, planning, code review, config architecture	GitHub
UPF	Universal Planning Framework with deep multi-stage planning	You're reading it
Quantum Lens	Multi-perspective analysis + solution engineering (7 cognitive lenses)	GitHub
Course	Kairn + Synapse: AI-powered memory and knowledge graphs	primeline.cc

The Skills Bundle includes a lightweight plan-and-execute skill for everyday planning. UPF is the deep version - use it when the stakes are high enough to justify Stage 0 discovery.

License

MIT License - see LICENSE file.

Part of the PrimeLine Ecosystem

Tool	What It Does	Deep Dive
Evolving Lite	Self-improving Claude Code plugin — memory, delegation, self-correction	Blog
Kairn	Persistent knowledge graph with context routing for AI	Blog
tmux Orchestration	Parallel Claude Code sessions with heartbeat monitoring	Blog
UPF	3-stage planning with adversarial hardening	Blog
Quantum Lens	7 cognitive lenses for multi-perspective analysis	Blog
PrimeLine Skills	5 production-grade workflow skills for Claude Code	Blog
Starter System	Lightweight session memory and handoffs	Blog

@PrimeLineAI · primeline.cc · Free Guide