ViviDoc

Turn any topic into an explorable explanation.

ViviDoc Showcase — interactive educational documents across 10 domains

What is ViviDoc?

ViviDoc is an LLM-powered pipeline that generates self-contained interactive HTML documents (explorable explanations) from a single topic input. Given a topic, ViviDoc designs a purpose-built visual style, plans a structured document using the SRTC Interaction Spec, and writes a single HTML file containing explanatory text, KaTeX math, and interactive Canvas visualizations. The file opens in any browser with no server.

The key insight is the SRTC specification — a four-field interaction design language (State · Render · Transition · Constraint) that separates what the learner should discover from how it's rendered. This allows an LLM to reason about pedagogy before touching code.

Accepted at ACL 2026 System Demonstrations — arXiv:2603.27991

✨ Features

Zero-dependency output — Each document is a single .html file with embedded CSS and JS. No build step, no server. Open it in a browser.
Purpose-built visual style — ViviDoc reasons about the topic's emotional register and domain conventions (physics → monospace + dark; biology → organic + warm) to synthesize a custom visual identity per document.
Structured interaction design — The SRTC spec (State · Render · Transition · Constraint) grounds every visualization in a pedagogical invariant — the one thing the learner must discover.
8 interaction categories — Grounded in empirical analysis of 482 interaction instances across 101 real-world explorable explanations (ViviBench).
Two usage modes — Interactive Claude Code skill (/vividoc) for zero-setup generation, or CLI pipeline for batch generation and benchmarking.
Extensible template library — Add reference cases with /vividoc-learn <url> to distill real explorable explanations into reusable SRTC templates.

🚀 Quick Start

Option A: Claude Code skill (recommended — zero setup)

Open this repository in Claude Code. No API key needed — Claude Code is the model.

/vividoc Fourier Transform

Claude Code reasons about the topic, proposes a visual style, designs SRTC interactions, and writes the document directly. Output: outputs/fourier_transform/document.html

/vividoc-learn https://ncase.me/trust/

Fetches the page, extracts its interaction patterns and visual style into SRTC format, and saves a reusable template to benchmark/datasets/interaction_examples/.

Option B: CLI pipeline

# Install
pip install uv && uv sync

# Set API key (OpenRouter covers most models)
export OPENROUTER_API_KEY="sk-or-..."

# Generate a document
vividoc run "Fourier Transform" openrouter/google/gemini-2.5-pro
# → outputs/fourier_transform/vividoc_gemini-2.5-pro/document.html

# Stage-by-stage
vividoc plan "Fourier Transform" openrouter/google/gemini-2.5-pro -o spec.json
vividoc exec spec.json openrouter/google/gemini-2.5-pro

# With style guidance
vividoc run "Fourier Transform" openrouter/google/gemini-2.5-pro \
  --text-style "Conversational, concrete analogies" \
  --interaction-style "Dark background, neon accents, physics aesthetic"

Supported models: any openrouter/<provider>/<model> string, or anthropic/claude-* with ANTHROPIC_API_KEY.
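
For batch generation, the commands above can be assembled from Python. The sketch below only builds the argument list for `vividoc run` from the flags shown in this section. The helper names `build_run_command` and `guess_output_path` are hypothetical, and the output-path rule is an assumption inferred from the single example above, not taken from the code base.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def build_run_command(
    topic: str,
    model: str,
    text_style: Optional[str] = None,
    interaction_style: Optional[str] = None,
) -> List[str]:
    """Assemble argv for `vividoc run` using only the flags shown in this README."""
    cmd = ["vividoc", "run", topic, model]
    if text_style:
        cmd += ["--text-style", text_style]
    if interaction_style:
        cmd += ["--interaction-style", interaction_style]
    return cmd


def guess_output_path(topic: str, model: str) -> Path:
    """ASSUMPTION: lowercased topic with underscores, plus the last model segment."""
    slug = "_".join(topic.lower().split())
    model_name = model.rsplit("/", 1)[-1]
    return Path("outputs") / slug / f"vividoc_{model_name}" / "document.html"


model = "openrouter/google/gemini-2.5-pro"
for topic in ["Fourier Transform", "Merge Sort"]:
    # Print the shell-quoted command instead of running it.
    print(shlex.join(build_run_command(topic, model)))
    print("  expected at:", guess_output_path(topic, model).as_posix())
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)` with `OPENROUTER_API_KEY` set in the environment.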

🧠 How It Works

ViviDoc decomposes document generation into three stages:

Topic (string)
    │
    ▼  Plan
┌─────────────────────────────┐
│  Planner                    │
│  LLM → DocumentSpec         │  spec.json
│  (SRTC per knowledge unit)  │
└─────────────┬───────────────┘
              │
              ▼  Execute (per section)
┌─────────────────────────────┐
│  Executor                   │
│  Stage 1: text + KaTeX      │  HTML fragments
│  Stage 2: JS + Canvas viz   │
└─────────────┬───────────────┘
              │
              ▼  Evaluate
┌─────────────────────────────┐
│  Evaluator                  │
│  Coherence + render check   │  document.html
└─────────────────────────────┘
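
The flow in the diagram can be sketched as three plain functions. This is an illustrative outline under stated assumptions, not the code in `vividoc/core/`: the `Section` class, the fields of `DocumentSpec`, the prompt strings, and the `fake_llm` stub are all hypothetical, and the evaluation step here is only a non-empty check standing in for the coherence and render checks.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# ASSUMPTION: an LLM is modeled as a plain prompt -> text function.
LLM = Callable[[str], str]


@dataclass
class Section:
    title: str
    html: str = ""


@dataclass
class DocumentSpec:
    topic: str
    sections: List[Section] = field(default_factory=list)


def plan(topic: str, llm: LLM) -> DocumentSpec:
    """Plan: ask the model for section titles, one per line."""
    lines = llm(f"Outline: {topic}").splitlines()
    return DocumentSpec(topic, [Section(t.strip()) for t in lines if t.strip()])


def execute(spec: DocumentSpec, llm: LLM) -> DocumentSpec:
    """Execute: per section, stage 1 writes text and stage 2 writes the visualization."""
    for s in spec.sections:
        text = llm(f"Text for: {s.title}")
        viz = llm(f"Canvas script for: {s.title}")
        s.html = f"<section><h2>{s.title}</h2>{text}<script>{viz}</script></section>"
    return spec


def evaluate(spec: DocumentSpec) -> str:
    """Evaluate: a trivial stand-in check, then assemble one HTML string."""
    if not all(s.html for s in spec.sections):
        raise ValueError("a section was not rendered")
    body = "\n".join(s.html for s in spec.sections)
    return f"<!DOCTYPE html><html><body><h1>{spec.topic}</h1>\n{body}</body></html>"


def fake_llm(prompt: str) -> str:
    """Offline stub so the sketch runs without an API key."""
    return "Intro\nEpicycles" if prompt.startswith("Outline") else f"/* {prompt} */"


html = evaluate(execute(plan("Fourier Transform", fake_llm), fake_llm))
print(html.count("<section>"))  # one <section> per planned title
```

Keeping each stage a function of the spec mirrors the `vividoc plan` and `vividoc exec` split: the plan can be saved, inspected, and re-executed without replanning.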

The SRTC Interaction Spec

Every knowledge unit has an interaction_spec with four fields:

Field	Role
S (State)	Variables the user controls or that are derived
R (Render)	List of visual elements to display
T (Transition)	Cause → effect rules (`[]` = static, no interaction needed)
C (Constraint)	The pedagogical invariant the learner must discover

The constraint is the design target: every visual element should be built to make it unmissable.

Interaction is not mandatory. If T = [], the executor produces a static or auto-animated visualization instead; a static figure is sometimes the right answer.
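
As a rough illustration of the four fields, the sketch below models one interaction spec as a Python dataclass. The class names, field types, and example values are assumptions made for this example; the actual `interaction_spec` schema in `spec.json` may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Transition:
    cause: str   # what the user does or what changes
    effect: str  # what the visualization does in response


@dataclass
class InteractionSpec:
    state: Dict[str, float]  # S: controlled or derived variables
    render: List[str]        # R: visual elements to display
    constraint: str          # C: the invariant the learner must discover
    transition: List[Transition] = field(default_factory=list)  # T: [] means static

    def is_static(self) -> bool:
        """An empty T means no interaction is needed."""
        return not self.transition

    def validate(self) -> None:
        """Reject specs that have nothing to show or nothing to teach."""
        if not self.constraint.strip():
            raise ValueError("C (constraint) must name the invariant to discover")
        if not self.render:
            raise ValueError("R (render) needs at least one visual element")


# Illustrative values only, loosely based on the Fourier epicycles example.
spec = InteractionSpec(
    state={"harmonics": 3},
    render=["rotating circles", "traced waveform"],
    constraint="Adding harmonics brings the traced sum closer to the target wave",
    transition=[Transition("drag harmonics slider", "redraw circles and waveform")],
)
spec.validate()
print(spec.is_static())
```

Making `constraint` a required field while `transition` defaults to an empty list reflects the rule above: the invariant is always needed, interaction is optional.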

🎛️ Interaction Taxonomy

ViviDoc's interaction design is grounded in 482 interaction instances across 101 real-world explorable explanations from 63 websites and 11 domains (ViviBench dataset):

#	Category	When to use	Example
1	Parameter Exploration	Continuous variable has a nonlinear effect	Lorenz Attractor — σ, ρ sliders
2	State Switching	Discrete modes produce qualitatively different results	Quantum Orbitals — 1s / 2p / 3d
3	Direct Manipulation	Dragging objects; spatial relationships are the concept	Geometric Optics — drag lens/object
4	Freeform Construction	Build structure to observe emergent behavior	Neural Network — click-to-place neurons
5	Temporal Control	Concept has a time dimension; play/pause/scrub	Fourier Epicycles — play + harmonic slider
6	Inspection	Spatial structure revealed by hovering	Voronoi — hover highlights cell
7	Spatial Navigation	Inherently 3D; rotate/pan/zoom	Möbius Strip — drag to rotate 3D mesh
8	Scroll-driven Narrative	Linear progression reveals the concept	Entropy — scroll removes wall, particles mix

Reference implementations (self-contained HTML + SRTC spec + style guide) for all 8 categories: benchmark/datasets/interaction_examples/
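
For tooling that needs to refer to the categories by name, the table can be encoded as an enum. This is a hypothetical encoding for illustration; the class `InteractionCategory` and its `from_label` helper are not part of the documented code base.

```python
from enum import Enum


class InteractionCategory(Enum):
    # Each value is (number, when to use), copied from the table above.
    PARAMETER_EXPLORATION = (1, "Continuous variable has a nonlinear effect")
    STATE_SWITCHING = (2, "Discrete modes produce qualitatively different results")
    DIRECT_MANIPULATION = (3, "Dragging objects; spatial relationships are the concept")
    FREEFORM_CONSTRUCTION = (4, "Build structure to observe emergent behavior")
    TEMPORAL_CONTROL = (5, "Concept has a time dimension; play/pause/scrub")
    INSPECTION = (6, "Spatial structure revealed by hovering")
    SPATIAL_NAVIGATION = (7, "Inherently 3D; rotate/pan/zoom")
    SCROLL_DRIVEN_NARRATIVE = (8, "Linear progression reveals the concept")

    def __init__(self, number: int, when_to_use: str) -> None:
        self.number = number
        self.when_to_use = when_to_use

    @classmethod
    def from_label(cls, label: str) -> "InteractionCategory":
        """Map a table label such as 'Scroll-driven Narrative' to a member."""
        key = label.strip().upper().replace("-", "_").replace(" ", "_")
        return cls[key]


category = InteractionCategory.from_label("Temporal Control")
print(category.number, category.when_to_use)
```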

📚 Showcase

→ Browse all documents at vividoc.vercel.app

28 hand-verified documents across 10 domains, each generated by ViviDoc and reviewed for pedagogical accuracy:

Physics & Mathematics (3 documents)

Document	Interaction	Key Concept
Fourier Transform	Temporal Control	Epicycles, Gibbs phenomenon, signal decomposition
Lorenz Attractor	Parameter Exploration	Sensitive dependence, strange attractor, butterfly effect
Bézier Curves	Direct Manipulation	De Casteljau construction, convex hull property

Computer Science (3 documents)

Document	Interaction	Key Concept
Merge Sort	Temporal Control	Divide-and-conquer, O(n log n) guarantee
Neural Network	Freeform Construction	Forward propagation, activation functions
Voronoi Tessellation	Inspection	Nearest-neighbor partitioning, Delaunay duality

Biology & Medicine (3 documents)

Document	Interaction	Key Concept
Action Potential	Temporal Control	Hodgkin-Huxley ion channels, all-or-nothing threshold
DNA Replication	Temporal Control	Helicase, polymerase fidelity, Okazaki fragments
Geometric Optics	Direct Manipulation	Thin lens equation, real/virtual images

Statistics & Probability (3 documents)

Document	Interaction	Key Concept
Central Limit Theorem	Parameter Exploration	Convergence to normal, sample size effect
Quantum Orbitals	State Switching	Wave functions, electron probability clouds
Entropy	Scroll-driven Narrative	Thermodynamic irreversibility, Maxwell's demon

Economics & Game Theory (3 documents)

Document	Interaction	Key Concept
Supply & Demand	Direct Manipulation	Equilibrium, elasticity, deadweight loss
Black–Scholes	Parameter Exploration	Option pricing, the Greeks, IV smile
Prisoner's Dilemma	State Switching	Dominant strategy, iterated games, replicator dynamics

Mechanics & Waves (3 documents)

Document	Interaction	Key Concept
Double Pendulum	Direct Manipulation	Chaos, Lyapunov exponent, Poincaré section
Wave Interference	Parameter Exploration	Superposition, beats, standing waves
Spring–Mass System	Parameter Exploration	Damping regimes, resonance, phase lag

Chemistry (3 documents)

Document	Interaction	Key Concept
Reaction Kinetics	Parameter Exploration	Rate laws, Arrhenius equation, enzyme saturation
Molecular Orbitals	State Switching	Bonding vs antibonding, MO diagrams, HOMO-LUMO gap
Acid–Base Chemistry	Parameter Exploration	Henderson-Hasselbalch, titration curves, buffer capacity

Information Theory (3 documents)

Document	Interaction	Key Concept
Shannon Entropy	Parameter Exploration	Information content, source coding theorem
Huffman Coding	Freeform Construction	Prefix-free codes, optimal compression, entropy bound
Channel Capacity	Parameter Exploration	Shannon-Hartley theorem, AWGN channel, BSC

Machine Learning (2 documents)

Document	Interaction	Key Concept
Gradient Descent	Direct Manipulation	Loss landscapes, optimizers (SGD/Adam), learning rate schedules
Bias–Variance Tradeoff	Parameter Exploration	Overfitting, regularization, model complexity

Relativity & Spacetime (2 documents)

Document	Interaction	Key Concept
Time Dilation	Parameter Exploration	Lorentz factor, twin paradox, GPS corrections
Lorentz Transformation	Direct Manipulation	Minkowski diagram, simultaneity, length contraction

🗂️ Repository Structure

.claude/commands/        # Claude Code skills: /vividoc and /vividoc-learn
prompts/                 # LLM prompt templates (planner, executor, evaluator, styler, video)
vividoc/
├── core/                # Pipeline stages: planner, executor, evaluator, runner, styler
│   ├── video_codegen.py # Video generation: DocumentSpec → Manim scenes
│   └── narration_gen.py # Narration synthesis from SRTC T-field keyframes
├── utils/llm/           # LLM client + provider adapters (OpenRouter, Anthropic)
└── cli.py               # CLI entry points (run, plan, exec, eval, video)
benchmark/
├── datasets/
│   ├── interaction_examples/   # 8 reference cases (HTML + SRTC spec + style guide)
│   └── prepped/                # ViviBench — 101-topic evaluation dataset
├── baselines/                  # AutoGen, CAMEL, MetaGPT, naive baselines
└── evals/                      # Automated evaluation scripts
frontend/                # React showcase (vividoc.vercel.app)
docs/                    # Design docs, video generation roadmap
examples/                # Standalone demos (Manim video generation)

🔬 Development

uv sync --dev

# Run tests
uv run pytest

# Lint
uv run ruff check . && uv run ruff format .

# Run benchmark evaluation
uv run python benchmark/run.py

# Serve showcase locally
cd frontend && npm install && npm run dev

Adding a new LLM provider

1. Create vividoc/utils/llm/callers/<provider>_caller.py implementing LLMCaller.
2. Register it in vividoc/utils/llm/caller_registry.py.
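
The two steps might look roughly like the sketch below. The real `LLMCaller` interface and registry are not shown in this README, so everything here is an assumption: the `call` method, the `register` decorator, the `CALLER_REGISTRY` dict, and the toy `EchoCaller` are stand-ins that only illustrate the implement-then-register pattern. Check the existing callers for the actual method names.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class LLMCaller(ABC):
    """Stand-in for the project's LLMCaller; the real interface may differ."""

    def __init__(self, model: str) -> None:
        self.model = model

    @abstractmethod
    def call(self, prompt: str) -> str:
        """Send a prompt and return the model's text reply."""


# Hypothetical registry keyed by the first segment of the model string.
CALLER_REGISTRY: Dict[str, Type[LLMCaller]] = {}


def register(prefix: str) -> Callable[[Type[LLMCaller]], Type[LLMCaller]]:
    """Class decorator that records a caller under a provider prefix."""
    def decorator(cls: Type[LLMCaller]) -> Type[LLMCaller]:
        CALLER_REGISTRY[prefix] = cls
        return cls
    return decorator


@register("echo")
class EchoCaller(LLMCaller):
    """Toy provider that echoes the prompt; replace with a real API client."""

    def call(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"


def get_caller(model_string: str) -> LLMCaller:
    """Resolve '<prefix>/<model>' to a caller instance."""
    prefix, _, model = model_string.partition("/")
    if prefix not in CALLER_REGISTRY:
        raise ValueError(f"no caller registered for provider '{prefix}'")
    return CALLER_REGISTRY[prefix](model)


print(get_caller("echo/test-model").call("hello"))
```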

Adding a reference case

/vividoc-learn https://example.com/interactive-page my-case-name

Saves to benchmark/datasets/interaction_examples/my-case-name/ with SRTC spec, HTML, and style notes. Immediately available as a template for future /vividoc runs.

📄 Citation

If you use ViviDoc in your research, please cite:

@inproceedings{tang2026vividoc,
  title     = {{ViviDoc}: Generating Interactive Documents through Human-Agent Collaboration},
  author    = {Tang, Yinghao and Xie, Yupeng and Feng, Yingchaojie and Lan, Tingfeng and Lao, Jiale and Cheng, Yue and Chen, Wei},
  booktitle = {Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics: System Demonstrations},
  year      = {2026},
  url       = {https://arxiv.org/abs/2603.27991}
}

MIT License · ACL 2026 System Demonstrations · arXiv:2603.27991