babysitter
Babysitter enforces obedience to agentic workforces and enables them to manage extremely complex tasks and workflows through deterministic, hallucination-free self-orchestration
https://github.com/user-attachments/assets/8c3b0078-9396-48e8-aa43-5f40da30c20b
Table of Contents
- What is Babysitter?
- Prerequisites
- Installation
- First Steps
- Quick Start
- How It Works
- Why Babysitter?
- Documentation
- Contributing
- Community and Support
- License
What is Babysitter?
Babysitter enforces obedience to agentic workforces, enabling them to manage extremely complex tasks and workflows through deterministic, hallucination-free self-orchestration. Define your workflow in code - Babysitter enforces every step, ensures quality gates pass before progression, requires human approval at breakpoints, and records every decision in an immutable journal. Your agents do exactly what the process permits, nothing more.
Prerequisites
- Node.js: Version 20.0.0+ (22.x LTS recommended)
- Claude Code: Latest version (docs)
- Git: For cloning (optional)
Installation
1. Install the Plugin
claude plugin marketplace add a5c-ai/babysitter
claude plugin install --scope user [email protected]
Then restart Claude Code.
2. Verify Installation
Type /skills in Claude Code to verify "babysit" appears.
Codex CLI Integration (babysitter-codex)
Codex support is available as a dedicated plugin bundle in:
plugins/babysitter-codex
It includes Codex hook wiring, slash command dispatch, and orchestration harness scripts compatible with the Babysitter SDK.
First Steps
After installation, set up your environment:
1. Configure Your Profile (One-Time)
/babysitter:user-install
This creates your personal profile with:
- Breakpoint preferences (how much oversight you want)
- Tool preferences and communication style
- Expertise areas for better process matching
2. Set Up Your Project
/babysitter:project-install
This analyzes your codebase and configures:
- Project-specific workflows
- Test frameworks and CI/CD integration
- Tech stack preferences
3. Verify Setup
/babysitter:doctor
Run diagnostics to confirm everything is working.
Quick Start
claude "/babysitter:call implement user authentication with TDD"
Or in natural language:
Use the babysitter skill to implement user authentication with TDD
Claude will create an orchestration run, execute tasks step-by-step, handle quality checks and approvals, and continue until completion.
Choose Your Mode
| Mode | Command | When to Use |
|---|---|---|
| Interactive | /babysitter:call | Learning, critical workflows - pauses for approval |
| Autonomous | /babysitter:yolo | Trusted tasks - full auto, no breakpoints |
| Planning | /babysitter:plan | Review process before executing |
| Continuous | /babysitter:forever | Monitoring, periodic tasks - runs indefinitely |
Utility Commands
| Command | Purpose |
|---|---|
| /babysitter:doctor | Diagnose run health and issues |
| /babysitter:observe | Launch real-time monitoring dashboard |
| /babysitter:resume | Continue an interrupted run |
| /babysitter:help | Documentation and usage help |
How It Works
+=============================================================================+
|                              /babysitter:call                               |
+=============================================================================+
|                                                                             |
|   YOUR PROCESS (JavaScript)                   This is the AUTHORITY         |
|   +----------------------------------------+                                |
|   | async function process(inputs, ctx) {  |  Real code, not config.        |
|   |                                        |  The orchestrator can ONLY     |
|   |   await ctx.task(plan, { ... });       |  do what this code permits.    |
|   |                                        |                                |
|   |   await ctx.breakpoint({               |  Breakpoints = human gates     |
|   |     question: 'Approve plan?'          |  (enforced, not optional)      |
|   |   });                                  |                                |
|   |                                        |                                |
|   |   await ctx.task(implement, { ... });  |  Tasks = executable work       |
|   |                                        |                                |
|   |   const score = await ctx.task(verify);|  Quality gates = code logic    |
|   |   if (score < 80)                      |  (not config, real checks)     |
|   |     await ctx.task(refine, { ... });   |                                |
|   | }                                      |                                |
|   +-------------------+--------------------+                                |
|                       |                                                     |
|                       | governs                                             |
|                       v                                                     |
|   +---------------------------------------------------------------------+   |
|   |                        ENFORCEMENT MECHANISM                        |   |
|   |                                                                     |   |
|   |  +-------------+     +------------------+     +-----------------+   |   |
|   |  |  MANDATORY  |---->|  PROCESS CHECK   |---->|    DECISION     |   |   |
|   |  |    STOP     |     | What does the    |     |                 |   |   |
|   |  |  (enforced  |     | process permit   |     | Permitted: next |   |   |
|   |  |  by hook)   |     | next?            |     | task assigned   |   |   |
|   |  +-------------+     +------------------+     |                 |   |   |
|   |                               |               | Blocked: halt   |   |   |
|   |                               v               | until gate      |   |   |
|   |                        +--------------+       | passes          |   |   |
|   |                        |  Gate/task   |       +-----------------+   |   |
|   |                        |  from code   |                             |   |
|   |                        +--------------+                             |   |
|   +---------------------------------------------------------------------+   |
|                                      |                                      |
|                                      | records every decision               |
|                                      v                                      |
|   +---------------------------------------------------------------------+   |
|   |  JOURNAL: Every task, gate, decision - immutable, replayable        |   |
|   +---------------------------------------------------------------------+   |
|                                                                             |
+=============================================================================+
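The process in the diagram can be tried out against a stand-in context. This is a minimal sketch, not the Babysitter SDK: `makeStubCtx`, `authProcess`, the string task names, and the return values are assumptions made for illustration. Only `ctx.task` and `ctx.breakpoint` come from the diagram above.

```javascript
// Hypothetical stand-in for the context passed to a process.
// It records calls in order; the real SDK context is not shown in this README.
function makeStubCtx(scores, approve = true) {
  const log = [];
  let verifyCalls = 0;
  return {
    log,
    async task(name, args = {}) {
      log.push(`task:${name}`);
      // "verify" returns the next scripted score; other tasks return nothing.
      if (name === "verify") return scores[verifyCalls++];
      return undefined;
    },
    async breakpoint({ question }) {
      log.push(`breakpoint:${question}`);
      // A rejected breakpoint throws, so nothing after it can run.
      if (!approve) throw new Error(`Rejected: ${question}`);
    },
  };
}

// Same shape as the process in the diagram: plan, human gate, implement,
// then a quality gate written as ordinary code.
async function authProcess(inputs, ctx) {
  await ctx.task("plan", { feature: inputs.feature });
  await ctx.breakpoint({ question: "Approve plan?" });
  await ctx.task("implement", { feature: inputs.feature });
  const score = await ctx.task("verify");
  if (score < 80) {
    await ctx.task("refine", { score });
  }
  return { score };
}

// Run once with a scripted verify score of 72, which triggers "refine".
async function demo() {
  const ctx = makeStubCtx([72]);
  const result = await authProcess({ feature: "user authentication" }, ctx);
  return { result, log: ctx.log };
}

demo().then(({ log }) => console.log(log.join("\n")));
```

Because the gate is plain control flow, the only way to reach `implement` in this sketch is through an approved breakpoint.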
The difference from simple iteration:
- Process as Code: Your workflow is JavaScript - the orchestrator can ONLY do what this code permits
- Mandatory Stop: Claude cannot "keep running" - every step ends with a forced stop, then the process decides what's next
- Enforcement, not Assistance: Gates block progression until satisfied - they're not suggestions
- Event-Sourced Journal: All state in .a5c/runs/ - deterministic replay and resume from any point
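The event-sourcing idea behind the journal can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the event names (`TASK_COMPLETED`, `GATE_OPENED`, and so on), the event shape, and the `createJournal` and `replay` helpers are hypothetical, and the actual on-disk format under .a5c/runs/ is not described here.

```javascript
// Hypothetical append-only journal kept in memory for illustration.
function createJournal() {
  const events = [];
  return {
    append(type, data) {
      // Events are shallowly frozen and only ever appended, never edited.
      const event = Object.freeze({ seq: events.length + 1, type, data });
      events.push(event);
      return event;
    },
    all() {
      // Return a copy so callers cannot reorder or drop recorded events.
      return events.slice();
    },
  };
}

// Replay folds the event list into the current run state. The same events
// always produce the same state, which is what makes resuming possible.
function replay(events) {
  const state = { completed: [], pendingGate: null, status: "running" };
  for (const e of events) {
    if (e.type === "TASK_COMPLETED") state.completed.push(e.data.task);
    else if (e.type === "GATE_OPENED") state.pendingGate = e.data.question;
    else if (e.type === "GATE_APPROVED") state.pendingGate = null;
    else if (e.type === "RUN_COMPLETED") state.status = "done";
  }
  return state;
}

const journal = createJournal();
journal.append("TASK_COMPLETED", { task: "plan" });
journal.append("GATE_OPENED", { question: "Approve plan?" });

// Imagine the session ends here. A later session rebuilds state from the
// recorded events alone and sees that a gate is still waiting.
const resumed = replay(journal.all());
console.log(resumed);
```

State is never stored directly in this design. It is always derived from the events, so an interrupted run can pick up at the exact gate or task where it stopped.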
Why Babysitter?
| Traditional Approach | Babysitter |
|---|---|
| Run script once, hope it works | Process enforces quality gates before completion |
| Manual approval via chat | Structured breakpoints with context |
| State lost on session end | Event-sourced, fully resumable |
| Single task execution | Parallel execution, dependencies |
| No audit trail | Complete journal of all events |
| Ad-hoc workflow | Deterministic, code-defined processes |
Key differentiators: Process enforcement, deterministic replay, quality convergence, human-in-the-loop breakpoints, and parallel execution.
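Quality convergence can be pictured as a verify-and-refine loop with a hard cap. The sketch below is one possible shape, not the library's implementation: `converge`, `fakeCtx`, the threshold of 80, and the iteration cap are assumptions, and `ctx.task` simply mirrors the usage in the How It Works diagram.

```javascript
// Verify, refine, and verify again until the score clears the threshold
// or the iteration cap is reached, so the loop always terminates.
async function converge(ctx, { threshold = 80, maxIterations = 5 } = {}) {
  let score = await ctx.task("verify");
  let iterations = 0;
  while (score < threshold && iterations < maxIterations) {
    await ctx.task("refine", { score });
    score = await ctx.task("verify");
    iterations += 1;
  }
  return { score, iterations, passed: score >= threshold };
}

// Hypothetical fake context: "verify" returns scripted scores in order and
// repeats the last one once the script runs out.
function fakeCtx(scores) {
  let i = 0;
  return {
    async task(name) {
      if (name !== "verify") return undefined;
      const score = scores[Math.min(i, scores.length - 1)];
      i += 1;
      return score;
    },
  };
}

// Scores improve across two refinements: 60, then 74, then 88.
converge(fakeCtx([60, 74, 88])).then((result) => console.log(result));
```

The cap matters as much as the threshold: a gate that can never pass should end in a reported failure, not an endless loop.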
Documentation
Getting Started
Features
- Process Library - 2,000+ pre-built processes
- Process Definitions
- Quality Convergence
- Run Resumption
- Journal System
- Best Practices
- Architecture Overview
Reference
Contributing
We welcome contributions! Here's how you can help:
- Report bugs: GitHub Issues
- Suggest features: Share your ideas for improvements
- Submit pull requests: Fix bugs or add features
- Improve documentation: Help make docs clearer
See CONTRIBUTING.md for detailed guidelines.
Community and Support
- Discord: Join our community (GitHub invite link)
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions and share ideas
- npm: @a5c-ai/babysitter-sdk
Community Tools
| Tool | Description |
|---|---|
| Observer Dashboard | Real-time monitoring UI for parallel runs |
| Telegram Bot | Control sessions remotely |
| vibe-kanban | Parallel process management |
License
This project is licensed under the MIT License. See LICENSE.md for details.
Compression
Babysitter includes a 4-layer token compression subsystem (built into packages/sdk/) that reduces context window usage by 50–67% on real sessions while maintaining 99% fact retention.
All compression hooks are automatically registered by the babysitter plugin — no manual settings.json configuration needed. Install the plugin and compression is active.
How It Works
| Layer | Hook | Engine | Content | Reduction |
|---|---|---|---|---|
| 1a | userPromptHook | density-filter | User prompts | ~29% |
| 1b | commandOutputHook | command-compressor | Bash/shell output | ~47% avg |
| 2 | sdkContextHook | sentence-extractor | Agent/task context | ~87% |
| 3 | processLibraryCache | sentence-extractor | Library files (pre-cached) | ~94% |
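To get a feel for what these percentages mean in tokens, here is a rough back-of-the-envelope sketch. The reduction figures are copied from the table above. The helper names and the example token counts are made up for illustration and are not measurements.

```javascript
// Approximate reductions copied from the table above, keyed by hook name.
const REDUCTION = {
  userPromptHook: 0.29,
  commandOutputHook: 0.47,
  sdkContextHook: 0.87,
  processLibraryCache: 0.94,
};

// Estimate tokens left after a layer compresses content of its own type.
function estimateTokensAfter(hook, tokensBefore) {
  if (!Object.hasOwn(REDUCTION, hook)) throw new Error(`Unknown hook: ${hook}`);
  return Math.round(tokensBefore * (1 - REDUCTION[hook]));
}

// Each layer handles a different kind of content, so a session estimate is
// a sum over content types, not a product of the four percentages.
function estimateSession(tokensByHook) {
  let before = 0;
  let after = 0;
  for (const [hook, tokens] of Object.entries(tokensByHook)) {
    before += tokens;
    after += estimateTokensAfter(hook, tokens);
  }
  return { before, after, reduction: 1 - after / before };
}

// Hypothetical session mix (illustrative numbers only).
const example = estimateSession({
  userPromptHook: 2000,
  commandOutputHook: 10000,
  sdkContextHook: 8000,
});
console.log(example.before, example.after, example.reduction.toFixed(3));
```

The overall figure depends heavily on the mix: a session dominated by shell output lands closer to the Layer 1b number than to the Layer 2 or 3 numbers.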
Quick Toggle
# Disable all compression
export BABYSITTER_COMPRESSION_ENABLED=false
# Disable a single layer
babysitter compression:toggle sdkContextHook off
# Show current effective config
babysitter compression:config
Config File
Edit .a5c/compression.config.json to persist settings (env vars always take priority):
{
  "enabled": true,
  "layers": {
    "userPromptHook": { "enabled": true, "threshold": 500, "keepRatio": 0.78 },
    "commandOutputHook": { "enabled": true, "excludeCommands": ["jq", "curl", "docker"] },
    "sdkContextHook": { "enabled": true, "targetReduction": 0.15, "minCompressionTokens": 150 },
    "processLibraryCache": { "enabled": true, "targetReduction": 0.35, "ttlHours": 24 }
  }
}
Toggle any layer with babysitter compression:toggle <layer> <on|off> or set individual values with babysitter compression:set <key> <value>.
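The precedence rule (env vars over the config file) can be sketched as follows. This is an assumption-laden illustration: `resolveCompression` is a hypothetical helper, and treating only the literal string "false" as "disabled" is a guess about parsing, not documented behavior. `BABYSITTER_COMPRESSION_ENABLED` and the config keys are taken from the examples above.

```javascript
// Resolve which layers are active, given a parsed config object and an
// environment map. The environment variable, when set, wins over the file.
function resolveCompression(fileConfig, env) {
  const fromEnv = env.BABYSITTER_COMPRESSION_ENABLED;
  const enabled =
    fromEnv !== undefined ? fromEnv !== "false" : fileConfig.enabled !== false;

  const layers = {};
  for (const [name, layer] of Object.entries(fileConfig.layers ?? {})) {
    // A layer runs only if compression is on globally and the layer is on.
    layers[name] = enabled && layer.enabled !== false;
  }
  return { enabled, layers };
}

// Trimmed-down version of the config file shown above.
const fileConfig = {
  enabled: true,
  layers: {
    userPromptHook: { enabled: true },
    sdkContextHook: { enabled: false },
  },
};

// No env override: the file decides.
console.log(resolveCompression(fileConfig, {}));
// Env override: everything is off regardless of the file.
console.log(resolveCompression(fileConfig, { BABYSITTER_COMPRESSION_ENABLED: "false" }));
```

Passing the environment in as an argument keeps the sketch easy to test. In a real shell session the second case corresponds to running `export BABYSITTER_COMPRESSION_ENABLED=false`.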
Built with Claude by A5C AI