statsclaw

agent
Security Audit
Warning
Health Warning
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code Passed
  • Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Passed
  • Permissions — No dangerous permissions requested
Purpose
This framework uses a coordinated team of specialized AI agents to help researchers and developers build, test, and document statistical software packages.

Security Assessment
The overall security risk is Low. A light code scan of 12 files found no dangerous patterns, hardcoded secrets, or requests for risky permissions. While the system delegates tasks such as writing code and pushing commits (via the "Shipper" agent) to AI models, it acts as a workflow orchestrator rather than a standalone background process. The framework emphasizes keeping a human domain expert in the loop to guide these actions. Standard network requests go primarily to your AI provider (such as Claude); no suspicious or hidden external network activity was detected.

Quality Assessment
The project is actively maintained, with its most recent repository push occurring today. It utilizes the permissive MIT license and includes clear documentation and contribution guidelines. However, community trust and visibility are currently very low. With only 6 GitHub stars, the tool is highly experimental and likely in its early stages. This means it probably lacks a large user base to independently test the software and find edge-case bugs.

Verdict
Use with caution — the tool appears safe from a code standpoint, but its early-stage, low-visibility nature makes it highly experimental for critical workflows.
SUMMARY

A workflow framework for statistical package development

README.md

StatsClaw

A workflow framework for statistical package development.

An open-source tool that helps researchers build, test, and document statistical software packages with AI agent teams.

Website · Roadmap · Contributing · Discussions


What is StatsClaw?

StatsClaw is a framework for Claude Code that uses AI agent teams to assist with statistical package development. You describe what you need — a bug fix, a new feature, a cross-language translation — and StatsClaw coordinates multiple AI agents to help you build, test, and document the result. It works best when a domain expert stays in the loop to guide decisions.


How It Works

StatsClaw orchestrates a team of 9 specialized AI agents, each operating under strict information isolation:

Agent — Role
Leader — Orchestrates the workflow, dispatches agents, enforces isolation
Planner — Reads your paper/formulas, executes deep comprehension protocol, produces specifications
Builder — Writes source code from spec.md (never sees the test spec)
Tester — Validates independently from test-spec.md (never sees the code spec)
Simulator — Runs Monte Carlo studies from sim-spec.md (never sees the code or test specs)
Scriber — Documents architecture, generates tutorials, maintains audit trail
Distiller — Extracts reusable knowledge for the shared brain (brain mode only)
Reviewer — Cross-checks all pipelines, audits tolerance integrity, issues ship/no-ship verdict
Shipper — Commits, pushes, opens PRs, handles package distribution

The code, test, and simulation pipelines are fully isolated — they never see each other's specs. If all pipelines converge independently, confidence in correctness is high. This is adversarial verification by design.


Multi-Pipeline Architecture

                      planner (bridge)
                     /    |          \
          spec.md   / test-spec.md    \  sim-spec.md
                   /      |            \
            builder ─ ─(parallel)─ ─ simulator
       (code pipeline)    |    (simulation pipeline)
                   \      |            /
      implementation.md   |   simulation.md
                    \     |          /
                     \    v         /
                       tester           <-- sequential, after merge-back
                    (test pipeline)
                         |
                      audit.md
                         |
                    scriber (recording)
                         |
                    distiller (brain mode only)
                         |
                    reviewer (convergence)
                          |
                        shipper

Key properties:

  • Planner is always mandatory — it bridges all pipelines
  • Builder handles code, scriber handles docs, simulator handles Monte Carlo studies — for docs-only requests, scriber replaces builder as implementer
  • Builder and simulator run in parallel (simulation workflows), then tester validates the merged result — each pipeline has its own isolated spec
  • Pipeline isolation is enforced — each pipeline never sees another's spec
  • Adversarial verification — if all pipelines converge independently, confidence is high
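The adversarial-verification idea above can be sketched as a simple tolerance check over independently produced results. This is an illustrative sketch, not StatsClaw's actual reviewer logic; the function name, result values, and tolerance are hypothetical:

```python
# Hypothetical sketch: three isolated pipelines report the same quantity;
# agreement within tolerance raises confidence, divergence blocks shipping.

def converged(code_result: float, test_result: float,
              sim_result: float, tol: float = 1e-8) -> bool:
    """Return True if all three independent pipeline results agree within tol."""
    results = [code_result, test_result, sim_result]
    return max(results) - min(results) <= tol

# Builder, tester, and simulator each estimate the same value independently.
print(converged(0.5, 0.5, 0.5))  # agreement -> ship
print(converged(0.5, 0.5, 0.7))  # divergence -> no-ship
```

Because each pipeline derives its answer from its own spec, agreement is evidence of correctness rather than shared error.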

Supported Languages

R Python Stata TypeScript Go Rust C C++

More languages coming — Julia is next! Want another? Let us know.


Quick Start

Prerequisites

  1. Claude Code — Install Claude Code
  2. GitHub access — Push access to your target repository
  3. Workspace repo — A GitHub repo for storing workflow artifacts (auto-created if needed)

Your First Task

Just tell StatsClaw what you want. It auto-detects the language, selects the right workflow, and starts working:

work on https://github.com/your-org/your-package and resolve the issues

StatsClaw will ask you clarification questions when it encounters ambiguity — your domain expertise guides the process. Results vary with task complexity; expect to iterate.


Workflow

Code:            leader → planner → builder → tester → scriber → [distiller]? → reviewer → shipper?
Docs-only:       leader → planner → scriber → reviewer → shipper?
Simulation+Code: leader → planner → [builder ∥ simulator] → tester → scriber → [distiller]? → reviewer → shipper?
Simulation-only: leader → planner → simulator → tester → scriber → [distiller]? → reviewer → shipper?

States: CREDENTIALS_VERIFIED → NEW → PLANNED → SPEC_READY → PIPELINES_COMPLETE → DOCUMENTED → [KNOWLEDGE_EXTRACTED]? → REVIEW_PASSED → READY_TO_SHIP → DONE

Signals: HOLD (ambiguous, ask user), BLOCK (validation failed), STOP (unsafe to ship)
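The state progression above can be sketched as an ordered list with a gated advance step. The state names come from the States line; the `advance` helper and its `skip_brain` flag are hypothetical illustrations, not StatsClaw's actual state machine:

```python
# Hypothetical sketch of the run state machine: states advance in order,
# and the optional KNOWLEDGE_EXTRACTED state is skipped outside brain mode.

STATES = [
    "CREDENTIALS_VERIFIED", "NEW", "PLANNED", "SPEC_READY",
    "PIPELINES_COMPLETE", "DOCUMENTED", "KNOWLEDGE_EXTRACTED",
    "REVIEW_PASSED", "READY_TO_SHIP", "DONE",
]

def advance(current: str, skip_brain: bool = True) -> str:
    """Move to the next state; KNOWLEDGE_EXTRACTED is optional (brain mode only)."""
    i = STATES.index(current)
    nxt = STATES[i + 1]
    if nxt == "KNOWLEDGE_EXTRACTED" and skip_brain:
        nxt = STATES[i + 2]
    return nxt

print(advance("DOCUMENTED"))                    # -> REVIEW_PASSED
print(advance("DOCUMENTED", skip_brain=False))  # -> KNOWLEDGE_EXTRACTED
```

In the real system each transition is a hard gate: the next state is reachable only after the required artifact exists and is verified.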


What Can StatsClaw Help With?

  • Implementing methods — Assists with translating specs into code. Limitation: requires researcher to validate mathematical correctness.
  • Cross-language translation — Handles R/Python idiom differences. Limitation: may miss subtle numerical edge cases without careful review.
  • Testing & validation — Independent test pipeline catches bugs tests miss. Limitation: empirical verification, not formal proofs.
  • Monte Carlo studies — Automates simulation harness and reporting. Limitation: researcher must design meaningful DGPs and metrics.
  • Paper-driven features — Reads methodology papers to design new functionality. Limitation: extracts concepts, not full estimator implementations.
  • Bug fixing — Adversarial architecture helps find hidden bugs. Limitation: complex domain bugs still need human insight.
  • Documentation — Generates Quarto books, API docs. Limitation: needs researcher review for accuracy.
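The Monte Carlo task can be made concrete with a minimal harness: draw repeatedly from a data-generating process, apply competing estimators, and report bias and RMSE. The DGP, estimators, and metrics below are illustrative stand-ins chosen for this sketch, not StatsClaw output:

```python
# Minimal Monte Carlo sketch: compare the sample mean and sample median
# as estimators of a normal location parameter.
import random
import statistics

def run_study(n_reps: int = 2000, n_obs: int = 50, true_mean: float = 1.0):
    estimates = {"mean": [], "median": []}
    rng = random.Random(42)  # seeded for reproducibility
    for _ in range(n_reps):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]
        estimates["mean"].append(statistics.fmean(sample))
        estimates["median"].append(statistics.median(sample))
    report = {}
    for name, ests in estimates.items():
        bias = statistics.fmean(ests) - true_mean
        rmse = statistics.fmean([(e - true_mean) ** 2 for e in ests]) ** 0.5
        report[name] = {"bias": round(bias, 4), "rmse": round(rmse, 4)}
    return report

print(run_study())
```

The harness is mechanical; as the table notes, the scientific content (a meaningful DGP and the right metrics) still has to come from the researcher.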

Example Prompts

# Fix a specific issue
fix issue #42 in my-package

# Build from scratch
build a Python package from this R code

# Cross-language migration
rewrite the Python backends in pure R and ship it

# Simulation study
run a Monte Carlo study comparing these three estimators

# Paper to package
build an R package from this PDF

# Paper-driven feature
read Correia (2016) and add network visualization to panelView

# Documentation
update the documentation for v2.0

# Contribute knowledge to the shared brain
/contribute

Learn by Example

We provide examples from our own usage. Each is a real repository you can inspect and learn from. Your mileage may vary — these represent what worked for us with active researcher involvement.

  • statsclaw/example-fect — Iterative refactoring (1 to 2): multi-day, researcher-guided refactoring of an R package
  • statsclaw/example-R2PY — Python from R source (0 to 1): building a Python package from an R reference
  • statsclaw/example-probit — Paper to package + Monte Carlo: PDF manuscript to R/C++ package + simulation
  • statsclaw/example-panelView — Paper-driven feature addition: reading a methodology paper to design a new feature

See the workspace example for the actual workflow artifacts produced during these examples.


What You Install

  • CLAUDE.md — orchestration policy (the authoritative reference)
  • agents/ — agent definitions (leader, planner, builder, tester, simulator, scriber, distiller, reviewer, shipper)
  • skills/ — shared protocol skills (credential-setup, isolation, handoff, mailbox, issue-patrol, profile-detection, brain-sync, privacy-scrub)
  • profiles/ — language-specific execution rules (R, Python, TypeScript, Stata, Go, Rust, C, C++)
  • templates/ — runtime artifact templates and repo scaffolding (brain-repo, brain-seedbank-repo)

Agent Teams is enabled at the project level through .claude/settings.json.


Runtime Layout

All runtime state lives inside the workspace repo, organized per target repository:

.repos/
├── <target-repo>/                    # target repo checkout
├── brain/                            # statsclaw/brain clone (brain mode only)
├── brain-seedbank/                   # statsclaw/brain-seedbank clone (brain mode only)
└── workspace/                        # workspace repo (GitHub)
    └── <repo-name>/                  # per-target-repo runtime + logs
        ├── context.md                # active project context
        ├── CHANGELOG.md              # timeline index of all runs (pushed)
        ├── HANDOFF.md                # active handoff (pushed)
        ├── ref/                      # reference docs for future work (pushed)
        ├── runs/
        │   └── <request-id>/         # per-run artifacts
        │       ├── credentials.md    # push access verification
        │       ├── request.md        # scope and acceptance criteria
        │       ├── status.md         # state machine
        │       ├── impact.md         # affected files and risk areas
        │       ├── comprehension.md  # comprehension verification (from planner)
        │       ├── spec.md           # code pipeline input (from planner)
        │       ├── test-spec.md      # test pipeline input (from planner)
        │       ├── sim-spec.md       # simulation pipeline input (from planner, workflows 11/12)
        │       ├── implementation.md # code pipeline output (from builder)
        │       ├── simulation.md     # simulation pipeline output (from simulator, workflows 11/12)
        │       ├── audit.md          # test pipeline output (from tester)
        │       ├── ARCHITECTURE.md   # from scriber (primary copy in target repo root)
        │       ├── log-entry.md      # process record (from scriber; promoted to runs/<date>-<slug>.md)
        │       ├── docs.md           # documentation changes (from scriber)
        │       ├── brain-contributions.md  # knowledge entries (from distiller, brain mode only)
        │       ├── review.md         # convergence verdict (from reviewer)
        │       ├── shipper.md        # ship actions (from shipper)
        │       ├── mailbox.md        # inter-teammate communication
        │       └── locks/            # write surface locks
        ├── logs/                     # diagnostic logs
        └── tmp/                      # transient data

Repository Layout

StatsClaw/
├── CLAUDE.md           # orchestration policy
├── README.md
├── agents/             # agent definitions (9 agents including distiller)
├── skills/             # shared protocol skills (13 skills including brain-sync, privacy-scrub)
├── profiles/           # language execution rules (8 languages)
├── templates/          # runtime artifact templates + repo scaffolding (brain-repo, brain-seedbank-repo)
└── .repos/             # target repo checkouts + workspace + brain repos (runtime state, git-ignored)

Workspace Repository

Workflow logs, process records, and handoff documents are NOT stored in target repos. Instead, they are synced to a user-specified workspace repository on GitHub (e.g., [username]/workspace):

workspace/
├── fect/
│   ├── CHANGELOG.md                # timeline index
│   ├── HANDOFF.md                  # active handoff
│   ├── ref/                        # reference docs for future work
│   │   └── cv-comparison-table.md
│   └── runs/                       # individual workflow logs
│       ├── 2026-03-16-cv-unification.md
│       └── 2026-03-17-convergence-conditioning.md
├── panelview/
│   ├── CHANGELOG.md
│   ├── HANDOFF.md
│   ├── ref/
│   └── runs/
│       └── 2026-03-17-add-feature.md
└── README.md

This keeps target repos clean (code + essential docs only) while preserving full traceability in one place.


Shared Brain

StatsClaw has a shared knowledge system where techniques discovered during workflows — mathematical methods, coding patterns, validation strategies, simulation designs — are extracted, privacy-scrubbed, and contributed to a collective knowledge base. When you enable Brain mode, your agents get smarter by reading knowledge contributed by all users.

How it works:

  1. Read — Your agents automatically access relevant knowledge entries from statsclaw/brain
  2. Contribute — After noteworthy workflows, the distiller agent extracts reusable knowledge. You review everything and approve or decline — nothing is shared without your explicit consent. You can also run the built-in /contribute command at any time to summarize what you learned (what worked, what required manual intervention, and what domain-specific patterns emerged) and submit it as a structured report.
  3. Earn badges — Accepted contributions earn virtual badges on the Contributors leaderboard

Privacy guarantee: All contributions are automatically scrubbed of repo names, file paths, usernames, proprietary code, and any identifying information. Only generic, reusable knowledge is shared.
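Scrubbing of this kind can be sketched as rule-based redaction applied before a knowledge entry leaves your machine. The patterns and placeholders below are hypothetical illustrations, not StatsClaw's actual privacy-scrub skill:

```python
# Hypothetical scrub sketch: redact repo URLs, GitHub handles, and file
# paths from a knowledge entry, leaving only generic placeholders.
import re

RULES = [
    (re.compile(r"github\.com/[\w.-]+/[\w.-]+"), "github.com/<org>/<repo>"),
    (re.compile(r"@[A-Za-z\d][A-Za-z\d-]{0,38}"), "@<user>"),
    (re.compile(r"(?:/[\w.-]+){2,}\.(?:R|py|cpp|do)\b"), "<path>"),
]

def scrub(text: str) -> str:
    for pattern, placeholder in RULES:
        text = pattern.sub(placeholder, text)
    return text

print(scrub("Fix by @alice in github.com/acme/secret-pkg, see /src/models/fit.R"))
# -> Fix by @<user> in github.com/<org>/<repo>, see <path>
```

A real scrubber would need many more rules (proprietary identifiers, email addresses, data values), but the shape is the same: deterministic redaction first, then human review before anything is submitted.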

  • statsclaw/brain — Curated knowledge; agents read from here
  • statsclaw/brain-seedbank — Contribution staging; users submit PRs here

Brain mode is optional — you choose at session start. See Brain System Documentation for full details.


Design Principles

  • Credentials first, work second. Verify push access before creating a run.
  • Team Leader dispatches, never does. Leader plans and coordinates; teammates do the work.
  • Multi-pipeline, fully isolated. Code, test, and simulation pipelines never see each other's specs.
  • Planner first, always. Every non-trivial request starts with dual-spec production.
  • Adversarial verification by design. Independent convergence proves correctness.
  • Hard gates, not soft advice. State transitions have preconditions; artifacts are verified.
  • Worktree isolation for writers. Builder, simulator, and scriber run in isolated git worktrees.
  • Surgical scope. Each run modifies only what the request requires.
  • Explicit ship actions. Nothing is pushed without user instruction or active patrol skill.
  • Collective knowledge, individual consent. Brain mode lets agents learn from all users, but nothing is shared without explicit per-workflow approval.

Citation

If you use StatsClaw in your research or software development, please cite our paper:

Qin, Tianzhu and Yiqing Xu. 2026. "StatsClaw: An AI-Collaborative Workflow for Statistical Software Development."

BibTeX:

@misc{qinxu2026statsclaw,
  title={StatsClaw: An AI-Collaborative Workflow for Statistical Software Development},
  author={Qin, Tianzhu and Xu, Yiqing},
  year={2026},
  howpublished = {Mimeo, Stanford University},
  url={https://bit.ly/statsclaw}
}

License

StatsClaw is released under the MIT License.


Get Involved

We are building StatsClaw in the open. Everyone is welcome.


statsclaw.ai

A tool for statisticians and econometricians. Works best with an expert in the loop.
