autoharness
Health Pass
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 17 GitHub stars
Code Pass
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
No AI report is available for this listing yet.
A self-learning skill layer for Claude Code — distills skills from your real sessions, updates them as you work, and prunes the ones that stop getting used. No daemon, no benchmark.
autoharness
autoharness is a self-learning skill layer for Claude Code. It learns skills from your real
sessions, merges same-scenario ones instead of stacking near-duplicates, updates them in use,
and prunes any that stop getting used — so the layer stays clean on its own, touching only
the skills it wrote itself.
Same model, different harness — 42% → 78% on CORE-Bench (HAL).
The harness does much of the work (swyx's Big Model vs Big Harness), yet it's still rebuilt by
hand every model generation. autoharness bets one slice of it — the skill layer — can maintain itself.
| Learns from real work | Each episode is distilled into a skill from the session you were already having — no separate data-collection or replay loop. |
| Groups, doesn't just pile up | A new episode doesn't always add a skill — the reflector compares it against what's there and folds same-scenario skills into one, so the layer consolidates by category instead of accreting near-duplicates. |
| Validated in use, not on a benchmark | A skill survives by being adhered to in later turns (invocation rate), not a held-out score. No oracle on the active path, and no tokens spent on a dedicated eval. |
| Only its own skills | Touches only the skills it generated through this plugin — everything else, whether you wrote it or installed it, is left completely alone. |
| Evidence kept for later | Every create/update logs its scenario and decision to a per-skill ledger — the raw material to build a benchmark from real usage if you ever want one. |
Install
Requires python3 on your PATH — autoharness runs entirely as Python (zero third-party
dependencies); its hooks and MCP server won't fire without it.
/plugin marketplace add tigerless-labs/autoharness
/plugin install autoharness@autoharness
Restart Claude Code, then check it's live:
/plugin # autoharness@autoharness — enabled
/mcp # stage_skill — connected
Zero config. It now watches your sessions and lands learned skills into .claude/skills/ in the
background. Cadence and lifecycle thresholds are tunable via AUTOHARNESS_* environment variables.
Uninstall
claude plugin uninstall autoharness@autoharness # stops the hooks + MCP server
claude plugin marketplace remove autoharness # optional — also drops the install source
Uninstalling only stops it from running — the skills it landed and its own state live outside the
plugin and stay on disk. To clear those too, delete its state dir (~/.claude/autoharness/ global,<repo>/.claude/autoharness/ per project) and the self-authored skills under .claude/skills/ (each
carries a self-authored ledger marker, so they're easy to tell from yours). Your own skills are
never touched.
How it works
A learning pipeline runs beside the host and stays off its recall path — symbols are plain native
skills, recalled by the host's own name-and-description mechanism as if a human had written them.
Diagram source: docs/assets/pipeline.mmd — re-render to pipeline.svg after editing.
| Component | Role |
|---|---|
| CAP · capture | Hook-driven dumb pipe: grabs each turn (user input, agent output, tool I/O), redacts at egress, points back at the host log instead of copying it. |
| REF · reflect | At an episode boundary, reads the existing skill index and decides add / merge / patch / delete — emits an intent (body or delta, plus reason and evidence). Proposes only; no write tools. |
| promoter · validate·store | The only writer. Lints the intent in memory (safety, structure, ledger, completeness, self-authored-only) and on pass does an atomic rename into the live skill directory. |
| MNG · lifecycle | Daemon-free. Ranks symbols by invocation rate per layer, shields new ones during probation, archives the weakest when a mature pool is over capacity. Archives, never deletes. |
| LED · ledger | Per-symbol append-only sidecar: why each symbol was born or changed, with evidence and a reflection watermark. Kept out of the skill body so recall stays clean. |
How it compares
A self-learning skill layer can be validated against a held-out benchmark, or against its own use.
autoharness takes the second — cheaper, and it works on a live host doing open-ended work where no
benchmark exists.
| Grow unbounded | Offline-gated self-edit (Self-Harness) |
Timer + daemon (hermes-agent) |
autoharness | |
|---|---|---|---|---|
| Bounds the skill layer | No | Yes | Yes | Yes |
| Validation signal | None | Held-out benchmark score | Wall-clock inactivity | Adherence in use |
| Needs a benchmark / oracle | No | Yes | No | No |
| Needs a resident daemon | No | No | Yes | No |
Acknowledgements
NousResearch/hermes-agent — studying its
auto-skill-creation and memory-consolidation design helped sharpen autoharness's adherence-based,
daemon-free take.
Built by Tigerless Labs.
License
Reviews (0)
Sign in to leave a review.
Leave a reviewNo results found