SkillSpec

Skills that agents can actually follow

You wrote a good SKILL.md. But did the agent actually follow it, or skip the
late safety rule, grab an undeclared tool, and report "done" with no proof?

SkillSpec tells you. Run one command and get a risk report. Then turn any
skill into a contract the agent has to follow, with a record you can inspect at
the end.

No new agent runtime. No orchestration platform. Just a CLI and a small
skill.spec.yml that lives next to your SKILL.md.

See It In 30 Seconds

Point Doctor at any skill, a local folder or a public GitHub URL:

skillspec doctor ./my-skill

SkillSpec Doctor
================
Target: ./my-skill        Shape: simple_skill

Agent follow-through risk: HIGH (74/100)

Findings
- description is short and generic -> automatic discovery may be unreliable
- active skill load is 8,482 tokens -> above the balanced target
- 14 must/never obligations appear after 60% of the body -> easy to miss
- tools and commands are used, but dependencies are never declared
- no tests and no progress/trace surface -> "done" can't be checked

Likely consequence
An agent may follow the broad task but skip a late safety gate, use an
undeclared tool, or claim completion without evidence.

Next step
Ask your agent: /skillspec import ./my-skill, compile it, test it, install it,
and print the alignment summary.

No install required to try it. Paste a public skill URL into the hosted page:

https://skillspec.sh/

Why This Exists

A SKILL.md is just text. The harness loads it and hopes the model reads the
right part. For a throwaway skill, that can be fine. For a skill you rely on,
"hope" is not a plan:

Buried rules get skipped. The important "never do X" sits at line 400,
and models are most reliable at the start and end of context, not the middle.
Every miss grows the prose. Each failure becomes another paragraph, which
makes the next miss more likely.
You only see the final answer. There is no durable record of which route
ran, which steps happened, or what was skipped.

SkillSpec moves the load-bearing parts out of prose and into a small structured
contract:

when to use the skill
which route to take
what is forbidden
what dependencies must exist
what checks must pass
what proof should exist at the end

Install

Install the CLI:

curl -fsSL https://skillspec.sh/install.sh | sh
skillspec --version

Or with Cargo:

cargo install skillspec
skillspec --version

Then add the plugin to your harness.

Claude Code:

claude plugin marketplace add modiqo/skillspec --sparse .claude-plugin plugins/skillspec
claude plugin install skillspec@skillspec
claude plugin list

Codex:

codex plugin marketplace add modiqo/skillspec --ref main --sparse .agents --sparse plugins/skillspec
codex plugin add skillspec@skillspec

Other platforms, pinned releases, direct downloads, and local development

Prebuilt binaries are available on the
releases page:

skillspec-macos.tar.gz
skillspec-linux-x86_64.tar.gz
skillspec-windows-x86_64.zip

Release artifacts include .sha256 checksums. The installer verifies the
checksum and writes to ~/.local/bin by default.

Pin a version or choose an install directory:

curl -fsSL https://skillspec.sh/install.sh \
  | SKILLSPEC_VERSION=v0.1.0 SKILLSPEC_INSTALL_DIR="$HOME/.local/bin" sh

Install unreleased main:

cargo install --git https://github.com/modiqo/skillspec --package skillspec --force
skillspec --version

Install from a local checkout:

cargo install --path crates/skillspec-cli --force
skillspec --version

Local development can also install the skill folder directly:

# Codex
skillspec install skill skills/skillspec --target codex --retire-existing

# Agents
skillspec install skill skills/skillspec --target agents --retire-existing

# Claude local project
skillspec install skill skills/skillspec --target claude-local --retire-existing

Full install notes:
docs/install

The Loop: Assess -> Port -> Prove

Once the plugin is installed, ask your agent for the outcome in chat. SkillSpec
picks the commands and keeps the run aligned.

1. Assess a skill before you touch it.

/skillspec run doctor on ./my-skill

You get a baseline: discovery risk, context load, buried obligations,
undeclared dependencies, missing proof, and the likely consequence for agent
follow-through.

2. Port it into a contract.

/skillspec import ./my-skill, compile it for Codex, install it, and prove it

SkillSpec generates a skill.spec.yml next to your SKILL.md: routes, rules,
forbidden actions, dependencies, checks, tests, and proof expectations. It also
compiles a thin loader so the active prompt stays small.

3. Prove it ran the way it was supposed to.

Every run can leave an alignment summary you can read: selected route,
completed steps, missing proof, forbidden-action status, token usage, and wall
clock metrics when available. Not just "done" - a record.

Crowded skill library?

/skillspec install router

Router mode routes to the one skill that matters instead of making the harness
expose too many skills at once.

What SkillSpec Is, And Is Not

Four things you can do with it:

Import an existing prose SKILL.md into a structured SkillSpec contract.
Run a SkillSpec-backed skill in your harness, then review the alignment
and token report.
Route many skills through an explicit router when harness listing budgets
make discovery unreliable.
Capture durable execution traces and turn observed CLI/API/MCP work into
reusable skills. This path is powered by Rote.

It is	It is not
A contract that sits beside `SKILL.md`.	A replacement for skills.
A CLI that scores, ports, compiles, and records.	A new agent runtime or orchestration platform.
A way to make skills easier to compare across Codex, Claude, and Agents.	A promise that every harness will behave identically.
A run record you can audit after the task.	A security sandbox.

That last row matters. SkillSpec makes a run auditable: you can see what was
claimed and check it against the contract. Enforcement of tool boundaries is
still the harness's job.

Public Doctor Reports

Want to check a public skill before installing or porting it? Use the hosted
Doctor page:

https://skillspec.sh/

You can also open a
Doctor report request
with a public GitHub skill repo or folder URL. GitHub Actions validates the
target, runs skillspec doctor, comments with a Markdown report, and attaches
Markdown, HTML, JSON, and text artifacts.

Private repositories are not inspected by public Actions. For private skills,
install SkillSpec locally:

skillspec doctor /path/to/local/skill
skillspec doctor /path/to/local/skill --markdown > skillspec-doctor.md
skillspec doctor /path/to/local/skill --html > skillspec-doctor.html

Use Doctor as the baseline. Then ask your harness to import the skill:

/skillspec import <skill-repo-or-folder>, compile it, verify it, test it, and prove it. Print the alignment summary.

Publish the baseline report, generated skill.spec.yml, compiled loader, and
alignment report with the repo or pull request so reviewers can see both the
original skill risk and the proof after porting.

Why The Scores Are Credible

Doctor is not vibes. Every risk condition cites published work or local
SkillSpec methodology on how agents fail: context-position effects, effective
context limits, verifiable instruction following, process-level agent
evaluation, and skill-metadata routing.

The report is explicit about what is measured versus what is a policy
threshold. Start here:

The contract itself is a real spec: a typed Rust model, JSON Schema, reference
grammar, and
conformance suite.

Learn More

License

SkillSpec is dual-licensed under either:

You may choose either license. Contributions are accepted under the same dual
license unless explicitly stated otherwise.