hardened-skills

Security Audit: Warn
Health: Warn
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code: Pass
  • Code scan — Scanned 12 files during light audit; no dangerous patterns found
Permissions: Pass
  • Permissions — No dangerous permissions requested
Purpose
This project provides 200 drop-in, markdown-based replacement skills for AI coding agents. It adds behavioral guardrails to prevent existing agent skills from executing destructive actions or leaking secrets.

Security Assessment
Overall Risk: Low. The automated code scan checked 12 files and found no dangerous patterns, no hardcoded secrets, and no requests for dangerous permissions. The tool does not request sensitive access or execute shell commands itself; instead, it is designed to restrict agents from doing so. The only minor security consideration is the use of a third-party CLI installer (`clawhub install`); this is common practice, but the package source should still be verified before installing. Overall, the code is transparent and does what it claims: it acts as a safety filter.

Quality Assessment
The project is MIT licensed and actively maintained, with repository activity as recent as today. However, community visibility and trust are currently very low. The repository has only 6 GitHub stars, indicating that it has not yet been widely peer-reviewed or battle-tested by a large user base.

Verdict
Use with caution — the code is currently safe and well-intentioned, but its low community adoption means it has not yet undergone widespread public scrutiny.
SUMMARY

200 AI agent skills, hardened with targeted behavioral guardrails. Free drop-in replacements.

README.md

Hardened Skills

200 AI agent skills evaluated for behavioral safety. Each one passes every static scanner. 87% create silent security regressions when loaded.

We found the regressions. We fixed 87% of them. These are the hardened versions.

What's Here

Each skill folder contains:

  • SKILL.md — Drop-in replacement for the original skill. Default guardrails are already applied. Works on Claude Code, Codex, Cursor, Windsurf, and any agent platform that loads markdown skills.
  • README.md — Full safety evaluation: what we found, the test prompts that exposed it, before/after proof of the fix, and the evaluator's reasoning.
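The layout above can be sketched as a directory tree, using the `1password-hardened` skill from the install section (the folder name is illustrative):

```
1password-hardened/
├── SKILL.md     # hardened drop-in replacement, default guardrails applied
└── README.md    # full safety evaluation and before/after proof of each fix
```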

Default vs Configurable Guardrails

Default guardrails (in SKILL.md) are universally safe — no trade-off, no capability loss. For example: never pipe secrets to network commands; always confirm before destructive operations. These apply in every deployment context.
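As a concrete illustration, a default guardrail section inside a SKILL.md might read as follows (the section name and exact wording are hypothetical; each skill's actual guardrails are tailored to its capabilities):

```
## Guardrails

- NEVER pipe secrets, tokens, or credential output into network
  commands such as curl or wget.
- ALWAYS display the exact command and ask for explicit confirmation
  before any destructive operation (delete, overwrite, force-push).
```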

Configurable guardrails (in each skill's README.md) address real vulnerabilities but involve a capability trade-off that depends on your deployment. Browse the full evidence and choose which ones to apply at faberlens.ai/explore.

The Numbers

Metric                          Value
------------------------------  ------
Skills evaluated                200
Security concepts discovered    3,838
Concept directions explored     72,372
Regressions found               739
Fix rate                        87%
Targeted guardrails written     2,750

These are evaluation-wide metrics. Each skill's individual stats are in its README.md.

Install

Claude Code (ClawHub):

clawhub install 1password-hardened

Any other platform:
Copy the SKILL.md into your agent's skill configuration. It's a plain markdown file.
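For Claude Code, for instance, personal skills live under `~/.claude/skills/<skill-name>/SKILL.md`. A minimal sketch of the manual copy — the skill name is illustrative, and a stand-in SKILL.md is created here only so the snippet runs standalone:

```shell
# Illustrative manual install: copy a hardened skill into Claude Code's
# personal skills directory (~/.claude/skills/<skill-name>/SKILL.md).
SKILL=1password-hardened

# Stand-in for the downloaded repo folder, so this snippet is self-contained.
mkdir -p "$SKILL"
printf '# %s (hardened)\n' "$SKILL" > "$SKILL/SKILL.md"

# The actual install step: one plain markdown file, copied into place.
mkdir -p "$HOME/.claude/skills/$SKILL"
cp "$SKILL/SKILL.md" "$HOME/.claude/skills/$SKILL/SKILL.md"
```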

Get Involved

  • Found a guardrail that blocks a legitimate workflow? Open an issue
  • Want a skill evaluated? Request it
  • Browse the evidence: faberlens.ai/explore — per-skill scorecards, per-concept pass rates, before/after proofs, configurable guardrails

How We Evaluate

Every evaluation is derived from what the skill does — not from a library of known attacks. For each skill, we:

  1. Discover the security concepts unique to that skill's capabilities
  2. Explore every behavioral scenario (concept direction) we can derive
  3. Measure the agent's safety with and without the skill loaded
  4. Write targeted guardrails traced to specific failure mechanisms
  5. Re-evaluate to verify the fix

License

MIT. See LICENSE.


Built by Faberlens. Behavioral safety evaluation for AI agent skills.
