hardened-skills

Security Audit: Warn
Health: Warn
  • License — MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 6 GitHub stars
Code: Pass
  • Code scan — Scanned 12 files during light audit; no dangerous patterns found
Permissions: Pass
  • Permissions — No dangerous permissions requested
Purpose
This project provides 200 drop-in, markdown-based replacement skills for AI coding agents. It adds behavioral guardrails to prevent existing agent skills from executing destructive actions or leaking secrets.

Security Assessment
Overall Risk: Low. The automated code scan checked 12 files and found no dangerous patterns, no hardcoded secrets, and no requests for dangerous permissions. The tool does not request sensitive access or execute shell commands itself; instead, it is designed to restrict agents from doing so. The only minor security consideration is the use of a third-party CLI installer (`clawhub install`); this is common practice, but the package source should still be verified before installing. Overall, the code is transparent and does what it claims: it acts as a safety filter.

Quality Assessment
The project is MIT licensed and actively maintained, with repository activity as recent as today. However, community visibility and trust are currently very low. The repository has only 6 GitHub stars, indicating that it has not yet been widely peer-reviewed or battle-tested by a large user base.

Verdict
Use with caution — the code is currently safe and well-intentioned, but its low community adoption means it has not yet undergone widespread public scrutiny.
SUMMARY

200 AI agent skills, hardened with targeted behavioral guardrails. Free drop-in replacements.

README.md

Hardened Skills

200 AI agent skills evaluated for behavioral safety. Each one passes every static scanner. 87% create silent security regressions when loaded.

We found the regressions. We fixed 87% of them. These are the hardened versions.

What's Here

Each skill folder contains:

  • SKILL.md — Drop-in replacement for the original skill. Default guardrails are already applied. Works on Claude Code, Codex, Cursor, Windsurf, and any agent platform that loads markdown skills.
  • README.md — Full safety evaluation: what we found, the test prompts that exposed it, before/after proof of the fix, and the evaluator's reasoning.
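The layout above can be sketched as a directory tree, using the `1password-hardened` skill from the install section (the folder name is illustrative):

```
1password-hardened/
├── SKILL.md     # hardened drop-in replacement, default guardrails applied
└── README.md    # full safety evaluation and before/after proof of each fix
```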

Default vs Configurable Guardrails

Default guardrails (in SKILL.md) are universally safe — no trade-off, no capability loss. For example: never pipe secrets to network commands; always confirm before destructive operations. These apply in every deployment context.
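As a concrete illustration, a default guardrail section inside a SKILL.md might read as follows (the section name and exact wording are hypothetical; each skill's actual guardrails are tailored to its capabilities):

```
## Guardrails

- NEVER pipe secrets, tokens, or credential output into network
  commands such as curl or wget.
- ALWAYS display the exact command and ask for explicit confirmation
  before any destructive operation (delete, overwrite, force-push).
```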

Configurable guardrails (in each skill's README.md) address real vulnerabilities but involve a capability trade-off that depends on your deployment. Browse the full evidence and choose which ones to apply at faberlens.ai/explore.

The Numbers

Metric                          Value
------------------------------  ------
Skills evaluated                200
Security concepts discovered    3,838
Concept directions explored     72,372
Regressions found               739
Fix rate                        87%
Targeted guardrails written     2,750

These are evaluation-wide metrics. Each skill's individual stats are in its README.md.

Install

Claude Code (ClawHub):

clawhub install 1password-hardened

Any other platform:
Copy the SKILL.md into your agent's skill configuration. It's a plain markdown file.
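For Claude Code, for instance, personal skills live under `~/.claude/skills/<skill-name>/SKILL.md`. A minimal sketch of the manual copy — the skill name is illustrative, and a stand-in SKILL.md is created here only so the snippet runs standalone:

```shell
# Illustrative manual install: copy a hardened skill into Claude Code's
# personal skills directory (~/.claude/skills/<skill-name>/SKILL.md).
SKILL=1password-hardened

# Stand-in for the downloaded repo folder, so this snippet is self-contained.
mkdir -p "$SKILL"
printf '# %s (hardened)\n' "$SKILL" > "$SKILL/SKILL.md"

# The actual install step: one plain markdown file, copied into place.
mkdir -p "$HOME/.claude/skills/$SKILL"
cp "$SKILL/SKILL.md" "$HOME/.claude/skills/$SKILL/SKILL.md"
```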

Get Involved

  • Found a guardrail that blocks a legitimate workflow? Open an issue
  • Want a skill evaluated? Request it
  • Browse the evidence: faberlens.ai/explore — per-skill scorecards, per-concept pass rates, before/after proofs, configurable guardrails

How We Evaluate

Every evaluation is derived from what the skill does — not from a library of known attacks. For each skill, we:

  1. Discover the security concepts unique to that skill's capabilities
  2. Explore every behavioral scenario (concept direction) we can derive
  3. Measure the agent's safety with and without the skill loaded
  4. Write targeted guardrails traced to specific failure mechanisms
  5. Re-evaluate to verify the fix

License

MIT. See LICENSE.


Built by Faberlens. Behavioral safety evaluation for AI agent skills.
