AI-skills-bank
AI Skills Bank is a unified, multi-tool platform designed to aggregate, manage, and route AI skills across various workflows and AI assistants (such as Antigravity, Claude Code, Cursor, and Copilot).
Prerequisites
- Rust 1.70+ (Install)
- Git (for repository cloning)
- ~2GB disk space (for aggregated skills cache)
📖 Overview
skills-bank aggregates skills (workflows, tasks, specialized agents) from 100+ distributed repositories and provides a unified routing system for AI agents to discover, load, and invoke them efficiently.
Core Design Principles
- Source-of-Truth Loading: Agents load canonical `SKILL.md` files directly from source repositories, not from catalogs. This reduces hallucination risk and token usage.
- Hybrid Classification: A dual-stage pipeline combines fast keyword rules (Step A) with LLM-powered semantic classification (Step B) to route skills into 12 domain hubs and 40+ sub-hubs.
- Smart Deduplication: Skills are deduplicated by name OR description — catching both exact collisions and cross-repo clones with different names but identical content.
- Multi-Tool Support: Skills sync to major AI tools including GitHub Copilot, Claude-code, free-code (claude-code), Hermes, Cursor, Gemini, Antigravity, OpenCode, Codex, and Windsurf.
- Token Efficiency: Load minimal metadata first, then source files on-demand—not batch-loading entire catalogs.
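The two-key deduplication principle can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Skill` struct, the `normalize` rules, and the choice to keep the first occurrence are all assumptions.

```rust
use std::collections::HashSet;

/// Minimal skill record; field names are illustrative, not the project's types.
#[derive(Debug, Clone, PartialEq)]
pub struct Skill {
    pub name: String,
    pub description: String,
}

/// Lowercase and collapse whitespace so trivially different copies compare equal.
/// (Assumed normalization; the real rules may differ.)
fn normalize(text: &str) -> String {
    text.split_whitespace()
        .map(|word| word.to_lowercase())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Keep the first occurrence of each skill and drop any later skill whose
/// normalized name OR normalized description was already seen.
/// Empty descriptions are never treated as a match.
pub fn dedup_skills(skills: Vec<Skill>) -> Vec<Skill> {
    let mut seen_names: HashSet<String> = HashSet::new();
    let mut seen_descriptions: HashSet<String> = HashSet::new();
    let mut kept = Vec::new();

    for skill in skills {
        let name_key = normalize(&skill.name);
        let description_key = normalize(&skill.description);

        let name_seen = seen_names.contains(&name_key);
        let description_seen =
            !description_key.is_empty() && seen_descriptions.contains(&description_key);
        if name_seen || description_seen {
            continue; // duplicate by either key
        }

        seen_names.insert(name_key);
        if !description_key.is_empty() {
            seen_descriptions.insert(description_key);
        }
        kept.push(skill);
    }
    kept
}
```

Checking both sets before inserting into either means a skill dropped for a name collision does not register its description as seen.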
🚀 Quick Start
1. Build the CLI
cd skills-bank/
cargo build --release
cargo run --release -- aggregate
2. Run the Full Pipeline
# Interactive setup (first run)
cargo run --release
# Or run all steps in sequence
cargo run --release -- run
Example Workflows
First-time setup:
cargo run --release -- setup
cargo run --release -- run
Validate before production sync:
cargo run --release -- doctor
cargo run --release -- release-gate
cargo run --release -- sync
The setup command launches an interactive wizard to configure:
- Where skills should be synced (global, workspace, or both)
- Which AI tools to sync to
- Repository URLs to clone and aggregate
- Excluded categories
🎮 Commands Reference
Core Pipeline Commands
| Command | Purpose | When to Use |
|---|---|---|
| `aggregate` | Collect, deduplicate, classify, and route skills from configured repositories to `skills-aggregated/` | First run or when repositories change |
| `sync` | Distribute aggregated skills to configured AI tool directories | After aggregation completes |
| `run` | Execute the full pipeline (aggregate → sync) in sequence | Daily updates or automated workflows |
| `setup` | Configure sync targets, repositories, and exclusions interactively | Initial setup only |
| `add-repo <URL>` | Add a new skill repository to the configuration | When onboarding new sources |
| `doctor` | Validate installation and report repository state | Troubleshooting or pre-cleanup inspection |
| `release-gate` | Validate aggregation output integrity | Before releases or production sync |
| `cleanup-legacy-duplicates` | Remove legacy repository folders from `src/` or `repos/` (only if matching `lib/` exists) | Migration from older versions |
📁 Project Structure
Source Code & Configuration
- src/ — Rust source code: TUI, fetcher, aggregator, sync engine, classification logic
- Cargo.toml — Rust manifest (dependencies, metadata, build targets)
- .skills-bank-cli-config.json — User configuration file (generated by `setup`, contains sync targets and repository URLs)
- .env-example — Environment variable template
Generated Outputs (After Aggregation)
- skills-aggregated/ — Single source of truth containing:
  - `routing.csv` — Skill-to-hub/sub-hub routing table
  - `subhub-index.json` — Hub and sub-hub registry
  - `hub-manifests.csv` — Master index of all skills
  - `.skill-lock.json` — Aggregation metadata and timestamps
  - Per-hub directories with `skills-manifest.json` files
Repository Cache
- lib/ — Canonical cache for cloned skill repositories (populated by the `aggregate` command)
Testing & Documentation
- tests/ — Integration test suite for pipeline and TUI
- archive/ — Legacy PowerShell scripts (original PoC phase)
- package.json — Node.js manifest for `npx` distribution
- readme.md — This file
📁 Repository Management
Cloning & Caching
Cache Location: lib/ (not src/) — This is the canonical directory for all cloned repositories.
Clone Strategy:
- First clone: Shallow clone with `git clone --depth 1 --single-branch --no-tags` (faster, smaller disk footprint)
- Subsequent runs: `git pull` in existing directories (avoid re-cloning)
- Deduplication: Normalized remote URLs and repository names prevent duplicate clones
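As a rough sketch of the clone strategy, the helpers below normalize a remote URL and pick the git arguments for a first clone versus a later update. The function names and the normalization rules (trim, drop a trailing slash and `.git`, lowercase) are assumptions for illustration; the git flags are the ones listed above.

```rust
use std::path::Path;

/// Normalize a remote URL so equivalent spellings map to one cache key.
/// Assumed rules: trim whitespace, drop a trailing '/', drop a trailing ".git",
/// lowercase. Lowercasing suits case-insensitive hosts and may be too
/// aggressive for others.
pub fn normalize_remote_url(url: &str) -> String {
    let trimmed = url.trim().trim_end_matches('/');
    let without_suffix = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    without_suffix.to_lowercase()
}

/// Build the git argument list for one repository:
/// a shallow clone when it is not cached yet, `git pull` otherwise.
pub fn git_args(url: &str, target: &Path, already_cloned: bool) -> Vec<String> {
    let target = target.to_string_lossy().into_owned();
    if already_cloned {
        // `git -C <dir> pull` runs the pull inside the cached directory.
        vec!["-C".into(), target, "pull".into()]
    } else {
        vec![
            "clone".into(),
            "--depth".into(),
            "1".into(),
            "--single-branch".into(),
            "--no-tags".into(),
            url.to_string(),
            target,
        ]
    }
}
```

The returned arguments can be handed to `std::process::Command::new("git").args(...)`; keeping the decision separate from process spawning makes it easy to unit test.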
Speed Optimization:
- Parallel cloning via configurable `PARALLEL_JOBS`
- Shallow clones reduce disk I/O by ~80% vs. full clones
- Incremental updates via `git pull`
Legacy Repository Cleanup
If you have repositories in older locations (src/ or repos/), migrate them:
# Inspect current state
cargo run --release -- doctor
# Remove legacy folders (safe: only deletes if matching lib/ exists and Git remote matches)
cargo run --release -- cleanup-legacy-duplicates
⚠️ Warning: This is destructive. Always run doctor first to inspect repository state.
⚙️ Output Files & Configuration
Generated during aggregation into skills-aggregated/:
| File | Purpose |
|---|---|
| `routing.csv` | Skill-to-hub/sub-hub mappings (name, hub, sub-hub, src_path) |
| `subhub-index.json` | Complete hub and sub-hub registry |
| `hub-manifests.csv` | Master index of all skills across all hubs |
| `.skill-lock.json` | Aggregation metadata (timestamps, repo revisions, dedup stats) |
| `[hub]/[sub-hub]/skills-manifest.json` | Per-sub-hub skill metadata and LLM classification triggers |
These files are used by agents and the TUI for discovery and routing.
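For tooling that consumes `routing.csv`, a reader might look like the sketch below. It follows the column order listed for that file (name, hub, sub-hub, src_path) and assumes a header row and plain comma-separated fields with no quoting; the struct and function names are hypothetical, and a production reader would more likely use a CSV library.

```rust
/// One row of routing.csv, in the documented column order.
#[derive(Debug, PartialEq)]
pub struct RoutingRow {
    pub name: String,
    pub hub: String,
    pub sub_hub: String,
    pub src_path: String,
}

/// Parse a single data line. Assumes exactly four comma-separated fields
/// with no quoting or embedded commas; returns None otherwise.
pub fn parse_routing_line(line: &str) -> Option<RoutingRow> {
    let fields: Vec<&str> = line.split(',').map(str::trim).collect();
    match fields.as_slice() {
        [name, hub, sub_hub, src_path] if !name.is_empty() => Some(RoutingRow {
            name: name.to_string(),
            hub: hub.to_string(),
            sub_hub: sub_hub.to_string(),
            src_path: src_path.to_string(),
        }),
        _ => None,
    }
}

/// Parse a whole file's contents, skipping the (assumed) header row,
/// blank lines, and malformed rows.
pub fn parse_routing_csv(contents: &str) -> Vec<RoutingRow> {
    contents
        .lines()
        .skip(1)
        .filter(|line| !line.trim().is_empty())
        .filter_map(parse_routing_line)
        .collect()
}
```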
🌐 Environment Variables
Copy .env-example to .env to override defaults:
cp .env-example .env
See .env-example for all available options.
🎯 Tool Integration Targets
Sync skills to any of these destinations:
| Tool | Project | Global |
|---|---|---|
| Claude | `.claude/skills/` | `~/.claude/skills/` |
| free-code (claude-code) | `.free-code-config/skills/` | `~/.free-code-config/skills/` |
| Hermes | `.hermes/skills/` | `~/.hermes/skills/` |
| Code (Codex) | `.agents/skills/` | `~/.agents/skills/` |
| GitHub Copilot | `.github/skills/` | `~/.copilot/skills/` |
| Cursor | `.cursor/skills/` | `~/.cursor/skills/` |
| Gemini | `.gemini/skills/` | `~/.gemini/skills/` |
| Antigravity | `.agent/skills/` | `~/.gemini/antigravity/skills/` |
| OpenCode | `.opencode/skills/` | `~/.config/opencode/skills/` |
| Windsurf | `.windsurf/skills/` | `~/.codeium/windsurf/skills/` |
🏗️ Classification Architecture
The aggregation pipeline processes 8000+ SKILL.md files through a multi-stage classification system:
SKILL.md files (8000+)
│
▼
┌──────────────┐
│ YAML Parse │ Extract name, description, triggers
└──────┬───────┘
│
▼
┌──────────────┐
│ Keyword │ Fast token-based routing to hub/sub-hub
│ Rules │ (fallback if LLM unavailable)
└──────┬───────┘
│
▼
┌──────────────┐
│ Dedup │ Name OR Description HashSet
│ (two-key) │ Catches cross-repo clones
└──────┬───────┘
│
▼
┌──────────────────────────────────┐
│ Hybrid Exclusion + LLM Classify │
│ Step A: Keyword pre-filter │
│ Step B: LLM semantic classify │
│ (can return "excluded") │
└──────┬───────────────────────────┘
│
▼
┌──────────────┐
│ Output │ routing.csv, per-hub manifests,
│ Artifacts │ skills-index.json
└──────────────┘
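The first stage of the diagram (extracting name, description, and triggers) could be approximated as below. This sketch assumes each `SKILL.md` starts with a front matter block delimited by `---` and holding flat `key: value` lines, and that triggers appear as an inline list; the `SkillMeta` type and `parse_front_matter` function are hypothetical, and the real pipeline presumably relies on a proper YAML parser.

```rust
/// Metadata pulled from a SKILL.md front matter block (field set taken from the diagram).
#[derive(Debug, Default, PartialEq)]
pub struct SkillMeta {
    pub name: String,
    pub description: String,
    pub triggers: Vec<String>,
}

/// Extract `name`, `description`, and `triggers` from a leading `---` block.
/// Handles only flat `key: value` lines and an inline `[a, b]` trigger list.
pub fn parse_front_matter(markdown: &str) -> Option<SkillMeta> {
    let mut lines = markdown.lines();
    if lines.next()?.trim() != "---" {
        return None; // no front matter at the top of the file
    }

    let mut meta = SkillMeta::default();
    for line in lines {
        let line = line.trim();
        if line == "---" {
            return Some(meta); // closing delimiter found
        }
        if let Some((key, value)) = line.split_once(':') {
            let value = value.trim().trim_matches('"');
            match key.trim() {
                "name" => meta.name = value.to_string(),
                "description" => meta.description = value.to_string(),
                "triggers" => {
                    meta.triggers = value
                        .trim_matches(|c: char| c == '[' || c == ']')
                        .split(',')
                        .map(|item| item.trim().trim_matches('"').to_string())
                        .filter(|item| !item.is_empty())
                        .collect();
                }
                _ => {} // ignore keys this sketch does not model
            }
        }
    }
    None // front matter was never closed
}
```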
🔍 Classification Improvements (v2.0+)
The keyword-based classification system includes three critical enhancements to eliminate false negatives and resolve sub-hub conflicts:
1. Repository Name Extraction (Substring Matching)
Problem: Repository names like mukul975-anthropic-cybersecurity-skills were not being matched because the system used exact token matching (e.g., only matching the token "security", not the full repo name).
Solution: Introduced infer_hub_from_repo_name() function that:
- Extracts the repository directory name from the path (the segment right after `lib/` or `src/`)
- Uses substring matching to catch domain signals (e.g., `"cybersecurity-skills"` → matches `"security"`)
- Runs before other inference logic (highest priority)
Confidence Score: 98% (near-deterministic, reflects author intent)
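A minimal sketch of how such a function might work is shown below. Only the function name `infer_hub_from_repo_name`, the `lib/` or `src/` rule, the `security` signal, and the 98 score come from this document; the signal table, its second row, and the `repo_dir_name` helper are placeholders.

```rust
/// Assumed signal table: (substring, hub, sub_hub). Only the "security" row
/// mirrors the documented example; the second row is a placeholder.
const REPO_NAME_SIGNALS: &[(&str, &str, &str)] = &[
    ("security", "code-quality", "security"),
    ("testing", "code-quality", "testing-qa"),
];

/// Return the path segment immediately after the first `lib` or `src` segment.
pub fn repo_dir_name(path: &str) -> Option<&str> {
    let mut segments = path
        .split(|c: char| c == '/' || c == '\\')
        .filter(|segment| !segment.is_empty());
    while let Some(segment) = segments.next() {
        if segment == "lib" || segment == "src" {
            return segments.next();
        }
    }
    None
}

/// Substring-match the repository directory name against the signal table.
/// Returns (hub, sub_hub, confidence) for the first matching row.
pub fn infer_hub_from_repo_name(path: &str) -> Option<(&'static str, &'static str, u8)> {
    let repo = repo_dir_name(path)?.to_lowercase();
    REPO_NAME_SIGNALS
        .iter()
        .find(|(needle, _, _)| repo.contains(*needle))
        .map(|&(_, hub, sub_hub)| (hub, sub_hub, 98))
}
```

Because the match is a plain substring test, row order in the table decides which signal wins when a repository name contains more than one.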
2. Sub-Hub Conflict Resolution
Problem: When a skill matched multiple sub-hubs (e.g., python AND security simultaneously), language hubs often won due to their anchor keywords, defeating domain-specialist classification.
Solution: Introduced conflict resolution table (CONFLICT_RESOLUTION) that:
- Defines precedence rules when multiple sub-hubs match: `(losing_hub, losing_sub_hub, winning_hub, winning_sub_hub)`
- Ensures domain specialists always win over languages:
  - `security` > `python` | `javascript` | `typescript` | `rust` | `golang` | `java`
  - `testing-qa` > `python` | `javascript` | `typescript` | `rust`
  - `code-review` > `python` | `javascript`
- Applied in the `resolve_conflict()` function when multiple candidates score within 5 points of the top score
- Fallback: hub priority ordering if no explicit rule applies
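The precedence logic might be sketched like this. The tuple layout and the 5-point window follow the description above; the `Candidate` type, the `programming` hub name, and the two sample rules are assumptions, and the fallback is simplified to "highest score wins" instead of the hub priority ordering the project uses.

```rust
/// A scored classification candidate. Field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Candidate {
    pub hub: &'static str,
    pub sub_hub: &'static str,
    pub score: u32,
}

/// (losing_hub, losing_sub_hub, winning_hub, winning_sub_hub).
/// "programming" is a placeholder hub name for the language sub-hubs.
const CONFLICT_RESOLUTION: &[(&str, &str, &str, &str)] = &[
    ("programming", "python", "code-quality", "security"),
    ("programming", "rust", "code-quality", "security"),
];

/// Candidates within this many points of the top score contest it.
const TIE_WINDOW: u32 = 5;

pub fn resolve_conflict(candidates: &[Candidate]) -> Option<Candidate> {
    let top = *candidates.iter().max_by_key(|c| c.score)?;

    // Everything close enough to the top score to be considered a conflict.
    let contenders: Vec<Candidate> = candidates
        .iter()
        .copied()
        .filter(|c| top.score - c.score <= TIE_WINDOW)
        .collect();

    // If a rule names the top candidate as the loser and the winner is among
    // the contenders, the domain specialist takes over.
    for &(lose_hub, lose_sub, win_hub, win_sub) in CONFLICT_RESOLUTION {
        if top.hub == lose_hub && top.sub_hub == lose_sub {
            if let Some(winner) = contenders
                .iter()
                .find(|c| c.hub == win_hub && c.sub_hub == win_sub)
            {
                return Some(*winner);
            }
        }
    }

    // Simplified fallback: keep the highest score.
    Some(top)
}
```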
3. Confidence Boost for Path-Based Inference
Problem: Repository name signals (inferred from path) were scored 95%, allowing lower-confidence LLM results (80%) to potentially override them.
Solution: Raised the confidence score for path-based inference from 95% to 98%.
- A score of 98 is now treated as near-deterministic (same tier as the explicit `canonicalize_assignment` logic at 100)
- Only scores ≥ 100 can override it
- Prevents low-confidence LLM results from contradicting repository metadata
📊 Example Classification Flow
For a skill in lib/mukul975-anthropic-cybersecurity-skills/:
1. apply_rules() called
↓
2. canonicalize_assignment() → no match (0% confidence)
↓
3. infer_from_path() called
├─ infer_hub_from_repo_name() extracts "mukul975-anthropic-cybersecurity-skills"
├─ Finds substring match: "cybersecurity"
└─ Returns ("code-quality", "security") with 98% confidence
↓
4. ✓ Final assignment: code-quality / security
✗ LLM classification skipped (98% > 80% threshold)
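The flow above can be read as a priority chain that only reaches the LLM when rule-based inference is not confident enough. The sketch below models each stage as a closure so the ordering is explicit; the `Assignment` type, the `classify` signature, and the choice to prefer an LLM answer over a weaker rule-based one are assumptions, while the 80 threshold is taken from the example.

```rust
/// Result of a classification stage: hub, sub-hub, and a 0-100 confidence.
#[derive(Debug, Clone, PartialEq)]
pub struct Assignment {
    pub hub: String,
    pub sub_hub: String,
    pub confidence: u8,
}

/// Rule-based results must exceed this score for the LLM step to be skipped.
const LLM_THRESHOLD: u8 = 80;

/// Try each stage in priority order: canonical assignment, then path-based
/// inference, then (only if needed) the LLM.
pub fn classify<C, P, L>(canonical: C, from_path: P, llm: L) -> Option<Assignment>
where
    C: FnOnce() -> Option<Assignment>,
    P: FnOnce() -> Option<Assignment>,
    L: FnOnce() -> Option<Assignment>,
{
    let rule_based = canonical().or_else(from_path);
    match rule_based {
        // Confident rule-based hit: the LLM closure is never called.
        Some(found) if found.confidence > LLM_THRESHOLD => Some(found),
        // Otherwise ask the LLM and keep the weaker rule result as a fallback.
        weaker => llm().or(weaker),
    }
}
```

Passing the stages as closures keeps the LLM call lazy, so a 98-confidence path match costs no API request.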
🔧 Troubleshooting
Issue: Skills not aggregating or taking too long
Check repository state:
cargo run --release -- doctor
This validates all repositories, checks Git remotes, and reports cache status.
Increase parallelism:
export PARALLEL_JOBS=16
cargo run --release -- aggregate
Issue: Sync failing with "junction or symlink" errors
Cause: Existing junctions in sync target directories.
Solution: The sync command automatically skips existing junctions. If conflicts persist:
# Inspect sync targets
dir %USERPROFILE%\.claude\skills # Windows (cmd.exe)
ls ~/.claude/skills # macOS/Linux
# Remove conflicting junctions/symlinks manually
rmdir /s %USERPROFILE%\.claude\skills\[hub-name] # Windows (cmd.exe)
rm -rf ~/.claude/skills/[hub-name] # macOS/Linux
# Retry sync
cargo run --release -- sync
Issue: "Release gate" validation fails
Check output integrity:
cargo run --release -- release-gate
This validates:
- All `SKILL.md` files were processed
- No orphaned or missing references in `routing.csv`
- Deduplication stats match cache state
If failures are reported, re-run aggregation:
rm -rf skills-aggregated/
cargo run --release -- aggregate
📈 Performance Characteristics
| Operation | Time | Dependencies |
|---|---|---|
| First aggregate (120+ repos, 8000+ skills) | 10-20 min | Network speed, CPU count, LLM latency |
| Incremental aggregate (repos already cached) | 2-5 min | LLM classification speed (can skip with --skip-llm) |
| Sync to tools (10 tools, all hubs) | 30-60 sec | Disk I/O, junction creation speed |
| LLM classification (8000 skills) | 3-8 min | Batch size, LLM throughput |
Optimization Tips:
- Use `PARALLEL_JOBS=auto` for optimal CPU utilization
- Set `LLM_BATCH_SIZE=100` for faster LLM processing (requires more GPU/API quota)
- Run on an SSD for 2-3x faster repository cloning
- Use shallow clones (default) to reduce disk bandwidth
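One plausible way to interpret a `PARALLEL_JOBS` value, including `auto`, is sketched below. The function name, the fallback of 4 workers, and the handling of invalid values are assumptions; `std::thread::available_parallelism` is the standard library call for detecting usable CPU parallelism.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Fallback worker count when the value is unset or unusable (an assumed default).
const DEFAULT_JOBS: usize = 4;

/// Resolve a PARALLEL_JOBS value: "auto" maps to the detected parallelism,
/// a positive integer is used as-is, anything else falls back to the default.
pub fn resolve_parallel_jobs(raw: Option<&str>) -> usize {
    match raw.map(str::trim) {
        Some(value) if value.eq_ignore_ascii_case("auto") => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(DEFAULT_JOBS),
        Some(value) => value
            .parse::<usize>()
            .ok()
            .filter(|&jobs| jobs > 0)
            .unwrap_or(DEFAULT_JOBS),
        None => DEFAULT_JOBS,
    }
}
```

At startup this could be called as `resolve_parallel_jobs(std::env::var("PARALLEL_JOBS").ok().as_deref())`.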
Reporting Issues
When reporting bugs, include:
- Output of `cargo run --release -- doctor`
- Contents of `.skills-bank-cli-config.json` (redact sensitive URLs if needed)
- Error message and stack trace (if any)
- Steps to reproduce
Extending Classification
To add new domain keywords or refine sub-hub routing:
- Edit `src/classify.rs` → `CONFLICT_RESOLUTION` table or keyword rules
- Add test cases in `tests/`
- Run `cargo test` and `cargo run --release -- aggregate`
- Submit PR with classification examples
📄 License
MIT — See package.json for details.