platform-skills
A platform engineering handbook covering Kubernetes, OpenShift, Argo CD, Flux CD, AWS, Azure, Terraform, and GitHub Actions — with an optional Claude plugin layer for interactive guidance.
Platform Skills
A production-grade field handbook for platform, DevOps, SRE, and cloud engineers covering Kubernetes, Flux CD, Terraform, GitHub Actions, AWS, OPA/Rego, KEDA, Karpenter, supply chain security, Falco, observability, and more. Use it on GitHub, as a local reference, or with Claude, Codex, Cursor, and Copilot for interactive guidance with blast radius, validation steps, and rollback plans built in.
Works With
| Tool | What you get |
|---|---|
| Claude Code | 40 slash commands (/platform-skills:preflight, /platform-skills:kubernetes, /platform-skills:secrets, and more), interactive guidance, automatic activation on relevant files |
| Codex | Skill invocation with $platform-skills, loaded on demand in any Codex session |
| Cursor | Project rules for Chat and Agent — platform review and generation in every file context |
| GitHub Copilot plugin | Interactive slash commands in Copilot Chat — install once per user via copilot plugin install |
| GitHub Copilot (team) | Chat instructions committed to your repo — available to your whole team without individual installs |
| GitHub (no AI tool) | Browse references/ and examples/ directly — a standalone field handbook |
If this handbook saves you time, give it a star — it helps others find it.
Found a gap or a better pattern? Contributions are welcome — open an issue, improve a reference guide, or add an example.
Why Platform Skills
Platform teams keep rediscovering the same hard lessons: unclear ownership, unsafe IAM, weak Kubernetes defaults, drifting GitOps overlays, CI checks that run too late, and rollback plans that only appear after an incident. Platform Skills turns those lessons into reusable guidance for the tools engineers already use.
Use it when you need a second brain for production platform work:
- Review a Terraform, Helm, Kubernetes, Flux, GitHub Actions, or AWS change before it merges
- Generate platform assets with security, observability, validation, and rollback already considered
- Debug incidents with evidence-first troubleshooting instead of guesswork
- Give every developer the same platform engineering baseline in Claude, Codex, Cursor, and Copilot
Install In 60 Seconds
Clone once, then install the integration your team uses:
git clone https://github.com/nitinjain999/platform-skills.git
cd platform-skills
| Tool | Best for | Quick install |
|---|---|---|
| Claude Code | Interactive plugin workflows and slash commands | claude plugin marketplace add https://github.com/nitinjain999/platform-skills && claude plugin install platform-skills |
| GitHub Copilot plugin | Copilot Chat — interactive slash commands | copilot plugin marketplace add nitinjain999/platform-skills && copilot plugin install platform-skills@platform-skills |
| Codex | Local skill invocation with $platform-skills | ./install.sh --codex |
| Cursor | Project rules for Chat and Agent | ./install.sh --cursor --target ../your-project |
| GitHub Copilot (team) | Team-wide chat instructions committed to the repo | ./install.sh --copilot --target ../your-project |
| Everything | Local all-agent setup | ./install.sh --all --target ../your-project |
Need manual setup, global editor rules, or troubleshooting? See INSTALLATION.md.
Try It On Your Repo
See BEFORE_AFTER.md for side-by-side before/after examples across Kubernetes, Terraform, Flux CD, GitHub Actions, OPA/Rego, and PR triage. More copy-paste workflows in PROMPTS.md.
Use $platform-skills to review this Terraform change for IAM scope, replacement risk, validation, and rollback.
Review this Kubernetes Deployment for production readiness: securityContext, resources, probes, HPA, PDB, and NetworkPolicy.
My Flux Kustomization is stuck NotReady. Walk me from evidence to fix to rollback.
Generate a production-ready GitHub Actions workflow with OIDC, pinned actions, cache safety, and least privilege.
What is this?
This repository is a reference handbook for developers, DevOps engineers, SREs, cloud engineers, and platform teams. It is structured in independent layers:
- Handbook: references/ and examples/ are the main product. Every domain has a deep-dive guide and working example assets you can copy directly into your project. Use it on GitHub, from a local clone, or as a team knowledge base.
- Claude plugin: SKILL.md and .claude-plugin/marketplace.json add an optional routing layer so Claude surfaces the right section of the handbook when you ask platform engineering questions interactively.
- Codex skill: the repo root is a self-contained skill folder; SKILL.md provides routing, agents/openai.yaml provides Codex UI metadata, and references/ plus examples/ are loaded on demand.
- Cursor rules: .cursorrules and .cursor/rules/*.mdc give Cursor project-level and scoped file rules for platform engineering reviews and generation.
- Copilot instructions: .github/copilot-instructions.md lets teams commit the baseline into application and platform repositories.
All layers work independently. Agent integrations are optional.
Navigate
| I want to... | Go to |
|---|---|
| Get started in 5 minutes | QUICKSTART.md |
| Understand how AI agents and skills work | HOW_IT_WORKS.md |
| Full installation guide and troubleshooting | INSTALLATION.md |
| Read a domain guide | references/ |
| Copy a working example | examples/ |
| Copy prompts for Claude, Codex, Cursor, or Copilot | PROMPTS.md |
| Install as a Claude plugin | Installation |
| Install as a Codex skill | Installation |
| Add Cursor rules | Editor integrations |
| Learn how to use each slash command | COMMANDS.md |
| Set up VSCode, Copilot, or Cursor | EDITOR_INTEGRATIONS.md |
| Contribute a pattern | CONTRIBUTING.md |
Domains
| Domain | Reference guide | What it covers |
|---|---|---|
| Kubernetes | references/kubernetes.md | Cluster baseline, workload patterns, network policy, RBAC, pod security. Command: /platform-skills:kubernetes |
| 🛡️ Kyverno | references/kyverno.md | Validate/mutate/generate/verifyImages policies, Audit→Deny promotion, PolicyException, PolicyReport, kyverno-cli testing, PSP/Gatekeeper migration |
| OpenShift | references/openshift.md | SCC diagnosis, Routes TLS, OpenShift GitOps delivery, cluster upgrade validation. Command: /platform-skills:openshift |
| Argo CD | references/argocd.md | App-of-apps design, ApplicationSet, sync control, promotion flows |
| Flux CD | references/fluxcd.md | Monorepo structure, reconciliation, multi-tenancy, image automation. Deep-dive guides: Sources, HelmRelease, Kustomization, Notifications, Operator, Security, Troubleshooting, Migration, ResourceSets, MCP, Terraform bootstrap. Command: /platform-skills:fluxcd |
| AWS | references/aws.md | IAM least-privilege, IRSA, EKS, resource tagging, cost allocation |
| AWS CloudFront | references/aws-cloudfront.md | Distributions, OAC, cache policies, security headers, Lambda@Edge, CloudFront Functions, multi-account |
| AWS WAF | references/aws-waf.md | Web ACLs, managed rule groups, rate limiting, Bot Control, Firewall Manager, Shield Advanced |
| Azure | references/azure.md | Workload identity, AKS, RBAC, resource tagging, Azure Policy. Command: /platform-skills:azure |
| Terraform | references/terraform.md | Module design, state management, testing, CI/CD integration |
| Checkov | references/checkov.md | Bootstrap, static + plan scanning, multi-cloud provider detection, fix mode, custom checks |
| 🔍 Trivy | references/trivy.md | Image, fs, repo, SBOM, and cluster CVE scanning; severity gates; Trivy Operator via Flux |
| GitHub Actions | references/github-actions.md | Security hardening, OIDC, SHA pinning, reusable workflows. Command: /platform-skills:github-actions |
| Composite Actions | references/composite-actions.md | Composite action scaffolding, review, hardening, testing, release, private repo access |
| 🗺️ Platform model | references/platform-operating-model.md | Ownership boundaries, promotion flows, cross-tool design |
| 🔐 Secrets | references/secrets.md | External Secrets Operator, Sealed Secrets, provider setup, troubleshooting. Command: /platform-skills:secrets |
| Linkerd | references/linkerd.md | mTLS, proxy injection, AuthorizationPolicy, observability, multi-cluster |
| Linux networking | references/linux-networking.md | Linux admin, DNS, load balancing, VPC/VNet design, connectivity troubleshooting |
| 🧠 Platform Mindset | references/platform-mindset.md | DevEx, friction audits, RFC/ADR, incident comms, post-mortems, capacity planning |
| 🔒 Compliance | references/compliance.md | SOC 2 Trust Services Criteria in Terraform: IAM, encryption, detection, audit logging, backup, Checkov enforcement |
| Helm | references/helm.md | Chart scaffolding, values design, template patterns, security hardening, lint/validation pipeline, GitOps integration |
| 🔌 MCP | references/mcp.md | Model Context Protocol server/client development, TypeScript and Python SDKs, stdio/SSE transports, security, testing |
| ☁️ AWS MCP Profiles | references/aws-mcp-profiles.md | Multi-account AWS MCP server management: SSO, Granted, credential_process, profile discovery, VS Code and Claude Code config generation |
| Observability | references/observability.md | Structured logging, Prometheus metrics, OpenTelemetry tracing, Grafana dashboards, alerting rules, k6 load testing, capacity planning |
| 📝 Documentation | references/documentation.md | Docstrings (Google/NumPy/JSDoc), OpenAPI 3.1 specs, doc sites (MkDocs/TypeDoc), developer guides |
| Datadog | references/datadog.md | Agent Helm setup, APM instrumentation, log management, monitors/dashboards/SLOs as Terraform, pup CLI, Datadog Labs skills |
| 🤖 LLM Observability | references/llm-observability.md | Datadog LLMObs instrumentation (Python/Node.js), eval bootstrap, trace RCA, experiment analysis |
| Dynatrace | references/dynatrace.md | OneAgent Kubernetes Operator, custom metrics, SLOs, dashboards and alerting via Terraform provider |
| Conventional Commits | references/conventional-commits.md | Message structure, type classification, atomic staging, commitlint/husky/semantic-release tooling |
| 📋 OPA / Conftest | references/opa.md | Rego v1 syntax, rule types, unit tests, fmt/regal/verify validation pipeline, GitHub Actions integration |
| 🔍 PR Review | references/pr-review.md | Cost impact, environment drift, ownership gaps, SOC 2 compliance, deprecated API / version hygiene, rollback feasibility |
| 🧵 PR Comment Triage | commands/triage.md | /platform-skills:triage classifies PR comments, applies valid fixes, replies, and resolves review threads |
| ⚡ KEDA | references/keda.md | ScaledObject, ScaledJob, TriggerAuthentication, Prometheus/SQS/Kafka/Redis/Cron/HTTP/Azure scalers, scale-to-zero, IRSA, GitOps integration, troubleshooting. Command: /platform-skills:keda |
| ⚙️ Karpenter | references/karpenter.md | EKS node autoscaling: NodePool, EC2NodeClass, NodeClaim, Spot diversity, disruption budgets, ODCR, private clusters, Fargate coexistence, FinOps, CA migration, v0→v1 upgrades. Command: /platform-skills:karpenter |
| 🤖 Agent Self-Improvement | references/agent-self-improve.md | .learnings/ directory setup, LRN/ERR/FEAT entry lifecycle, WAL protocol, working buffer, VFM scoring, ADL decision logic, Six Operating Pillars, heartbeat, reverse prompting, proactive agent behavior. Command: /platform-skills:self-improve |
| 🔗 Supply Chain Security | references/supply-chain.md | Cosign keyless signing, Syft SBOM generation and attestation, Trivy/Grype CVE scanning with severity gates, SLSA Level 2 provenance, Kyverno ImageValidatingPolicy enforcement. Command: /platform-skills:supply-chain |
| 🦅 Runtime Security | references/runtime-security.md | Falco eBPF deployment on EKS/GKE, custom rule authoring, Falcosidekick alert routing, rule debugging, bridging Falco signals to Kyverno admission enforcement. Command: /platform-skills:runtime-security |
| 💥 Chaos Engineering | references/chaos.md | Litmus Chaos v3 and Chaos Mesh v2 fault injection, steady-state hypothesis (httpProbe/promProbe), blast radius scoping, GameDay workflow, recurring schedules, DORA feedback loop. Command: /platform-skills:chaos |
| 📊 DORA Metrics | references/dora.md | Deployment Frequency, Lead Time, Change Failure Rate, MTTR; GitHub Actions + Prometheus Pushgateway instrumentation, recording rules, Grafana dashboards, SaaS decision matrix, anti-pattern detection. Command: /platform-skills:dora |
| ✨ Awesome Docs | references/awesome-docs.md | Animated GitHub-safe Markdown document generation: any doc type (README, architecture guide, runbook, tutorial, API reference, RFC, post-mortem, or custom), 4 SVG patterns, convert existing docs, diff for staleness, audit quality, local preview, multi-platform export. Command: /platform-skills:awesome-docs |
| 🔄 Renovate | references/renovate.md | Dependency update automation: scan repo and generate renovate.json per ecosystem, private registry auth (ECR/GCR/ACR/Harbor/Helm OCI), custom regex managers for internal GitHub modules and private Terraform registries, pre-commit hook, GitHub Actions validation workflow. Command: /platform-skills:renovate |
| 🤖 Setup Agents | references/setup-agents.md | Multi-agent AI scaffold for any repo: ranked scan, interview-driven generate/upgrade/add/review modes, GitHub Copilot, Claude Code, Cursor, Codex, and Windsurf configs. Command: /platform-skills:setup-agents |
Core principles
Every pattern in this handbook follows the same ground rules:
- Production-first — patterns are battle-tested, not theoretical
- Root-cause over symptom — troubleshooting works backwards from evidence to fix
- Explicit blast radius — every risky operation documents scope and rollback
- Security by default — least-privilege IAM, restricted pod security, SHA-pinned actions
- Rollback plans are mandatory — if you cannot safely undo it, the guide is incomplete
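The "SHA-pinned actions" rule above can be checked mechanically. What follows is a minimal sketch and not part of the handbook: the function name find_unpinned_actions is hypothetical, and it assumes that workflow files are YAML, that references are unquoted, and that a pinned reference ends in a 40-character hexadecimal commit SHA (optionally followed by a comment). Local actions (./path) and docker:// references are skipped.

```shell
# Hypothetical helper: list `uses:` references that are not pinned to a full
# commit SHA. Returns 0 when everything is pinned, 1 when unpinned references
# are found, 2 when the directory does not exist.
find_unpinned_actions() {
  local dir="${1:-.github/workflows}"
  local hits

  if [ ! -d "$dir" ]; then
    printf 'not a directory: %s\n' "$dir" >&2
    return 2
  fi

  # 1) collect every `uses:` line, 2) drop local and docker references,
  # 3) drop references that end in @<40 hex chars> (assumed pinned form).
  hits="$(grep -rHnE --include='*.yml' --include='*.yaml' \
            '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^[:space:]]+' "$dir" \
          | grep -vE 'uses:[[:space:]]*(\./|docker://)' \
          | grep -vE '@[0-9a-f]{40}[[:space:]]*(#.*)?$' \
          || true)"

  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}
```

Called as find_unpinned_actions .github/workflows, it prints each offending line in file:line:content form, which makes it usable as a pre-merge check under the assumptions stated above.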
Troubleshooting structure
Every troubleshooting section in the handbook follows this consistent framework — from quick diagnosis to safe resolution:
| Step | What it answers |
|---|---|
| Symptom | Exact error and observable behavior |
| Evidence | Commands to run: logs, events, status |
| Hypothesis | Most likely root cause |
| Diagnosis | Commands that confirm or rule out the hypothesis |
| Fix | Specific change with justification |
| Validation | Post-fix verification steps |
| Prevention | How to avoid it next time |
| Rollback | Safe undo path if the fix makes things worse |
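When writing a new troubleshooting section, the eight steps above can serve as a checklist. The sketch below is illustrative only: the function name missing_troubleshooting_steps is hypothetical, and it assumes each step name starts a line of the write-up, optionally preceded by Markdown markers such as #, |, - or *.

```shell
# Hypothetical helper: print the framework steps that a write-up lacks.
# Returns 0 when all eight steps are present, 1 otherwise.
missing_troubleshooting_steps() {
  local file="$1"
  local step
  local missing=0

  for step in Symptom Evidence Hypothesis Diagnosis Fix Validation Prevention Rollback; do
    # The step must be a whole word at the start of a line, after optional
    # Markdown markers; "Fixes" does not count as "Fix".
    if ! grep -qiE "^[[:space:]#|*-]*${step}([^[:alnum:]]|\$)" "$file"; then
      printf '%s\n' "$step"
      missing=1
    fi
  done
  return "$missing"
}
```

Running missing_troubleshooting_steps on a draft prints one missing step per line, so an empty output means the draft covers the whole framework.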
Installation
Browse on GitHub
No installation needed. Navigate directly:
- examples/ — copy-paste examples for all domains
- references/ — deep-dive domain guides
- SKILL.md — core patterns and routing logic
Clone for local templates
git clone https://github.com/nitinjain999/platform-skills.git
cd platform-skills
# Copy examples directly into your project
cp -r examples/flux/basic-monorepo/* your-gitops-repo/
cp -r examples/terraform/eks-cluster/* your-terraform-modules/
cp examples/kubernetes/deployment-baseline.yaml your-k8s-manifests/
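Two details of the copy commands above are worth knowing: a shell glob such as examples/.../* does not match dotfiles (the compliance examples include .pre-commit-checkov.yaml), and cp fails if the destination directory does not exist. The following is a small sketch of a wrapper that handles both; copy_example is a hypothetical name, not a script shipped with the repository.

```shell
# Hypothetical helper: copy the full contents of an example directory,
# including dotfiles, creating the destination if needed.
copy_example() {
  local src="$1"
  local dest="$2"

  if [ ! -d "$src" ]; then
    printf 'no such example directory: %s\n' "$src" >&2
    return 1
  fi

  # "src/." copies the directory's contents (hidden files included)
  # rather than the directory itself.
  mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

For example, copy_example examples/flux/basic-monorepo your-gitops-repo would replace the first cp line above, with your-gitops-repo standing in for your own path.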
Install as a Claude plugin
The plugin adds interactive guidance on top of the handbook. Claude will reference the right section automatically when you ask platform engineering questions in your editor, terminal, or browser.
From marketplace:
claude plugin marketplace add https://github.com/nitinjain999/platform-skills
claude plugin install platform-skills
From local clone (for customisation):
git clone https://github.com/nitinjain999/platform-skills.git
cd platform-skills
claude plugin install .
Upgrade to latest version:
claude plugins marketplace update platform-skills
claude plugins remove platform-skills
claude plugins install platform-skills
Install as a Codex skill
Codex discovers skills from the local skills directory. Clone this repository as the skill folder so SKILL.md, agents/openai.yaml, references/, and examples/ stay together:
mkdir -p "${CODEX_HOME:-$HOME/.codex}/skills"
git clone https://github.com/nitinjain999/platform-skills.git "${CODEX_HOME:-$HOME/.codex}/skills/platform-skills"
Then ask Codex naturally:
Use $platform-skills to review this Terraform change for ownership, blast radius, validation, and rollback.
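If Codex does not pick up the skill, a quick sanity check is to confirm that the four pieces named above sit together in the skill folder. This is a sketch under that assumption only; check_codex_skill_layout is a hypothetical name, and the default path simply mirrors the clone target used in the commands above.

```shell
# Hypothetical helper: verify that SKILL.md, agents/openai.yaml, references/
# and examples/ all exist under the skill folder.
# Returns 0 when the layout is complete, 1 otherwise.
check_codex_skill_layout() {
  local root="${1:-${CODEX_HOME:-$HOME/.codex}/skills/platform-skills}"
  local item
  local status=0

  for item in SKILL.md agents/openai.yaml references examples; do
    if [ ! -e "$root/$item" ]; then
      printf 'missing: %s\n' "$root/$item"
      status=1
    fi
  done
  return "$status"
}
```

With no argument it checks the default location; pass a path to check a clone kept elsewhere.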
Install as a GitHub Copilot plugin
Install from the Copilot plugin marketplace to get platform-skills guidance in GitHub Copilot Chat:
copilot plugin marketplace add nitinjain999/platform-skills
copilot plugin install platform-skills@platform-skills
Verify:
copilot plugin list
# platform-skills enabled
Upgrade:
copilot plugin uninstall platform-skills
copilot plugin install platform-skills@platform-skills
Install Cursor rules
Copy the Cursor-native rules into your project so every developer gets the same platform guidance in Cursor Chat and Agent:
cp platform-skills/.cursorrules your-project/.cursorrules
mkdir -p your-project/.cursor/rules
cp platform-skills/.cursor/rules/*.mdc your-project/.cursor/rules/
For VSCode, Copilot, Cursor, and JetBrains setup — project level and global level — see EDITOR_INTEGRATIONS.md.
Repository structure
platform-skills/
├── references/ # Deep-dive guides — one per domain
│ ├── platform-operating-model.md
│ ├── kubernetes.md
│ ├── kyverno.md # Kyverno admission policies (v1.11.0)
│ ├── openshift.md
│ ├── argocd.md
│ ├── flux.md
│ ├── aws.md
│ ├── azure.md
│ ├── terraform.md
│ ├── github-actions.md
│ ├── secrets.md
│ ├── linkerd.md
│ ├── linux-networking.md
│ ├── platform-mindset.md
│ ├── compliance.md # SOC 2 controls in Terraform (v1.6.0)
│ ├── helm.md # Helm chart patterns, lint pipeline, values design
│ ├── pr-review.md # PR review: cost, drift, ownership, compliance, upgrade, rollback (v1.12.0)
│ ├── keda.md # KEDA event-driven autoscaling (v1.14.0)
│ ├── karpenter.md # Karpenter EKS node autoscaling (v1.29.0)
│ ├── llm-observability.md # Datadog LLMObs: instrumentation, evals, trace RCA (v1.20.0)
│ └── awesome-docs.md # Animated SVG doc generation — 4 patterns, GitHub-safe CSS (v1.21.0)
│
├── examples/ # Working examples and handbook snippets
│ ├── flux/basic-monorepo/ # Complete Flux CD monorepo structure
│ ├── kubernetes/ # Namespace, deployment, network policy, PDB
│ ├── kyverno/ # ValidatingPolicy, GeneratingPolicy examples + kyverno-cli test manifest (v1.11.0)
│ ├── openshift/ # Route, ResourceQuota, LimitRange
│ ├── argocd/app-of-apps/ # Root application manifest
│ ├── aws/iam/ # Least-privilege IAM policy examples
│ ├── azure/workload-identity/ # Managed identity + federated credential
│ ├── terraform/eks-cluster/ # Production EKS Terraform module
│ ├── github-actions/ # CI/CD, Flux sync, container build workflows
│ ├── helm/web-service/ # Production Helm chart: Deployment, HPA, PDB, NetworkPolicy, schema
│ ├── triage/ # PR comment triage scenarios and fixtures (v1.13.0)
│ ├── keda/ # ScaledObject, ScaledJob, TriggerAuthentication examples (v1.14.0)
│ ├── awesome-docs/ # Animated SVG templates: arch-flow, lifecycle-loop, field-carousel, timeline-phases (v1.21.0)
│ └── compliance/ # SOC 2 Terraform examples (v1.6.0)
│ ├── checkov-config.yaml # Checkov config grouped by SOC 2 criterion
│ ├── .pre-commit-checkov.yaml # Pre-commit hook template
│ ├── checkov-terraform-plan.sh # Plan-mode scan script
│ └── custom-checks/ # Custom check scaffold
│ ├── iam/ # CC6.1/CC6.2: IAM, IRSA, OIDC, SCPs
│ ├── logging/ # CC7.2: CloudTrail, Config, VPC flow logs
│ ├── network/ # CC6.6: WAF, security groups, flow logs
│ ├── encryption-data-services/ # CC6.7: DynamoDB, ECR, ElastiCache, OpenSearch, Kinesis, EFS, Redshift
│ ├── vulnerability/ # CC6.8: Inspector v2, ECR scanning, SSM patching
│ ├── detection/ # CC7.1: GuardDuty, CIS CloudWatch alarms, Security Hub
│ ├── incident-response/ # CC7.3: SNS, EventBridge, PagerDuty
│ └── backup/ # A1.2/A1.3: Backup Plan, vault lock, cross-region DR
│
├── SKILL.md # Agent skill routing and patterns
├── agents/openai.yaml # Codex skill UI metadata
├── .cursorrules # Cursor project-level rules
├── .cursor/rules/ # Cursor scoped file rules
├── .claude-plugin/marketplace.json # Marketplace metadata
├── .github/workflows/ # Validation and release automation
├── tests/validate-skill.sh # Skill structure consistency checks
└── renovate.json # Automated dependency updates
Roadmap
Current release: v1.36.0 — 40 commands, 40 domain reference guides, 50+ wiki pages.
Full version history is in CHANGELOG.md.
Planned
- GCP: landing zone, GKE, Workload Identity, and IAM patterns
- Istio: traffic management, mTLS, telemetry (counterpart to Linkerd domain)
- SOC 2 for Kubernetes: Kyverno policies mapped to TSC criteria, pod security admission, kube-bench CIS Benchmark integration
- OpenShift operator lifecycle: OLM, CatalogSource, operator upgrade patterns
- Argo CD ApplicationSet fleet patterns: cluster generators, matrix strategies, progressive rollout
- Multi-cloud networking: Transit Gateway, VNet peering, PrivateLink, cross-cloud DNS
Contributing
See CONTRIBUTING.md for how to propose new patterns, the development workflow, and release guidelines.
Related resources
- AWS EKS Best Practices Guide
- Argo CD documentation
- Flux CD documentation
- GitHub Actions security hardening
Sponsor
If Platform Skills saves you time, consider sponsoring to help keep it maintained and growing.
Every sponsor directly supports new domains, pattern updates, and the time spent validating every example in real environments.
Contributors ✨
Thanks go to these wonderful people (emoji key):
- Nitin Jain 💻 📖 🚧
- geetika-sv 💻 📖
This project follows the all-contributors specification. Contributions of any kind welcome!
License
Apache-2.0. See LICENSE for the full text and NOTICE for attribution.
If you create derivative works based on this project, retain the Apache 2.0 license text, existing copyright and attribution notices, and clearly mark any files you changed.