gem-team
Health Passed
- License — License: Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 93 GitHub stars
Code Passed
- Code scan — Scanned 11 files during light audit, no dangerous patterns found
Permissions Passed
- Permissions — No dangerous permissions requested
This tool is a multi-agent orchestration framework designed to facilitate spec-driven development. It coordinates specialized AI agents to work in parallel, automating code generation, testing, and verification.
Security Assessment
The automated scan found no dangerous patterns across its 11 core files and noted that the tool does not request any risky permissions. However, as a development orchestration agent, it is built to read your local codebase, analyze files, and write or modify code. While the light scan found no hardcoded secrets or malicious payloads, granting an autonomous agent file system access always carries a baseline risk. Overall risk is rated as Low for local development, provided you trust the AI models it orchestrates.
Quality Assessment
The project shows strong health and maintenance signals. It is licensed under the permissive Apache-2.0 license, is actively maintained (the last push was very recent), and has earned 93 GitHub stars, which suggests growing community trust and usage.
Verdict
Safe to use, though developers should remain aware of standard file-system permissions when running automated code-generation tools.
Multi-agent orchestration framework for spec-driven development and automated verification.
💎 Gem Team
"Turning Model Quality into System Quality."
🚀 Quick Start
See all installation options below.
🤔 Why Gem Team?
- ⚡ 4x Faster — Wave-based parallel execution
- 🏆 Higher Quality — Specialized agents + TDD + verification gates + contract-first
- 🔒 Built-in Security — OWASP scanning, secrets/PII detection on critical tasks
- 👁️ Full Visibility — Real-time status, clear approval gates
- 🛡️ Resilient — Pre-mortem analysis, failure handling, auto-replanning
- ♻️ Pattern Reuse — Codebase pattern discovery prevents reinventing wheels
- 📏 Established Patterns — Uses library/framework conventions over custom implementations
- 🪞 Self-Correcting — All agents self-critique at 0.85 confidence threshold
- 🧠 Context Scaffolding — Maps large-scale dependencies before the model reads code, preventing context-loss in legacy repos
- ⚖️ Intent vs. Compliance — Shifts the burden from writing "perfect prompts" to enforcing strict, YAML-based approval gates
- 📋 Source Verified — Every factual claim cites its source; no guesswork
- ♿ Accessibility-First — WCAG compliance validated at spec and runtime layers
- 🔬 Smart Debugging — Root-cause analysis with stack trace parsing + confidence-scored fixes
- 🚀 Safe DevOps — Idempotent operations, health checks, mandatory approval gates
- 🔗 Traceable — Self-documenting IDs link requirements → tasks → tests → evidence
- 📚 Knowledge-Driven — Prioritized sources (PRD → codebase → AGENTS.md → Context7 → docs)
- 🛠️ Skills & Guidelines — Built-in skills and guidelines (web-design-guidelines)
- 📐 Spec-Driven — Multi-step refinement defines "what" before "how"
- 🌊 Wave-Based — Parallel agents with integration gates per wave
- 🗂️ Verified-Plan — Complex tasks: Plan → Verification → Critic
- 🔎 Final Review — Optional user-triggered comprehensive review of all changed files
- 🩺 Diagnose-then-Fix — gem-debugger diagnoses → gem-implementer fixes → re-verifies
- ⚠️ Pre-Mortem — Failure modes identified BEFORE execution
- 💬 Constructive Critique — gem-critic challenges assumptions, finds edge cases
- 📝 Contract-First — Contract tests written before implementation
- 📱 Mobile Agents — Native mobile implementation (React Native, Flutter) + iOS/Android testing
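The Self-Correcting item above mentions a 0.85 confidence threshold. As a rough illustration of what such a gate could look like, here is a minimal sketch. The `score` and `revise` callables and the `max_rounds` retry budget are hypothetical names introduced for this example; they are not taken from gem-team's agent files, which describe the behavior in prompts rather than code.

```python
from typing import Callable, Tuple

CONFIDENCE_THRESHOLD = 0.85  # threshold stated in the feature list


def self_critique_loop(
    draft: str,
    score: Callable[[str], float],   # hypothetical: returns confidence in [0, 1]
    revise: Callable[[str], str],    # hypothetical: returns an improved draft
    max_rounds: int = 3,             # hypothetical retry budget
) -> Tuple[str, float, bool]:
    """Revise a draft until its confidence meets the threshold.

    Returns (final_draft, confidence, accepted). When `accepted` is False,
    a caller would typically escalate instead of shipping the draft.
    """
    confidence = score(draft)
    for _ in range(max_rounds):
        if confidence >= CONFIDENCE_THRESHOLD:
            break
        draft = revise(draft)
        confidence = score(draft)
    return draft, confidence, confidence >= CONFIDENCE_THRESHOLD
```

The loop scores before revising, so a draft that already clears the threshold is returned unchanged.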
🚀 The "System-IQ" Multiplier
Raw reasoning isn't enough in single-pass chat. Gem-Team wraps your preferred LLM in a rigid, verification-first loop, fundamentally boosting its effective capability on SWE-benchmarks:
- For Small Models (e.g., Qwen 1.7B - 8B): The framework provides the "executive brain." Task decomposition and isolated 50-line chunks can as much as double their localized debugging success rates.
- For Reasoning Models (e.g., DeepSeek 3.2): TDD loops and parallel research stabilize their native file I/O fragility, yielding up to a +25% lift in execution reliability.
- For SOTA Models (e.g., GLM 5.1, Kimi K2.5): The gem-reviewer acts as a noise filter, pruning verbosity and enforcing strict PRD compliance to prevent over-engineering.
🔄 Core Workflow
Phase Flow: User Goal → Orchestrator → Discuss (medium|complex) → PRD → Research → Planning → Plan Review (medium|complex) → Execution → Summary → (Optional) Final Review
Error Handling: Diagnose-then-Fix loop (Debugger → Implementer → Re-verify)
The orchestrator auto-detects the current phase and routes accordingly. Any feedback or steering message triggers a re-plan.
| Condition | Phase | Outcome |
|---|---|---|
| No plan + simple | Research → Planning | Quick execution path |
| No plan + medium\|complex | Discuss → PRD → Research | Spec-driven approach |
| Plan + pending tasks | Execution | Wave-based implementation |
| Plan + feedback | Planning | Replan with steer |
| Plan + completed | Summary | User decision (feedback / final review / approve) |
| User requests final review | Final Review | Parallel review by gem-reviewer + gem-critic |
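The routing table above can be read as a small decision function. The sketch below is one possible rendering, not the orchestrator's actual logic: the `PlanState` fields are hypothetical, and the precedence between rows (for example, feedback winning over pending tasks) is an assumption, since the table does not specify an order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanState:
    """Hypothetical snapshot of what the orchestrator knows."""
    has_plan: bool
    complexity: str = "simple"          # "simple" | "medium" | "complex"
    pending_tasks: int = 0
    feedback: Optional[str] = None
    final_review_requested: bool = False


def detect_phase(state: PlanState) -> str:
    """Return the first phase to enter, following the routing table."""
    if state.final_review_requested:
        return "Final Review"
    if not state.has_plan:
        # Simple goals start at Research; medium/complex goals start at Discuss.
        return "Research" if state.complexity == "simple" else "Discuss"
    if state.feedback:
        return "Planning"               # replan with the steer message
    if state.pending_tasks > 0:
        return "Execution"
    return "Summary"
```

For example, `detect_phase(PlanState(has_plan=False, complexity="complex"))` selects the spec-driven path by returning `"Discuss"`.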
📦 Installation
| Method | Command / Link | Docs |
|---|---|---|
| Code | Install Now | Copilot Docs |
| Code Insiders | Install Now | Copilot Docs |
| Copilot CLI (Marketplace) | copilot plugin install gem-team@awesome-copilot | CLI Docs |
| Copilot CLI (Direct) | copilot plugin install gem-team@mubaidr | CLI Docs |
| APM (All AI coding agents) | apm install mubaidr/gem-team | APM Docs |
| Windsurf | codeium agent install mubaidr/gem-team | Windsurf Docs |
| Claude Code | claude plugin install mubaidr/gem-team | Claude Docs |
| OpenCode | opencode plugin install mubaidr/gem-team | OpenCode Docs |
| Manual (Copy agent files) | VS Code: ~/.vscode/agents/; VS Code Insiders: ~/.vscode-insiders/agents/; GitHub Copilot: ~/.github/copilot/agents/; GitHub Copilot (project): .github/plugin/agents/; Windsurf: ~/.windsurf/agents/; Claude: ~/.claude/agents/; Cursor: ~/.cursor/agents/; OpenCode: ~/.opencode/agents/ | — |
🏗️ Architecture
flowchart
USER["User Goal"]
subgraph ORCH["Orchestrator"]
detect["Phase Detection"]
end
subgraph PHASES
DISCUSS["🔹 Discuss"]
PRD["📋 PRD"]
RESEARCH["🔍 Research"]
PLANNING["📝 Planning"]
EXEC["⚙️ Execution"]
SUMMARY["📊 Summary"]
FINAL["🔎 Final Review"]
end
DIAG["🔬 Diagnose-then-Fix"]
USER --> detect
detect --> |"Simple"| RESEARCH
detect --> |"Medium|Complex"| DISCUSS
DISCUSS --> PRD
PRD --> RESEARCH
RESEARCH --> PLANNING
PLANNING --> |"Approved"| EXEC
PLANNING --> |"Feedback"| PLANNING
EXEC --> |"Failure"| DIAG
DIAG --> EXEC
EXEC --> SUMMARY
SUMMARY --> |"Review files"| FINAL
FINAL --> |"Clean"| SUMMARY
PLANNING -.-> |"critique"| critic
PLANNING -.-> |"review"| reviewer
EXEC --> |"parallel ≤4"| agents
EXEC --> |"post-wave (complex)"| critic
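The flowchart's "parallel ≤4" edge and the wave-based execution described earlier suggest that tasks are grouped into waves whose members have no unmet dependencies. A minimal sketch of that idea follows, assuming the plan is a simple mapping from task ID to the set of task IDs it depends on. The function name, the input shape, and the interpretation of the limit as a per-wave cap are assumptions made for illustration; the real plan.yaml schema is not shown in this README.

```python
from typing import Dict, List, Set

MAX_PARALLEL = 4  # taken from the flowchart's "parallel ≤4" label


def plan_waves(
    deps: Dict[str, Set[str]], max_parallel: int = MAX_PARALLEL
) -> List[List[str]]:
    """Group tasks into waves; a task runs only after all its dependencies.

    `deps` maps each task ID to the IDs it depends on (hypothetical format).
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: Set[str] = set()
    waves: List[List[str]] = []
    while remaining:
        # Tasks whose dependencies have all completed in earlier waves.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        wave = ready[:max_parallel]      # cap the wave at the parallel limit
        waves.append(wave)
        done.update(wave)
        for task in wave:
            del remaining[task]
    return waves


plan = {"a": set(), "b": set(), "c": {"a"}, "d": {"a", "b"}, "e": {"c", "d"}}
print(plan_waves(plan))  # [['a', 'b'], ['c', 'd'], ['e']]
```

An integration gate per wave, as the feature list describes, would sit between consecutive entries of the returned list.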
🤖 The Agent Team (Q2 2026 SOTA)
| Role | Description | Output | Recommended LLM |
|---|---|---|---|
| 🎯 ORCHESTRATOR | The team lead: Orchestrates research, planning, implementation, and verification | 📋 PRD, plan.yaml | Closed: GPT-5.4, Gemini 3.1 Pro, Claude Sonnet 4.6 Open: GLM-5, Kimi K2.5, Qwen3.5 |
| 🔍 RESEARCHER | Codebase exploration — patterns, dependencies, architecture discovery | 🔍 findings | Closed: Gemini 3.1 Pro, GPT-5.4, Claude Sonnet 4.6 Open: GLM-5, Qwen3.5-9B, DeepSeek-V3.2 |
| 📋 PLANNER | DAG-based execution plans — task decomposition, wave scheduling, risk analysis | 📄 plan.yaml | Closed: Gemini 3.1 Pro, Claude Sonnet 4.6, GPT-5.4 Open: Kimi K2.5, GLM-5, Qwen3.5 |
| 🔧 IMPLEMENTER | TDD code implementation — features, bugs, refactoring. Never reviews own work | 💻 code | Closed: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Open: DeepSeek-V3.2, GLM-5, Qwen3-Coder-Next |
| 🧪 BROWSER TESTER | E2E browser testing, UI/UX validation, visual regression with Playwright | 🧪 evidence | Closed: GPT-5.4, Claude Sonnet 4.6, Gemini 3.1 Flash Open: Llama 4 Maverick, Qwen3.5-Flash, MiniMax M2.7 |
| 🚀 DEVOPS | Infrastructure deployment, CI/CD pipelines, container management | 🌍 infra | Closed: GPT-5.4, Gemini 3.1 Pro, Claude Sonnet 4.6 Open: DeepSeek-V3.2, GLM-5, Qwen3.5 |
| 🛡️ REVIEWER | Zero-Hallucination Filter — Security auditing, code review, OWASP scanning, PRD compliance verification | 📊 review report | Closed: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Open: Kimi K2.5, GLM-5, DeepSeek-V3.2 |
| 📝 DOCUMENTATION | Technical documentation, README files, API docs, diagrams, walkthroughs | 📝 docs | Closed: Claude Sonnet 4.6, Gemini 3.1 Flash, GPT-5.4 Mini Open: Llama 4 Scout, Qwen3.5-9B, MiniMax M2.7 |
| 🔬 DEBUGGER | Root-cause analysis, stack trace diagnosis, regression bisection, error reproduction | 🔬 diagnosis | Closed: Gemini 3.1 Pro (Retrieval King), Claude Opus 4.6, GPT-5.4 Open: DeepSeek-V3.2, GLM-5, Qwen3-Coder-Next |
| 🎯 CRITIC | Challenges assumptions, finds edge cases, spots over-engineering and logic gaps | 💬 critique | Closed: Claude Sonnet 4.6, GPT-5.4, Gemini 3.1 Pro Open: Kimi K2.5, GLM-5, Qwen3.5 |
| ✂️ SIMPLIFIER | Refactoring specialist — removes dead code, reduces complexity, consolidates duplicates | ✂️ change log | Closed: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Open: DeepSeek-V3.2, GLM-5, Qwen3-Coder-Next |
| 🎨 DESIGNER | UI/UX design specialist — layouts, themes, color schemes, design systems, accessibility | 🎨 DESIGN.md | Closed: GPT-5.4, Gemini 3.1 Pro, Claude Sonnet 4.6 Open: Qwen3.5, GLM-5, MiniMax M2.7 |
| 📱 IMPLEMENTER-MOBILE | Mobile implementation — React Native, Expo, Flutter with TDD | 💻 code | Closed: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Open: DeepSeek-V3.2, GLM-5, Qwen3-Coder-Next |
| 📱 DESIGNER-MOBILE | Mobile UI/UX specialist — HIG, Material Design, safe areas, touch targets | 🎨 DESIGN.md | Closed: GPT-5.4, Gemini 3.1 Pro, Claude Sonnet 4.6 Open: Qwen3.5, GLM-5, MiniMax M2.7 |
| 📱 MOBILE TESTER | Mobile E2E testing — Detox, Maestro, iOS/Android simulators | 🧪 evidence | Closed: GPT-5.4, Claude Sonnet 4.6, Gemini 3.1 Flash Open: Llama 4 Maverick, Qwen3.5-Flash, MiniMax M2.7 |
Agent File Skeleton
Each .agent.md file follows this structure:
--- # Frontmatter: description, name, triggers
# Role # One-line identity
# Expertise # Core competencies
# Knowledge Sources # Prioritized reference list
# Workflow # Step-by-step execution phases
## 1. Initialize # Setup and context gathering
## 2. Analyze/Execute # Role-specific work
## N. Self-Critique # Confidence check (≥0.85)
## N+1. Handle Failure # Retry/escalate logic
## N+2. Output # JSON deliverable format
# Input Format # Expected JSON schema
# Output Format # Return JSON schema
# Rules
## Execution # Tool usage, batching, error handling
## Constitutional # IF-THEN decision rules
## Anti-Patterns # Behaviors to avoid
## Anti-Rationalization # Excuse → Rebuttal table
## Directives # Non-negotiable commands
All agents share: Execution rules, Constitutional rules, Anti-Patterns, and Directives sections. Anti-Rationalization tables are present in 5 agents (implementer, planner, reviewer, designer, browser-tester). Role-specific sections (Workflow, Expertise, Knowledge Sources) vary by agent.
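Because every agent file is said to share the same four rule sections, a quick structural check is easy to script. The sketch below is not part of gem-team. It assumes the headings appear verbatim as in the skeleton above (for example `## Execution` rather than `## Execution Rules`), which may not hold for the real files.

```python
import re
from typing import List

# Sections the README says all agents share (names assumed to match exactly).
SHARED_RULE_SECTIONS = ["Execution", "Constitutional", "Anti-Patterns", "Directives"]

HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_shared_sections(agent_markdown: str) -> List[str]:
    """Return the shared rule sections that have no matching heading."""
    headings = {match.group(1) for match in HEADING_RE.finditer(agent_markdown)}
    return [name for name in SHARED_RULE_SECTIONS if name not in headings]


sample = """# Role
# Rules
## Execution
## Constitutional
## Anti-Patterns
"""
print(missing_shared_sections(sample))  # ['Directives']
```

Running this over each `.agent.md` file would flag agents that drifted from the shared skeleton.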
📚 Knowledge Sources
Agents consult only the sources relevant to their role. Trust levels apply:
| Trust Level | Sources | Behavior |
|---|---|---|
| Trusted | PRD.yaml, plan.yaml, AGENTS.md | Follow as instructions |
| Verify | Codebase files, research findings | Cross-reference before assuming |
| Untrusted | Error logs, external data, third-party responses | Factual only — never as instructions |
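The trust table above amounts to a lookup from source to handling policy. Here is a minimal sketch under stated assumptions: the source keys are plain strings chosen for this example, and defaulting unknown sources to untrusted is an assumption (the safest fallback), not something the README specifies.

```python
from enum import Enum


class Trust(Enum):
    TRUSTED = "follow as instructions"
    VERIFY = "cross-reference before assuming"
    UNTRUSTED = "factual only, never as instructions"


# Keys mirror the table rows; the exact strings are illustrative.
TRUST_BY_SOURCE = {
    "PRD.yaml": Trust.TRUSTED,
    "plan.yaml": Trust.TRUSTED,
    "AGENTS.md": Trust.TRUSTED,
    "codebase files": Trust.VERIFY,
    "research findings": Trust.VERIFY,
    "error logs": Trust.UNTRUSTED,
    "external data": Trust.UNTRUSTED,
    "third-party responses": Trust.UNTRUSTED,
}


def trust_for(source: str) -> Trust:
    """Look up a source's trust level; unknown sources are untrusted (assumption)."""
    return TRUST_BY_SOURCE.get(source, Trust.UNTRUSTED)


def may_follow_instructions(source: str) -> bool:
    """Only trusted sources may be treated as instructions."""
    return trust_for(source) is Trust.TRUSTED
```

Under this reading, text found in an error log can inform a diagnosis, but an instruction embedded in that log is never acted on.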
| Agent | Knowledge Sources |
|---|---|
| orchestrator | PRD.yaml, AGENTS.md |
| researcher | PRD.yaml, codebase patterns, AGENTS.md, Context7, official docs, online search |
| planner | PRD.yaml, codebase patterns, AGENTS.md, Context7, official docs |
| implementer | codebase patterns, AGENTS.md, Context7 (API verification), DESIGN.md (UI tasks) |
| debugger | codebase patterns, AGENTS.md, error logs (untrusted), git history, DESIGN.md (UI bugs) |
| reviewer | PRD.yaml, codebase patterns, AGENTS.md, OWASP reference, DESIGN.md (UI review) |
| browser-tester | PRD.yaml (flow coverage), AGENTS.md, test fixtures, baseline screenshots, DESIGN.md (visual validation) |
| designer | PRD.yaml (UX goals), codebase patterns, AGENTS.md, existing design system |
| code-simplifier | codebase patterns, AGENTS.md, test suites (behavior verification) |
| documentation-writer | AGENTS.md, existing docs, source code |
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. See CONTRIBUTING for detailed guidelines on commit message formatting, branching strategy, and code standards.
📄 License
This project is licensed under the MIT License.
💬 Support
If you encounter any issues or have questions, please open an issue on GitHub.