gossipcat-ai
Health: Warning
- License — MIT
- Description — Repository has a description
- Active repo — Last push today
- Low visibility — Only 5 GitHub stars
Code: Failed
- fs module — File system access in apps/cli/src/chat.ts
- child_process — Shell command execution capability in apps/cli/src/clipboard.js
- fs module — File system access in apps/cli/src/config.js
Permissions: Passed
- Permissions — No dangerous permissions requested
This tool is an MCP server that orchestrates multiple AI agents from different providers to review your code in parallel, cross-check their findings, and prioritize agents with the best accuracy over time.
Security Assessment
The overall security risk is Medium. The primary concern is a failed security check involving shell command execution capabilities located in a clipboard script. While this might be intended for copying text to a user's clipboard, executing system commands inherently increases the risk profile and must be carefully vetted. The tool also features file system access for managing configurations and chat histories, which is standard but requires local permission awareness. No hardcoded secrets or overly broad permissions were detected. However, as a framework designed to dispatch code and data to external AI providers, it inherently makes external network requests to third-party APIs, meaning your codebase will leave your local environment.
Quality Assessment
The project is very new and currently has low visibility with only 5 GitHub stars, indicating minimal community trust and testing so far. It is actively maintained, evidenced by recent pushes and a clear, detailed description. It is properly open-source, distributed under the standard and permissive MIT license.
Verdict
Use with caution — the concept is useful, but the low community adoption and shell execution capabilities mean you should review the codebase yourself before integrating it into sensitive environments.
Multi-agent code review mesh — orchestrates AI agents from multiple providers to review code in parallel, cross-review each other's findings, and build accuracy profiles over time. Agents that catch real bugs get picked more often. Agents that hallucinate get deprioritized. MCP server for Claude Code, Cursor, and other IDEs.
agentic orchestration framework — agents that learn, adapt, and get better every round.
Quickstart · How It Works · Usage · For AI Agents · Dashboard · Configuration · Roadmap
What is Gossipcat?
Gossipcat is an MCP server that orchestrates multiple AI agents to review your code in parallel. Agents independently review, then cross-review each other's findings. Agreements are confirmed. Hallucinations are caught and penalized. Over time, each agent builds an accuracy profile — the system learns who to trust for what.
Why multi-agent?
| Without gossipcat | With gossipcat |
|---|---|
| One AI reviews your code — and hallucinates a finding you waste 20 minutes on | Multiple agents cross-check each other — hallucinations get caught before you see them |
| Every agent gets the same tasks regardless of track record | Dispatch weights route tasks to the agent with the best accuracy in that category |
| An agent keeps making the same class of mistake | Skill files are auto-generated from failure data and injected into future prompts |
| You don't know which agent to trust | Accuracy, uniqueness, and reliability scores are tracked per agent, per category |
Gossipcat is right for you if
- You want multiple AI models catching different classes of bugs
- You don't trust a single agent to catch everything
- You want agents to cross-check each other's findings before you act on them
- You want to know which agents are actually accurate vs. hallucinating
- You want agents that get better over time based on their track record
Features
- **Consensus Review:** 3+ agents review independently, then cross-review each other. Findings tagged as CONFIRMED, DISPUTED, or UNIQUE.
- **Adaptive Dispatch:** Agent accuracy is tracked per category. Dispatch weights adjust automatically — the best agent for the job gets picked.
- **Skill Development:** When an agent keeps failing in a category, targeted skills are generated from failure data and injected into future prompts.
- **Multi-Provider:** Mix Anthropic, Google, OpenAI, and OpenClaw agents in one team. Each brings different strengths. Native agents need no API key. 🦞 Lobster friendly.
- **Live Dashboard:** Real-time view of tasks, consensus reports, agent scores, and activity feed. Terminal Amber theme. WebSocket updates.
- **Agent Memory:** Per-agent cognitive memory persists across sessions. Agents remember past findings, patterns, and project context.
| Works with | Status |
|---|---|
| Claude Code | Full support |
| Cursor | Not yet |
| Windsurf | Not yet |
| VS Code | Not yet |

| Provider gateways | Status |
|---|---|
| HTTP gateway | ✅ |
| Local models | ✅ |
| Any `base_url` | ✅ |
How it works
dispatch ──→ parallel review ──→ cross-review ──→ consensus
│
┌─────┴─────┐
▼ ▼
signals skill development
│ │
▼ ▼
dispatch weights targeted prompts
(who gets picked) (agent improves)
| Step | What happens |
|---|---|
| Dispatch | Tasks routed to agents based on dispatch weights (accuracy history per category) |
| Parallel review | Agents work independently, each producing findings with confidence scores |
| Cross-review | Each agent reviews peers' findings: agree, disagree, unverified, or new finding |
| Consensus | Findings deduplicated and tagged: CONFIRMED, DISPUTED, UNVERIFIED, UNIQUE |
| Signals | You verify findings against code and record accuracy signals |
| Skill development | Agents with repeated failures get targeted skill files injected into future prompts |
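The consensus step in the table above can be sketched as a vote-counting function. This is an illustration only: the `Finding` and `Vote` shapes below are assumptions, not gossipcat's real data model; only the tag names come from the table.

```typescript
// Hypothetical consensus tagging — data shapes are invented for illustration.
type Vote = "agree" | "disagree" | "unverified";

interface Finding {
  id: string;
  author: string;
  votes: Vote[]; // one vote per peer reviewer in the cross-review round
}

type Tag = "CONFIRMED" | "DISPUTED" | "UNVERIFIED" | "UNIQUE";

function tagFinding(f: Finding): Tag {
  if (f.votes.length === 0) return "UNIQUE";           // no peer reviewed it
  if (f.votes.includes("disagree")) return "DISPUTED"; // any disagreement flags it
  if (f.votes.includes("agree")) return "CONFIRMED";   // at least one confirmation
  return "UNVERIFIED";                                 // peers could not verify
}

console.log(tagFinding({ id: "f1", author: "gemini-reviewer", votes: ["agree", "agree"] }));
// CONFIRMED
```

Only findings that survive this classification as CONFIRMED reach you without a verification step; everything else either gets re-checked (UNVERIFIED, UNIQUE) or flagged as contested (DISPUTED).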
Two types of agents
| | Native | Relay |
|---|---|---|
| Runs as | Claude Code subagent (`Agent()` tool) | WebSocket worker on relay server |
| Providers | Anthropic (Claude) | Google (Gemini), OpenAI, any provider |
| API key | None — uses your Claude Code subscription | Required per provider |
| Defined in | `.claude/agents/*.md` | `.gossip/config.json` |
| Consensus | Yes | Yes |
| Memory & Skills | Yes | Yes |
Both types participate equally in consensus, cross-review, and skill development.
Quickstart
Requirements: Node.js 22+
1. Clone and build
git clone https://github.com/ataberk-xyz/gossipcat-ai.git
cd gossipcat-ai
npm install
npm run build:mcp
npm install generates .mcp.json with the correct paths for your machine. build:mcp bundles the MCP server. Open Claude Code in this directory and gossipcat connects automatically.
To register globally (available in all projects):
claude mcp add gossipcat -s user -- node /absolute/path/to/gossipcat-ai/dist-mcp/mcp-server.js
2. Build the dashboard (optional)
npm run build:dashboard
Launches automatically on port 24420. Skip this if you don't need the visual dashboard.
3. API keys
Add env vars for the providers you want to use. Pass them with -e when registering, or set them in your shell environment.
| Provider | Env var | Notes |
|---|---|---|
| Google Gemini | `GOOGLE_API_KEY` | For Gemini relay agents |
| OpenAI | `OPENAI_API_KEY` | For OpenAI relay agents |
| Anthropic | — | Native agents use your Claude Code subscription — no key needed |
Example with Gemini:
claude mcp add gossipcat -s user -e GOOGLE_API_KEY=your-key -- node /path/to/gossipcat/dist-mcp/mcp-server.js
Keys are stored persistently and cross-platform:
- macOS — OS Keychain
- Linux — Secret Service (`secret-tool`)
- Windows / other — AES-256-GCM encrypted file
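The Windows / other fallback can be illustrated with Node's built-in crypto module. This is a minimal sketch of an AES-256-GCM round trip, assuming scrypt key derivation and a salt|iv|tag|ciphertext layout; gossipcat's actual key-file format is not documented here.

```typescript
// Sketch of an AES-256-GCM encrypted-file fallback — illustrative only.
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

function encrypt(secret: string, passphrase: string): Buffer {
  const salt = randomBytes(16);
  const key = scryptSync(passphrase, salt, 32); // derive a 256-bit key
  const iv = randomBytes(12);                   // standard 96-bit GCM nonce
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  // Store salt, nonce, and auth tag alongside the ciphertext.
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ct]);
}

function decrypt(blob: Buffer, passphrase: string): string {
  const salt = blob.subarray(0, 16);
  const iv = blob.subarray(16, 28);
  const tag = blob.subarray(28, 44);
  const ct = blob.subarray(44);
  const key = scryptSync(passphrase, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM verifies integrity as well as confidentiality
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}

const blob = encrypt("sk-example-api-key", "machine-passphrase");
console.log(decrypt(blob, "machine-passphrase")); // sk-example-api-key
```

The auth tag means a tampered key file fails to decrypt instead of silently yielding a corrupted key.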
4. Initialize your team
Start a Claude Code session in any project and ask Claude to set up your team:
"Set up a gossipcat team with a Gemini reviewer and a Sonnet implementer"
Claude Code calls gossip_setup() to create your .gossip/config.json and agent definitions. You choose the providers, models, and roles — gossipcat adapts to your setup.
Available presets: reviewer, implementer, tester, researcher, debugger, architect, security, designer, planner, devops, documenter
Use Cases
Build something — gossipcat picks the team
"I want to build a Stripe integration, set up a team for that"
"I'm adding real-time notifications — what agents do I need?"
"Set up a team for a TypeScript REST API project"
Describe what you're building. Gossipcat proposes an agent team tailored to the task — right presets, right skills, right mix of providers. You review the proposal and approve it. From that point on, agents dispatch automatically based on what your code touches.
Review code before committing
"Review the changes I just made"
"Do a consensus review on the auth module"
"Check my last 3 commits for bugs"
Three agents review your diff independently, then cross-check each other's findings. You get a report with CONFIRMED bugs (multiple agents agree), DISPUTED findings (agents disagree), and UNIQUE findings (only one agent found it). You only act on what's verified.
Catch security issues
"Security audit the payment handler"
"Check the login flow for vulnerabilities"
"Review the API endpoints for injection risks"
Dispatch your security-focused agents in parallel. Each reviews from a different angle — one checks OWASP vectors, another checks input validation, another checks auth logic. Findings that survive cross-review are real.
Research a codebase before building
"Research how the WebSocket connection lifecycle works before I touch it"
"Explain the dispatch pipeline — I need to add a new routing mode"
Agents read the code, trace call paths, and write a summary back to session memory. Next time you ask about the same area, they already know it.
Get a second opinion on your own review
"I think there's a race condition in this Map — check if I'm right"
"Verify whether this fix actually resolves the issue"
Describe what you think you're seeing. Agents check independently and either confirm or disprove it. Author self-review is optimistic by nature — this isn't.
Track which agents are actually reliable
"Show me agent scores"
"Which agent is best at security reviews?"
Every finding gets verified and turned into a signal. Accuracy, uniqueness, and reliability are tracked per agent. Over time, dispatch weights shift — the agents that keep catching real bugs get more work.
Improve a struggling agent
"Gemini keeps hallucinating about concurrency — fix it"
"Develop a skill for the reviewer's repeated type-safety misses"
Gossipcat generates a targeted skill file from the agent's failure data and injects it into future prompts. Signals penalize past mistakes; skills prevent future ones.
Usage
Once gossipcat is installed, you interact with it through natural language in Claude Code. The CLAUDE.md rules file (auto-generated on first boot) teaches Claude Code how to use the gossipcat tools — you just describe what you want.
What to say to Claude Code
| What you want | What to type |
|---|---|
| Review your latest changes | "Review my recent changes" |
| Deep review of critical code | "Do a consensus review on the auth module" |
| Catch security issues | "Security audit the payment handler" |
| Research before building | "How does the dispatch pipeline work?" |
| Get a second opinion | "Check if I'm right about this race condition" |
| Check which agents are performing well | "Show me agent scores" |
| Improve a struggling agent | "Develop a skill for the reviewer's type-safety misses" |
| Save context for next session | "Save session" |
Claude Code reads the dispatch rules from .claude/rules/gossipcat.md and automatically decides whether to use single-agent, parallel, or consensus mode based on what your change touches.
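A mode heuristic like the one described might look as follows. The path patterns and thresholds here are invented for illustration; the real rules live in `.claude/rules/gossipcat.md`.

```typescript
// Hypothetical sketch of dispatch-mode selection — not gossipcat's actual rules.
type Mode = "single" | "parallel" | "consensus";

// Risky areas (per the "When to use consensus" guidance): auth/sessions,
// persistence, the core dispatch pipeline. Patterns are illustrative.
const CONSENSUS_HINTS = [/auth/i, /session/i, /dispatch/i, /persist|storage/i];

function pickMode(changedFiles: string[]): Mode {
  if (changedFiles.some(f => CONSENSUS_HINTS.some(rx => rx.test(f)))) {
    return "consensus"; // risky areas get full cross-review
  }
  if (changedFiles.length > 3) return "parallel"; // broad diffs fan out
  return "single"; // small, low-risk changes go to one auto-selected agent
}

console.log(pickMode(["packages/orchestrator/src/dispatch.ts"])); // consensus
```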
Example session
You: "Review the changes I made to the relay server with the gossipcat team"
Claude: Dispatches 3 agents via gossip_dispatch(mode: "consensus")
→ sonnet-reviewer checks for security issues
→ gemini-reviewer checks for logic bugs
→ gemini-tester checks for edge cases
Cross-review round: agents review each other's findings
Consensus report:
✓ CONFIRMED: race condition in connection cleanup (3/3 agree)
✓ CONFIRMED: missing error handler on WebSocket close (2/3 agree)
? UNVERIFIED: potential memory leak in Map (1 found, others couldn't verify)
Claude verifies the UNVERIFIED finding against your code,
records accuracy signals, and presents the final report.
Under the hood
Claude Code translates your requests into gossipcat MCP tool calls:
gossip_run(agent_id: "auto", task: "...") → single-agent task
gossip_dispatch(mode: "consensus", tasks: [...]) → multi-agent review
gossip_collect(consensus: true) → cross-review + report
gossip_signals(action: "record", signals: [...]) → record accuracy
gossip_scores() → view agent performance
gossip_skills(action: "develop", ...) → improve struggling agents
You don't need to type these — Claude Code handles tool selection. But you can call them directly if you want fine-grained control.
For AI Agents
Currently supported orchestrator: Claude Code only. Cursor, Windsurf, and VS Code agents do not yet have native agent support and cannot run consensus cross-review. If you are reading this as a Claude Code instance with the gossipcat MCP server connected, this guide applies to you.
If you are a Claude Code instance reading this to understand how to use gossipcat, here is the complete operational guide.
You are the orchestrator. Gossipcat gives you tools to dispatch work to a team of specialized agents, collect their results, run cross-review consensus, and record accuracy signals. You do not implement code directly — you coordinate.
Step 1 — Check what you have
gossip_status()
This tells you: which agents are available, their providers/models, relay status, dashboard URL, and suggested workflows. Always call this first in a new session.
Step 2 — Dispatch work
Single agent (auto-selected):
gossip_run(agent_id: "auto", task: "Review the auth module for security issues")
The system picks the best agent for the task based on past accuracy. For relay agents, this returns the result directly. For native agents (Claude Code subagents), it returns ⚠️ EXECUTE NOW instructions — follow them immediately in the same response.
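Auto-selection can be pictured as an argmax over per-category dispatch weights. The profiles and numbers below are hypothetical; gossipcat's real scoring model is richer than a single weight per category.

```typescript
// Sketch of agent_id: "auto" — weights and categories are illustrative.
interface AgentProfile {
  id: string;
  weights: Record<string, number>; // dispatch weight per task category
}

function pickAgent(agents: AgentProfile[], category: string): string {
  let best = agents[0];
  for (const a of agents) {
    // Unknown categories default to 0 so new agents start neutral.
    if ((a.weights[category] ?? 0) > (best.weights[category] ?? 0)) best = a;
  }
  return best.id;
}

const team: AgentProfile[] = [
  { id: "sonnet-reviewer", weights: { security: 0.82, logic: 0.6 } },
  { id: "gemini-reviewer", weights: { security: 0.55, logic: 0.74 } },
];
console.log(pickAgent(team, "security")); // sonnet-reviewer
```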
Parallel (fan-out, no consensus):
gossip_dispatch(mode: "parallel", tasks: [
{ agent_id: "gemini-reviewer", task: "..." },
{ agent_id: "sonnet-reviewer", task: "..." }
])
Consensus (cross-review):
gossip_dispatch(mode: "consensus", tasks: [
{ agent_id: "gemini-reviewer", task: "..." },
{ agent_id: "sonnet-reviewer", task: "..." },
{ agent_id: "haiku-researcher", task: "..." }
])
Step 3 — Collect results
gossip_collect(task_ids: ["id1", "id2", "id3"], consensus: true)
With consensus: true, agents cross-review each other's findings. If native agents are in the round, gossip_collect returns ⚠️ EXECUTE NOW with prompts — dispatch those Agent() calls immediately, then relay each result via gossip_relay_cross_review.
Step 4 — Verify and record signals
After consensus, verify every UNVERIFIED finding against the actual code (grep/read the cited files). Then record signals:
gossip_signals(action: "record", signals: [{
signal: "unique_confirmed", // or "hallucination_caught", "agreement"
agent_id: "gemini-reviewer",
finding: "Race condition in task map at line 47",
finding_id: "<consensus_id>:<agent_id>:f1" // mandatory
}])
Signals update dispatch weights. Agents that hallucinate get penalized. Agents that catch real bugs get promoted.
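One way to picture that update is an exponential moving average over per-signal reward values. The `REWARD` mapping and `alpha` below are assumptions for illustration, not gossipcat's actual scoring math; the signal names come from the example above.

```typescript
// Hypothetical weight update from accuracy signals — illustrative only.
type Signal = "unique_confirmed" | "agreement" | "hallucination_caught";

const REWARD: Record<Signal, number> = {
  unique_confirmed: 1.0,     // caught a real bug nobody else saw
  agreement: 0.7,            // finding confirmed by peers
  hallucination_caught: 0.0, // penalized: the finding was false
};

function updateWeight(current: number, signal: Signal, alpha = 0.2): number {
  // Blend the new reward into the running weight; alpha controls reactivity.
  return (1 - alpha) * current + alpha * REWARD[signal];
}

let w = 0.5;
w = updateWeight(w, "hallucination_caught"); // 0.4  — weight drops
w = updateWeight(w, "unique_confirmed");     // 0.52 — recovers on a real catch
```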
Key rules
- Always follow ⚠️ EXECUTE NOW — dispatch those `Agent()` calls in the same response, do not wait.
- Never leave UNVERIFIED findings unexamined — read the code, confirm or deny, record the signal.
- `finding_id` is mandatory on every signal — format: `<consensus_id>:<agent_id>:fN`.
- Use `gossip_progress` after reconnect — if a consensus round was in flight, it re-surfaces the pending EXECUTE NOW prompts.
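A quick validator for the `finding_id` format might look like this. The helper is hypothetical, not part of the MCP API.

```typescript
// Hypothetical validator for <consensus_id>:<agent_id>:fN — illustration only.
const FINDING_ID = /^[^:]+:[^:]+:f\d+$/;

function isValidFindingId(id: string): boolean {
  return FINDING_ID.test(id);
}

console.log(isValidFindingId("c42:gemini-reviewer:f1")); // true
```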
When to use consensus
Use gossip_dispatch(mode: "consensus") when the change touches: shared mutable state, auth/sessions, file persistence, or the core dispatch pipeline. Use gossip_run for single-agent research, exploration, or review tasks that don't need cross-validation.
MCP Tools
These tools are called by the internal LLM (the orchestrator — Claude Code with gossipcat MCP). You don't invoke them manually; the orchestrator selects and calls them based on your requests.
| Tool | Purpose |
|---|---|
| `gossip_status` | System status, dashboard URL, agent list |
| `gossip_run` | Single-agent dispatch with auto agent selection |
| `gossip_dispatch` | Multi-agent dispatch: single, parallel, or consensus |
| `gossip_collect` | Collect results with optional cross-review synthesis |
| `gossip_relay` | Feed native agent results back into the pipeline |
| `gossip_relay_cross_review` | Feed native cross-review results into consensus |
| `gossip_plan` | Decompose a task into sub-tasks with agent assignments |
| `gossip_signals` | Record or retract accuracy signals |
| `gossip_scores` | View agent accuracy, uniqueness, and dispatch weights |
| `gossip_skills` | Develop, bind, unbind, or list per-agent skills |
| `gossip_setup` | Create or update the agent team |
| `gossip_session_save` | Save session context for the next session |
| `gossip_remember` | Search an agent's cognitive memory |
| `gossip_progress` | Check in-progress task status |
| `gossip_tools` | List all available tools |
| `gossip_update` | Check for or apply gossipcat updates from npm |
Dashboard
Build the dashboard (one time):
npm run build:dashboard
The dashboard launches automatically on port 24420 when gossipcat boots. Run gossip_status to get the URL and auth key:
Dashboard: http://localhost:24420/dashboard (key: a1b2c3...)
A new auth key is generated each session. Paste it when prompted to log in.
Built with React + Vite + shadcn/ui:
- Overview — agent cards with dispatch weights, recent tasks, finding metrics
- Team — all agents sorted by reliability
- Tasks — task history with agent, duration, and status
- Findings — consensus reports with CONFIRMED/DISPUTED/UNVERIFIED breakdowns
- Agent detail — per-agent memory, skills, scores, and task history
Live updates via WebSocket — every tool call pushes events to connected clients.
Architecture
gossipcat/
apps/
cli/ MCP server, native agent bridge, boot sequence
packages/
orchestrator/ Dispatch pipeline, consensus engine, memory, skills,
performance scoring, task graph, prompt assembly
relay/ WebSocket relay server, dashboard REST/WS API
dashboard-v2/ React + Vite frontend (Terminal Amber theme)
client/ Lightweight WebSocket client for relay connections
tools/ File/shell/git tool implementations for worker agents
types/ Shared TypeScript types and message protocol
OpenClaw Integration
Gossipcat supports OpenClaw as a provider gateway. OpenClaw runs locally and exposes an OpenAI-compatible HTTP API — gossipcat talks to it like any other relay agent, with your stored gateway token and a separate quota slot so OpenClaw rate limits never bleed into your OpenAI agents.
Wiring an OpenClaw agent
Store your gateway token once (macOS):
security add-generic-password -s gossip-mesh -a openclaw -w <your-gateway-token>
On Linux:
secret-tool store --label "Gossip Mesh openclaw" service gossip-mesh provider openclaw
# (enter token when prompted)
Then add it to your team:
"Add an OpenClaw reviewer to my team"
Or directly via gossip_setup:
gossip_setup(mode: "merge", agents: [{
id: "openclaw-agent",
type: "custom",
provider: "openclaw",
custom_model: "openclaw/default",
role: "reviewer",
skills: ["code_review", "typescript"]
}])
The gateway runs at http://127.0.0.1:18789/v1 by default. Override with base_url if yours is on a different port. Available models: openclaw, openclaw/default, openclaw/main.
Once added, the agent participates in consensus rounds, accumulates accuracy signals, and gets skill files generated from its failure patterns — same as any other agent in the mesh.
Configuration
Config is searched in order: .gossip/config.json > gossip.agents.json > gossip.agents.yaml.
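That search order amounts to a first-match lookup. The helper below is a sketch; `resolveConfig` is not a real gossipcat API, and only the three candidate paths come from the line above.

```typescript
// Sketch of first-match config resolution — illustrative helper, not gossipcat's API.
import { existsSync } from "node:fs";
import { join } from "node:path";

const CONFIG_CANDIDATES = [".gossip/config.json", "gossip.agents.json", "gossip.agents.yaml"];

function resolveConfig(
  root: string,
  exists: (p: string) => boolean = existsSync, // injectable for testing
): string | null {
  for (const rel of CONFIG_CANDIDATES) {
    const full = join(root, rel);
    if (exists(full)) return full; // first match wins
  }
  return null; // no config found — setup would be needed
}
```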
{
"main_agent": {
"provider": "google",
"model": "gemini-2.5-pro"
},
"utility_model": {
"provider": "native",
"model": "haiku"
},
"consensus_judge": {
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"native": true
},
"agents": {
"sonnet-reviewer": {
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"preset": "reviewer",
"skills": ["code_review", "security_audit", "typescript"],
"native": true
}
}
}
| Field | Description |
|---|---|
| `main_agent` | Internal tool LLM for routing, planning, and synthesis |
| `utility_model` | Memory compaction, gossip, lens generation |
| `consensus_judge` | Model for cross-review synthesis |
| `agents.<id>.provider` | `anthropic`, `google`, `openai`, `openclaw`, `local` |
| `agents.<id>.base_url` | Custom endpoint for openai/openclaw (e.g. `http://127.0.0.1:18789/v1`) |
| `agents.<id>.native` | `true` = runs via Claude Code `Agent()`, no API key |
| `agents.<id>.preset` | `reviewer`, `implementer`, `tester`, `researcher`, `debugger`, `architect`, `security`, `designer`, `planner`, `devops`, `documenter` |
| `agents.<id>.skills` | Skill labels for dispatch matching |
Host compatibility
Gossipcat auto-detects the host environment:
| Host | Native agents | Rules file |
|---|---|---|
| Claude Code | Yes | .claude/rules/gossipcat.md |
| Cursor | No | .cursor/rules/gossipcat.mdc |
| Windsurf | No | .windsurfrules |
| VS Code | No | — |
Roadmap
| Feature | Status |
|---|---|
| Consensus code review | ✅ Shipped |
| Adaptive dispatch weights | ✅ Shipped |
| Per-agent skill development | ✅ Shipped |
| Agent cognitive memory | ✅ Shipped |
| Live dashboard | ✅ Shipped |
| Cross-platform key storage | ✅ Shipped |
| OpenAI-compatible gateway support (`base_url`) | ✅ Shipped |
| OpenClaw provider integration 🦞 | ✅ Shipped |
| Full implementation workflow (agents write code) | 🔄 In progress |
| Dashboard enrichment (graphs, trends, session history) | ☐ Planned |
| Local Postgres migration (embedded Postgres for tasks/signals/consensus/memory — unblocks full task results, real queries, no more JSONL scans) | ☐ Planned |
| Local LLM support (Ollama) | ☐ Planned |
| Full Cursor support | ☐ Planned |
| Windsurf / VS Code parity | ☐ Planned |
| Standalone CLI (no IDE required) | ☐ Planned |
| CLI parity with MCP pipeline (gossip, task graph, agent memory in chat mode) | ☐ Planned |
License

MIT