gossipcat-ai

agent
Guvenlik Denetimi
Basarisiz
Health Uyari
  • License — License: MIT
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 5 GitHub stars
Code Basarisiz
  • fs module — File system access in apps/cli/src/chat.ts
  • child_process — Shell command execution capability in apps/cli/src/clipboard.js
  • fs module — File system access in apps/cli/src/config.js
Permissions Gecti
  • Permissions — No dangerous permissions requested
Purpose
This tool is an MCP server that orchestrates multiple AI agents from different providers to review your code in parallel, cross-check their findings, and prioritize agents with the best accuracy over time.

Security Assessment
The overall security risk is Medium. The primary concern is a failed security check involving shell command execution capabilities located in a clipboard script. While this might be intended for copying text to a user's clipboard, executing system commands inherently increases the risk profile and must be carefully vetted. The tool also features file system access for managing configurations and chat histories, which is standard but requires local permission awareness. No hardcoded secrets or overly broad permissions were detected. However, as a framework designed to dispatch code and data to external AI providers, it inherently makes external network requests to third-party APIs, meaning your codebase will leave your local environment.

Quality Assessment
The project is very new and currently has low visibility with only 5 GitHub stars, indicating minimal community trust and testing so far. It is actively maintained, evidenced by recent pushes and a clear, detailed description. It is properly open-source, distributed under the standard and permissive MIT license.

Verdict
Use with caution — the concept is useful, but the low community adoption and shell execution capabilities mean you should review the codebase yourself before integrating it into sensitive environments.
SUMMARY

Multi-agent code review mesh — orchestrates AI agents from multiple providers to review code in parallel, cross-review each other's findings, and build accuracy profiles over time. Agents that catch real bugs get picked more often. Agents that hallucinate get deprioritized. MCP server for Claude Code, Cursor, and other IDEs.

README.md

Gossipcat

agentic orchestration framework — agents that learn, adapt, and get better every round.

MIT License Node 22+

Quickstart · How It Works · Usage · For AI Agents · Dashboard · Configuration · Roadmap


What is Gossipcat?

Gossipcat is an MCP server that orchestrates multiple AI agents to review your code in parallel. Agents independently review, then cross-review each other's findings. Agreements are confirmed. Hallucinations are caught and penalized. Over time, each agent builds an accuracy profile — the system learns who to trust for what.


Why multi-agent?

Without gossipcat With gossipcat
One AI reviews your code — and hallucinates a finding you waste 20 minutes on Multiple agents cross-check each other — hallucinations get caught before you see them
Every agent gets the same tasks regardless of track record Dispatch weights route tasks to the agent with the best accuracy in that category
An agent keeps making the same class of mistake Skill files are auto-generated from failure data and injected into future prompts
You don't know which agent to trust Accuracy, uniqueness, and reliability scores are tracked per agent, per category

Gossipcat is right for you if

  • You want multiple AI models catching different classes of bugs
  • You don't trust a single agent to catch everything
  • You want agents to cross-check each other's findings before you act on them
  • You want to know which agents are actually accurate vs. hallucinating
  • You want agents that get better over time based on their track record

Features

Consensus Review

3+ agents review independently, then cross-review each other. Findings tagged as CONFIRMED, DISPUTED, or UNIQUE.

Adaptive Dispatch

Agent accuracy is tracked per-category. Dispatch weights adjust automatically — the best agent for the job gets picked.

Skill Development

When an agent keeps failing in a category, targeted skills are generated from failure data and injected into future prompts.

Multi-Provider

Mix Anthropic, Google, OpenAI, and OpenClaw agents in one team. Each brings different strengths. Native agents need no API key. 🦞 Lobster friendly.

Live Dashboard

Real-time view of tasks, consensus reports, agent scores, and activity feed. Terminal Amber theme. WebSocket updates.

Agent Memory

Per-agent cognitive memory persists across sessions. Agents remember past findings, patterns, and project context.

Works
with
Claude Code
Full support
Cursor
Not yet
Windsurf
Not yet
VS Code
Not yet

Provider
gateways
OpenClaw
HTTP gateway ✅
Ollama
Local models ✅
OpenAI-compatible
Any base_url ✅

How it works

  dispatch ──→ parallel review ──→ cross-review ──→ consensus
                                                       │
                                                 ┌─────┴─────┐
                                                 ▼           ▼
                                             signals    skill development
                                                 │           │
                                                 ▼           ▼
                                          dispatch weights   targeted prompts
                                          (who gets picked)  (agent improves)
Step What happens
Dispatch Tasks routed to agents based on dispatch weights (accuracy history per category)
Parallel review Agents work independently, each producing findings with confidence scores
Cross-review Each agent reviews peers' findings: agree, disagree, unverified, or new finding
Consensus Findings deduplicated and tagged: CONFIRMED, DISPUTED, UNVERIFIED, UNIQUE
Signals You verify findings against code and record accuracy signals
Skill development Agents with repeated failures get targeted skill files injected into future prompts

Two types of agents

Native Relay
Runs as Claude Code subagent (Agent() tool) WebSocket worker on relay server
Providers Anthropic (Claude) Google (Gemini), OpenAI, any provider
API key None — uses your Claude Code subscription Required per provider
Defined in .claude/agents/*.md .gossip/config.json
Consensus Yes Yes
Memory & Skills Yes Yes

Both types participate equally in consensus, cross-review, and skill development.


Quickstart

Requirements: Node.js 22+

1. Clone and build

git clone https://github.com/ataberk-xyz/gossipcat-ai.git
cd gossipcat-ai
npm install
npm run build:mcp

npm install generates .mcp.json with the correct paths for your machine. build:mcp bundles the MCP server. Open Claude Code in this directory and gossipcat connects automatically.

To register globally (available in all projects):

claude mcp add gossipcat -s user -- node /absolute/path/to/gossipcat-ai/dist-mcp/mcp-server.js

2. Build the dashboard (optional)

npm run build:dashboard

Launches automatically on port 24420. Skip this if you don't need the visual dashboard.

3. API keys

Add env vars for the providers you want to use. Pass them with -e when registering, or set them in your shell environment.

Provider Env var Notes
Google Gemini GOOGLE_API_KEY For Gemini relay agents
OpenAI OPENAI_API_KEY For OpenAI relay agents
Anthropic Native agents use your Claude Code subscription — no key needed

Example with Gemini:

claude mcp add gossipcat -s user -e GOOGLE_API_KEY=your-key -- node /path/to/gossipcat/dist-mcp/mcp-server.js

Keys are stored persistently and cross-platform:

  • macOS — OS Keychain
  • Linux — Secret Service (secret-tool)
  • Windows / other — AES-256-GCM encrypted file

4. Initialize your team

Start a Claude Code session in any project and ask Claude to set up your team:

"Set up a gossipcat team with a Gemini reviewer and a Sonnet implementer"

Claude Code calls gossip_setup() to create your .gossip/config.json and agent definitions. You choose the providers, models, and roles — gossipcat adapts to your setup.

Available presets: reviewer, implementer, tester, researcher, debugger, architect, security, designer, planner, devops, documenter


Use Cases

Build something — gossipcat picks the team

"I want to build a Stripe integration, set up a team for that"
"I'm adding real-time notifications — what agents do I need?"
"Set up a team for a TypeScript REST API project"

Describe what you're building. Gossipcat proposes an agent team tailored to the task — right presets, right skills, right mix of providers. You review the proposal and approve it. From that point on, agents dispatch automatically based on what your code touches.


Review code before committing

"Review the changes I just made"
"Do a consensus review on the auth module"
"Check my last 3 commits for bugs"

Three agents review your diff independently, then cross-check each other's findings. You get a report with CONFIRMED bugs (multiple agents agree), DISPUTED findings (agents disagree), and UNIQUE findings (only one agent found it). You only act on what's verified.


Catch security issues

"Security audit the payment handler"
"Check the login flow for vulnerabilities"
"Review the API endpoints for injection risks"

Dispatch your security-focused agents in parallel. Each reviews from a different angle — one checks OWASP vectors, another checks input validation, another checks auth logic. Findings that survive cross-review are real.


Research a codebase before building

"Research how the WebSocket connection lifecycle works before I touch it"
"Explain the dispatch pipeline — I need to add a new routing mode"

Agents read the code, trace call paths, and write a summary back to session memory. Next time you ask about the same area, they already know it.


Get a second opinion on your own review

"I think there's a race condition in this Map — check if I'm right"
"Verify whether this fix actually resolves the issue"

Describe what you think you're seeing. Agents check independently and either confirm or disprove it. Author self-review is optimistic by nature — this isn't.


Track which agents are actually reliable

"Show me agent scores"
"Which agent is best at security reviews?"

Every finding gets verified and turned into a signal. Accuracy, uniqueness, and reliability are tracked per agent. Over time, dispatch weights shift — the agents that keep catching real bugs get more work.


Improve a struggling agent

"Gemini keeps hallucinating about concurrency — fix it"
"Develop a skill for the reviewer's repeated type-safety misses"

Gossipcat generates a targeted skill file from the agent's failure data and injects it into future prompts. Signals penalize past mistakes; skills prevent future ones.


Usage

Once gossipcat is installed, you interact with it through natural language in Claude Code. The CLAUDE.md rules file (auto-generated on first boot) teaches Claude Code how to use the gossipcat tools — you just describe what you want.

What to say to Claude Code

What you want What to type
Review your latest changes "Review my recent changes"
Deep review of critical code "Do a consensus review on the auth module"
Catch security issues "Security audit the payment handler"
Research before building "How does the dispatch pipeline work?"
Get a second opinion "Check if I'm right about this race condition"
Check which agents are performing well "Show me agent scores"
Improve a struggling agent "Develop a skill for the reviewer's type-safety misses"
Save context for next session "Save session"

Claude Code reads the dispatch rules from .claude/rules/gossipcat.md and automatically decides whether to use single-agent, parallel, or consensus mode based on what your change touches.

Example session

You:    "Review the changes I made to the relay server with the gossipcat team"

Claude: Dispatches 3 agents via gossip_dispatch(mode: "consensus")
        → sonnet-reviewer checks for security issues
        → gemini-reviewer checks for logic bugs
        → gemini-tester checks for edge cases

        Cross-review round: agents review each other's findings

        Consensus report:
        ✓ CONFIRMED: race condition in connection cleanup (3/3 agree)
        ✓ CONFIRMED: missing error handler on WebSocket close (2/3 agree)
        ? UNVERIFIED: potential memory leak in Map (1 found, others couldn't verify)

        Claude verifies the UNVERIFIED finding against your code,
        records accuracy signals, and presents the final report.

Under the hood

Claude Code translates your requests into gossipcat MCP tool calls:

gossip_run(agent_id: "auto", task: "...")        → single-agent task
gossip_dispatch(mode: "consensus", tasks: [...]) → multi-agent review
gossip_collect(consensus: true)                  → cross-review + report
gossip_signals(action: "record", signals: [...]) → record accuracy
gossip_scores()                                  → view agent performance
gossip_skills(action: "develop", ...)            → improve struggling agents

You don't need to type these — Claude Code handles tool selection. But you can call them directly if you want fine-grained control.


For AI Agents

Currently supported orchestrator: Claude Code only. Cursor, Windsurf, and VS Code agents do not yet have native agent support and cannot run consensus cross-review. If you are reading this as a Claude Code instance with the gossipcat MCP server connected, this guide applies to you.

If you are a Claude Code instance reading this to understand how to use gossipcat, here is the complete operational guide.

You are the orchestrator. Gossipcat gives you tools to dispatch work to a team of specialized agents, collect their results, run cross-review consensus, and record accuracy signals. You do not implement code directly — you coordinate.

Step 1 — Check what you have

gossip_status()

This tells you: which agents are available, their providers/models, relay status, dashboard URL, and suggested workflows. Always call this first in a new session.

Step 2 — Dispatch work

Single agent (auto-selected):

gossip_run(agent_id: "auto", task: "Review the auth module for security issues")

The system picks the best agent for the task based on past accuracy. For relay agents, this returns the result directly. For native agents (Claude Code subagents), it returns ⚠️ EXECUTE NOW instructions — follow them immediately in the same response.

Parallel (fan-out, no consensus):

gossip_dispatch(mode: "parallel", tasks: [
  { agent_id: "gemini-reviewer", task: "..." },
  { agent_id: "sonnet-reviewer", task: "..." }
])

Consensus (cross-review):

gossip_dispatch(mode: "consensus", tasks: [
  { agent_id: "gemini-reviewer", task: "..." },
  { agent_id: "sonnet-reviewer", task: "..." },
  { agent_id: "haiku-researcher", task: "..." }
])

Step 3 — Collect results

gossip_collect(task_ids: ["id1", "id2", "id3"], consensus: true)

With consensus: true, agents cross-review each other's findings. If native agents are in the round, gossip_collect returns ⚠️ EXECUTE NOW with prompts — dispatch those Agent() calls immediately, then relay each result via gossip_relay_cross_review.

Step 4 — Verify and record signals

After consensus, verify every UNVERIFIED finding against the actual code (grep/read the cited files). Then record signals:

gossip_signals(action: "record", signals: [{
  signal: "unique_confirmed",   // or "hallucination_caught", "agreement"
  agent_id: "gemini-reviewer",
  finding: "Race condition in task map at line 47",
  finding_id: "<consensus_id>:<agent_id>:f1"   // mandatory
}])

Signals update dispatch weights. Agents that hallucinate get penalized. Agents that catch real bugs get promoted.

Key rules

  • Always follow ⚠️ EXECUTE NOW — dispatch those Agent() calls in the same response, do not wait.
  • Never leave UNVERIFIED findings unexamined — read the code, confirm or deny, record the signal.
  • finding_id is mandatory on every signal — format: <consensus_id>:<agent_id>:fN.
  • Use gossip_progress after reconnect — if a consensus round was in flight, it re-surfaces the pending EXECUTE NOW prompts.

When to use consensus

Use gossip_dispatch(mode: "consensus") when the change touches: shared mutable state, auth/sessions, file persistence, or the core dispatch pipeline. Use gossip_run for single-agent research, exploration, or review tasks that don't need cross-validation.


MCP Tools

These tools are called by the internal LLM (the orchestrator — Claude Code with gossipcat MCP). You don't invoke them manually; the orchestrator selects and calls them based on your requests.

Tool Purpose
gossip_status System status, dashboard URL, agent list
gossip_run Single-agent dispatch with auto agent selection
gossip_dispatch Multi-agent dispatch: single, parallel, or consensus
gossip_collect Collect results with optional cross-review synthesis
gossip_relay Feed native agent results back into the pipeline
gossip_relay_cross_review Feed native cross-review results into consensus
gossip_plan Decompose task into sub-tasks with agent assignments
gossip_signals Record or retract accuracy signals
gossip_scores View agent accuracy, uniqueness, and dispatch weights
gossip_skills Develop, bind, unbind, or list per-agent skills
gossip_setup Create or update agent team
gossip_session_save Save session context for next session
gossip_remember Search an agent's cognitive memory
gossip_progress Check in-progress task status
gossip_tools List all available tools
gossip_update Check for or apply gossipcat updates from npm

Dashboard

Build the dashboard (one time):

npm run build:dashboard

The dashboard launches automatically on port 24420 when gossipcat boots. Run gossip_status to get the URL and auth key:

Dashboard: http://localhost:24420/dashboard (key: a1b2c3...)

A new auth key is generated each session. Paste it when prompted to log in.

Built with React + Vite + shadcn/ui:

  • Overview — agent cards with dispatch weights, recent tasks, finding metrics
  • Team — all agents sorted by reliability
  • Tasks — task history with agent, duration, and status
  • Findings — consensus reports with CONFIRMED/DISPUTED/UNVERIFIED breakdowns
  • Agent detail — per-agent memory, skills, scores, and task history

Live updates via WebSocket — every tool call pushes events to connected clients.


Architecture

gossipcat/
  apps/
    cli/                  MCP server, native agent bridge, boot sequence
  packages/
    orchestrator/         Dispatch pipeline, consensus engine, memory, skills,
                          performance scoring, task graph, prompt assembly
    relay/                WebSocket relay server, dashboard REST/WS API
    dashboard-v2/         React + Vite frontend (Terminal Amber theme)
    client/               Lightweight WebSocket client for relay connections
    tools/                File/shell/git tool implementations for worker agents
    types/                Shared TypeScript types and message protocol

OpenClaw Integration

OpenClaw Lobster friendly

Gossipcat supports OpenClaw as a provider gateway. OpenClaw runs locally and exposes an OpenAI-compatible HTTP API — gossipcat talks to it like any other relay agent, with your stored gateway token and a separate quota slot so OpenClaw rate limits never bleed into your OpenAI agents.

Wiring an OpenClaw agent

Store your gateway token once (macOS):

security add-generic-password -s gossip-mesh -a openclaw -w <your-gateway-token>

On Linux:

secret-tool store --label "Gossip Mesh openclaw" service gossip-mesh provider openclaw
# (enter token when prompted)

Then add it to your team:

"Add an OpenClaw reviewer to my team"

Or directly via gossip_setup:

gossip_setup(mode: "merge", agents: [{
  id: "openclaw-agent",
  type: "custom",
  provider: "openclaw",
  custom_model: "openclaw/default",
  role: "reviewer",
  skills: ["code_review", "typescript"]
}])

The gateway runs at http://127.0.0.1:18789/v1 by default. Override with base_url if yours is on a different port. Available models: openclaw, openclaw/default, openclaw/main.

Once added, the agent participates in consensus rounds, accumulates accuracy signals, and gets skill files generated from its failure patterns — same as any other agent in the mesh.


Configuration

Config is searched in order: .gossip/config.json > gossip.agents.json > gossip.agents.yaml.

{
  "main_agent": {
    "provider": "google",
    "model": "gemini-2.5-pro"
  },
  "utility_model": {
    "provider": "native",
    "model": "haiku"
  },
  "consensus_judge": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "native": true
  },
  "agents": {
    "sonnet-reviewer": {
      "provider": "anthropic",
      "model": "claude-sonnet-4-6",
      "preset": "reviewer",
      "skills": ["code_review", "security_audit", "typescript"],
      "native": true
    }
  }
}
Field Description
main_agent Internal tool LLM for routing, planning, and synthesis
utility_model Memory compaction, gossip, lens generation
consensus_judge Model for cross-review synthesis
agents.<id>.provider anthropic, google, openai, openclaw, local
agents.<id>.base_url Custom endpoint for openai/openclaw (e.g. http://127.0.0.1:18789/v1)
agents.<id>.native true = runs via Claude Code Agent(), no API key
agents.<id>.preset reviewer, implementer, tester, researcher, debugger, architect, security, designer, planner, devops, documenter
agents.<id>.skills Skill labels for dispatch matching

Host compatibility

Gossipcat auto-detects the host environment:

Host Native agents Rules file
Claude Code Yes .claude/rules/gossipcat.md
Cursor No .cursor/rules/gossipcat.mdc
Windsurf No .windsurfrules
VS Code No

Roadmap

Feature Status
Consensus code review ✅ Shipped
Adaptive dispatch weights ✅ Shipped
Per-agent skill development ✅ Shipped
Agent cognitive memory ✅ Shipped
Live dashboard ✅ Shipped
Cross-platform key storage ✅ Shipped
OpenAI-compatible gateway support (base_url) ✅ Shipped
OpenClaw provider integration 🦞 ✅ Shipped
Full implementation workflow (agents write code) 🔄 In progress
Dashboard enrichment (graphs, trends, session history) ☐ Planned
Local Postgres migration (embedded Postgres for tasks/signals/consensus/memory — unblocks full task results, real queries, no more JSONL scans) ☐ Planned
Local LLM support (Ollama) ☐ Planned
Full Cursor support ☐ Planned
Windsurf / VS Code parity ☐ Planned
Standalone CLI (no IDE required) ☐ Planned
CLI parity with MCP pipeline (gossip, task graph, agent memory in chat mode) ☐ Planned


Star History

Star History Chart

License

MIT

Yorumlar (0)

Sonuc bulunamadi