xspace-agent

Security Audit
Warning
Health — Warning
  • License — NOASSERTION (no declared license)
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Low visibility — Only 7 GitHub stars
Code — Warning
  • fs module — File system access in agent-voice-chat/agent-registry.js
Permissions — Passed
  • Permissions — No dangerous permissions requested
Purpose
This TypeScript SDK and CLI tool allows developers to build AI agents that autonomously join, listen, and speak in X/Twitter Spaces using various speech-to-text, LLM, and text-to-speech providers.

Security Assessment
Overall risk is Medium. The tool requires highly sensitive credentials: your X/Twitter session cookies (`X_AUTH_TOKEN` and `X_CT0`) plus API keys for AI services. It inherently makes external network requests to the configured LLM and voice providers. The automated scan flagged file system access in the agent registry module; this is normal for managing local state but still warrants caution. No hardcoded secrets or dangerous shell-execution permissions were detected.

Quality Assessment
The project is new and very experimental, evidenced by a low community footprint of only 7 GitHub stars. On a positive note, it is highly active with repository updates pushed as recently as today. However, the software license is marked as `NOASSERTION`, meaning it lacks a clearly defined open-source license. This is a significant drawback for enterprise use, as it leaves the legal terms for modifying or distributing the code ambiguous.

Verdict
Use with caution — the tool is actively maintained and lacks critical malware indicators, but you must carefully handle the required sensitive account cookies and be aware of its undefined software license.
SUMMARY

TypeScript SDK and CLI for building AI agents that autonomously join, listen, and speak in X/Twitter Spaces. Supports multiple LLM providers (OpenAI, Claude, Groq), speech-to-text (Whisper, Deepgram), text-to-speech (ElevenLabs, OpenAI), multi-agent coordination, middleware pipelines, and a real-time admin dashboard. No Twitter API approval is needed.

README.md

X Space Agent


AI agents that join and talk in X/Twitter Spaces

Quick Start · Features · Structure · Architecture · Examples · Docs · Contributing


Multi-agent AI voice conversations in X/Twitter Spaces — real-time transcription, LLM responses, and voice synthesis

What is this?

X Space Agent is a TypeScript SDK that lets you build AI agents that autonomously join, listen, and speak in X/Twitter Spaces. Connect any LLM, any voice provider, and ship in minutes. No Twitter API approval needed.

import { XSpaceAgent } from 'xspace-agent'

const agent = new XSpaceAgent({
  auth: { token: process.env.X_AUTH_TOKEN!, ct0: process.env.X_CT0! },
  ai: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY! },
})

agent.on('transcription', ({ text }) => console.log('Heard:', text))
agent.on('response', ({ text }) => console.log('Said:', text))

await agent.join('https://x.com/i/spaces/YOUR_SPACE_ID')

Or skip the code entirely with the CLI:

npx xspace-agent join https://x.com/i/spaces/YOUR_SPACE_ID --provider openai

Features

  • 🎤 Multi-Provider LLM — OpenAI, Claude, Groq, or any custom API
  • 👥 Multi-Agent Teams — Run multiple personalities with turn management
  • 🔧 Middleware Pipeline — Hook into STT → LLM → TTS at any stage
  • 💻 Zero-Code CLI — npx xspace-agent join <url>, no SDK needed
  • 📊 Admin Dashboard — Web UI to monitor and control live agents
  • 🔷 TypeScript-First — Full type safety, autocomplete included
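The middleware pipeline can be sketched in a few lines of TypeScript. This is an illustrative model only: `MiddlewarePipeline`, `use`, and `run` are hypothetical names, not the SDK's real API; the point is the shape of the six before/after hooks around the stt, llm, and tts stages.

```typescript
// Hypothetical sketch: hooks registered at before:<stage> / after:<stage>
// run in registration order around each stage's processing function.
type Stage = 'stt' | 'llm' | 'tts'
type Hook = (text: string) => string

class MiddlewarePipeline {
  private hooks = new Map<string, Hook[]>()

  use(point: `before:${Stage}` | `after:${Stage}`, hook: Hook): this {
    const list = this.hooks.get(point) ?? []
    list.push(hook)
    this.hooks.set(point, list)
    return this
  }

  run(stage: Stage, process: Hook, input: string): string {
    const apply = (point: string, text: string) =>
      (this.hooks.get(point) ?? []).reduce((t, h) => h(t), text)
    const pre = apply(`before:${stage}`, input)
    return apply(`after:${stage}`, process(pre))
  }
}
```

A `before:llm` hook could redact unsafe content before the model sees it, and an `after:tts` hook could log audio durations, without either touching the core pipeline.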

Requirements

  • Node.js >= 18 (tested on 18, 20, 22)
  • pnpm >= 9 (for monorepo development) or npm/yarn for consuming the SDK
  • Chromium — bundled with Puppeteer, or provide your own via BROWSER_MODE=connect
  • X (Twitter) account — cookie-based auth (X_AUTH_TOKEN + X_CT0) or username/password
  • At least one AI provider key — OpenAI, Anthropic, or Groq

Quick Start

1. Install

npm install xspace-agent

2. Set environment variables

# .env
X_AUTH_TOKEN=your_x_auth_token
X_CT0=your_x_ct0_cookie
OPENAI_API_KEY=sk-...

Get X_AUTH_TOKEN and X_CT0 from your browser cookies after logging into X. Guide →
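Since a missing cookie only surfaces later as a failed login, it is worth failing fast at startup. A minimal sketch, assuming these three variables are required; the `missingEnv` helper is illustrative, not part of the SDK:

```typescript
// Illustrative helper: report which required variables are absent or empty.
const REQUIRED_VARS = ['X_AUTH_TOKEN', 'X_CT0', 'OPENAI_API_KEY'] as const

function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name])
}
```

Calling `missingEnv(process.env)` before constructing the agent lets you print one clear error instead of failing mid-join.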

3. Run

import { XSpaceAgent } from 'xspace-agent'

const agent = new XSpaceAgent({
  auth: { token: process.env.X_AUTH_TOKEN!, ct0: process.env.X_CT0! },
  ai: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY!,
    model: 'gpt-4o',
    systemPrompt: 'You are a helpful AI analyst. Be concise and data-driven.',
  },
  voice: {
    sttProvider: 'deepgram',
    ttsProvider: 'elevenlabs',
    voiceId: 'rachel',
  },
})

agent.on('transcription', ({ text, speaker }) => console.log(`${speaker}: ${text}`))
agent.on('response', ({ text }) => console.log(`Agent: ${text}`))

await agent.join('https://x.com/i/spaces/YOUR_SPACE_ID')

Or skip the code entirely with the CLI:

npx xspace-agent join https://x.com/i/spaces/YOUR_SPACE_ID --provider openai

Deploy

Deploy on Railway   Deploy to Render

Or with Docker:

docker run -e OPENAI_API_KEY=sk-... ghcr.io/nirholas/xspace-agent

Documentation

Full docs live in docs/. Key guides:

  • Architecture Overview — How the system fits together
  • Providers — LLM, STT, and TTS provider setup
  • Admin Panel — Web dashboard guide
  • Environment Variables — All config options
  • Multi-Space Support — Run agents across multiple Spaces
  • Agent Memory & RAG — Persistent memory and retrieval
  • TypeScript Migration — TypeScript usage guide

Project Structure

This is a pnpm monorepo with five publishable packages, a standalone voice agent, and supporting infrastructure.

Packages (npm-published)

packages/
  core/                → xspace-agent         The main SDK. Everything needed to build an AI agent
                         ├── agent.ts            Entry point — orchestrates browser, audio, LLM, turns
                         ├── team.ts             Multi-agent coordination (multiple AIs, one Space)
                         ├── audio/              PCM capture, VAD, silence detection, WAV encoding, TTS injection
                         ├── browser/            Puppeteer lifecycle, self-healing selector engine, DOM interaction
                         ├── fsm/                Finite state machine for agent & team lifecycles
                         ├── intelligence/       Speaker ID, topic tracking, sentiment, context management
                         ├── pipeline/           Provider factories — createLLM(), createSTT(), createTTS()
                         ├── turns/              Turn coordination, decision engine, interruption handling
                         ├── plugins/            Plugin system with 6 middleware hooks (before/after stt/llm/tts)
                         ├── providers/          Multi-provider router and cost tracking
                         ├── db/                 Drizzle ORM, migrations, repositories
                         ├── auth/               X/Twitter login, token validation, OAuth, SAML
                         ├── memory/             Conversation persistence, RAG, archiving
                         ├── observability/      Structured logging (Pino), tracing, metrics
                         └── __tests__/          Unit & E2E test suites with fixtures

  server/              → @xspace/server        Express + Socket.IO admin panel
                         ├── routes/             REST API endpoints
                         ├── events/             Socket.IO real-time event handlers
                         ├── middleware/          Auth, validation, CORS, rate limiting
                         ├── schemas/            Zod request/response validation
                         ├── personalities/      Preset agent configurations
                         └── public/             Admin dashboard HTML/CSS/JS

  cli/                 → @xspace/cli           Command-line tool
                         └── commands/           init, auth, join, start, dashboard

  widget/              → @xspace/widget        Embeddable voice chat widget (UMD + ESM builds)
                         ├── connection.ts       WebSocket connection handler
                         ├── theme.ts            Theme customization
                         └── ui/                 UI components

  create-xspace-agent/ → create-xspace-agent   Project scaffolding (like create-react-app)
                         └── templates/base/     Starter project template

Application Code

agent-voice-chat/      Standalone voice chat agent — separate from the monorepo
                       ├── server.js             Express + Socket.IO server (38KB)
                       ├── openapi.json           Full REST API spec
                       ├── agents.config.json     Agent configurations
                       ├── room-manager.js        Multi-room coordination
                       ├── knowledge/             Vector embeddings & RAG data
                       ├── memory/                Persistent conversation storage
                       ├── providers/             LLM, STT, TTS implementations
                       └── tests/                 Own test suite (vitest)

src/                   Legacy monolithic server — functional via `npm run dev`, being migrated
                       ├── server/                Express server, socket handlers, routes, metrics
                       ├── browser/               Puppeteer auth, launcher, orchestrator, selectors
                       ├── audio/                 Audio stream bridge
                       └── client/                Frontend initialization & provider configs

x-spaces/             Low-level Puppeteer automation scripts (JavaScript)
                       ├── index.js               Orchestration entry point
                       ├── audio-bridge.js         Audio capture & injection via CDP
                       ├── auth.js                 Browser cookie authentication
                       └── space-ui.js             X Spaces DOM interaction & selectors

Supporting Directories

examples/              12 runnable projects — basic-join, multi-agent-debate, discord-bridge,
                       custom-provider, middleware-pipeline, express-integration, scheduled-spaces,
                       chrome-connect, with-plugins, and more. Each has its own package.json.

docs/                  43 markdown files — architecture overview, API reference (REST + WebSocket),
                       provider guides, deployment (Docker, Railway, Render, VPS), troubleshooting,
                       plugin system, configuration, and internal design specs.

personalities/         Pre-built agent personalities with system prompts & voice preferences
                       └── presets/               agent-zero, comedian, crypto-degen, educator,
                                                  interviewer, tech-analyst, and more

providers/             AI provider wrappers (JS) — Claude, Groq, OpenAI Chat, OpenAI Realtime, STT, TTS

public/                Frontend assets — admin dashboard, agent builder, voice chat UI,
                       widget demos (React, Vue), landing pages

docker/                Monitoring stack — Prometheus scrape configs + Grafana dashboards

tasks/                 14 implementation specs & roadmap items (landing page, design system,
                       docs site, onboarding flow, admin dashboard v2, auth, rate limiting, etc.)

tests/                 Top-level integration & load tests

Examples

  • basic-join — Join a Space with an AI agent in ~15 lines
  • transcription-logger — Listen-only; save timestamped transcripts to file
  • multi-agent-debate — Two AIs (Bull vs Bear) debate live with round-robin turns
  • multi-agent — Multiple AI agents sharing a single Space
  • custom-provider — Use a local LLM (Ollama) or any custom API backend
  • middleware-pipeline — Content filtering, language detection, safety redaction, analytics hooks
  • express-integration — Embed the agent in an existing Express app with admin panel
  • scheduled-spaces — Join Spaces on a cron schedule with auto-leave timers
  • discord-bridge — Control the agent from Discord: join, leave, speak, stream transcriptions
  • chrome-connect — Connect to an existing Chrome instance instead of launching one
  • with-plugins — Extend agent behavior with custom plugins
  • plugins — Reusable plugin modules: analytics, moderation, webhooks

cd examples/basic-join
npm install
cp .env.example .env   # fill in your API keys
npm start
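The round-robin turn-taking used by the multi-agent-debate example can be illustrated with a tiny coordinator. This is a hypothetical sketch; the SDK's real turn engine (the `turns/` module) additionally handles interruptions and response pacing.

```typescript
// Illustrative round-robin coordinator: each call to next() hands the floor
// to the following agent, wrapping around at the end of the list.
class RoundRobin<T> {
  private index = 0
  constructor(private readonly agents: readonly T[]) {
    if (agents.length === 0) throw new Error('need at least one agent')
  }
  next(): T {
    const agent = this.agents[this.index]
    this.index = (this.index + 1) % this.agents.length
    return agent
  }
}
```

For a Bull-vs-Bear debate, `new RoundRobin(['Bull', 'Bear'])` alternates speakers on every turn.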

Architecture

                         X Space (live audio)
                                │
                    Puppeteer + Chrome DevTools Protocol
                                │
                    ┌───────────▼────────────┐
                    │   BrowserLifecycle      │  Auth → Join → Request Speaker → Speak
                    │   Self-healing CSS/     │  Retries selectors via CSS → text → aria
                    │   text/aria selectors   │
                    └───────────┬────────────┘
                                │  RTCPeerConnection audio hooks
                    ┌───────────▼────────────┐
                    │   AudioPipeline         │  PCM capture → VAD → silence detection
                    │                         │  → WAV encoding → TTS injection
                    └───────────┬────────────┘
                                │
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
   ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐
   │  STT        │      │  LLM        │      │  TTS        │
   │  Deepgram   │      │  OpenAI     │      │  ElevenLabs │
   │  Whisper    │      │  Claude     │      │  OpenAI TTS │
   │  (Groq/OAI) │      │  Groq       │      │  Browser    │
   └──────┬──────┘      │  Custom     │      └──────┬──────┘
          │              └──────┬──────┘             │
          │    before:stt       │    before:llm      │    before:tts
          │    after:stt        │    after:llm       │    after:tts
          │  ← middleware →     │  ← middleware →     │  ← middleware →
          │                     │                     │
   ┌──────▼─────────────────────▼─────────────────────▼──────┐
   │  Intelligence Layer                                      │
   │  Speaker ID · Topic tracking · Sentiment · Context mgmt  │
   └─────────────────────────┬───────────────────────────────┘
                             │
   ┌─────────────────────────▼───────────────────────────────┐
   │  Turn Management + FSM                                   │
   │  Decision engine · Interruption handling · Response pace  │
   │                                                          │
   │  idle → launching → authenticating → joining → listening │
   │                                          ↕               │
   │                                       speaking → leaving │
   └──────────────────────────────────────────────────────────┘

The agent connects to X Spaces via a headless Chromium browser, hooks into the WebRTC audio stream, and routes it through a fully configurable STT → LLM → TTS pipeline. Every stage supports middleware for logging, filtering, translation, content moderation, and more. The intelligence layer attributes speech to speakers, tracks topics, and manages conversation context. A finite state machine governs the full agent lifecycle.
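The lifecycle states in the diagram can be modeled as a small finite state machine. A hedged sketch: the state names mirror the diagram, but the transition table and `LifecycleFsm` class are illustrative, not the SDK's actual `fsm/` module.

```typescript
// Illustrative lifecycle FSM: transitions lists which states are reachable
// from each state; transition() rejects anything not in the table.
type AgentState =
  | 'idle' | 'launching' | 'authenticating'
  | 'joining' | 'listening' | 'speaking' | 'leaving'

const transitions: Record<AgentState, AgentState[]> = {
  idle: ['launching'],
  launching: ['authenticating'],
  authenticating: ['joining'],
  joining: ['listening'],
  listening: ['speaking', 'leaving'],  // listening ↔ speaking, or leave
  speaking: ['listening', 'leaving'],
  leaving: ['idle'],
}

class LifecycleFsm {
  constructor(public state: AgentState = 'idle') {}
  transition(next: AgentState): boolean {
    if (!transitions[this.state].includes(next)) return false
    this.state = next
    return true
  }
}
```

Encoding the lifecycle this way makes illegal jumps (e.g. speaking before joining) unrepresentable rather than merely unlikely.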

Providers

  • LLM — OpenAI (GPT-4o), Anthropic (Claude), Groq (Llama/Mixtral), any OpenAI-compatible API
  • Speech-to-Text — Deepgram (streaming), OpenAI Whisper, custom
  • Text-to-Speech — ElevenLabs, OpenAI TTS, custom
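Because any OpenAI-compatible endpoint works, pointing the agent at a local model (e.g. Ollama's OpenAI-compatible API) mostly means changing the base URL. A hypothetical helper showing the request shape; `buildChatRequest` is illustrative and the SDK's provider factory may differ:

```typescript
// Illustrative: build the URL and JSON body for an OpenAI-compatible
// /v1/chat/completions call, normalizing any trailing slash on the base URL.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

function buildChatRequest(baseUrl: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/v1/chat/completions`,
    body: { model, messages, stream: false },
  }
}
```

With a local Ollama instance this might look like `buildChatRequest('http://localhost:11434', 'llama3', messages)`, sent via `fetch` as a JSON POST.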

CLI Reference

xspace-agent init                  # Interactive setup wizard
xspace-agent auth                  # Authenticate with X
xspace-agent join <url>            # Join a Space
xspace-agent start                 # Start agent with admin panel
xspace-agent dashboard             # Launch web dashboard only

Used By

Be the first! Open a PR to add your project.

Community

Contributing

We welcome contributions! See CONTRIBUTING.md for setup instructions and guidelines.

Good first contributions:

  • Add a new AI provider (Mistral, Cohere, Together)
  • Add a new TTS provider (Cartesia, PlayHT)
  • Build an example project
  • Improve documentation

License

All Rights Reserved © 2026 nirholas
