Connapse


Open-source, self-hosted knowledge backend for AI agents — hybrid search (vector + keyword), MCP server, 4 connectors, Docker-ready



Stop losing context between AI sessions. Give your agents persistent, searchable memory.


Connapse demo — upload a PDF, search with hybrid vector and keyword search, get results with source citations in seconds

Your AI agents forget everything between sessions. Connapse fixes that.

Every time you start a new conversation, your AI agent starts from zero — no memory of past research, no access to your documents, no accumulated knowledge. Connapse is an open-source knowledge backend that gives agents persistent, searchable memory. Upload documents or point it at your existing Amazon S3 buckets, Azure Blob Storage containers, or local filesystems. Agents query and build their own research corpus via 11 MCP tools, REST API, or CLI. Container-isolated, hybrid search (vector + keyword), self-hosted and private. Deploy in 60 seconds with Docker. Built on .NET 10.

🤖 AI Agent Integration — Claude queries and builds your knowledge base via MCP

Claude querying Connapse knowledge base via MCP server — asks about preventing cascading failures in microservices, gets structured answer with circuit breaker pattern details cited from distributed-systems-notes.md

AI agents query your knowledge base through the MCP server, receiving structured answers with source citations from your documents.

🎛️ Your Knowledge, Your Rules — Runtime configuration without restarting

Connapse settings panel — switching embedding providers, adjusting chunking parameters, and configuring search settings at runtime without restart

Switch embedding providers, tune chunking parameters, and configure search — all at runtime, without restarting.
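The chunking parameters in that panel typically mean a fixed-size split with overlap, so context is not lost at chunk boundaries. A minimal Python sketch of the idea — illustrative only; `chunk_size` and `overlap` are hypothetical names, not Connapse's actual setting keys:

```python
# Illustrative fixed-size chunking with overlap. This is NOT Connapse's
# implementation; it only shows what the tunable parameters control.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks; each chunk repeats the last
    `overlap` characters of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Overlap trades a little extra storage and embedding cost for better recall on queries whose answer straddles a chunk boundary.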


📦 Quick Start

git clone https://github.com/Destrayon/Connapse.git && cd Connapse && docker-compose up -d
# Open http://localhost:5001

Prerequisites

  • Docker and Docker Compose
  • .NET 10 SDK (only for local development or the .NET global-tool CLI install)

Run with Docker Compose

# Clone the repository
git clone https://github.com/Destrayon/Connapse.git
cd Connapse

# Set required auth environment variables (or use a .env file)
export CONNAPSE_ADMIN_EMAIL=admin@example.com
export CONNAPSE_ADMIN_PASSWORD=YourSecurePassword123!
export Identity__Jwt__Secret=$(openssl rand -base64 64)

# Start all services (PostgreSQL, MinIO, Web App)
docker-compose up -d

# Open http://localhost:5001 — log in with the admin credentials above

The first run will:

  1. Pull Docker images (~2-5 minutes)
  2. Initialize PostgreSQL with pgvector extension and run EF Core migrations
  3. Create MinIO buckets
  4. Seed the admin account (from env vars) and start the web application

Development Setup

# Start infrastructure only (database + object storage)
docker-compose up -d postgres minio

# Run the web app locally
dotnet run --project src/Connapse.Web

# Run all tests
dotnet test

# Run just unit tests
dotnet test --filter "Category=Unit"

Using the CLI

Install the CLI (choose one option):

# Option A: .NET Global Tool (requires .NET 10)
dotnet tool install -g Connapse.CLI

# Option B: Download native binary from GitHub Releases (no .NET required)
# https://github.com/Destrayon/Connapse/releases

Basic usage:

# Authenticate first
connapse auth login --url http://localhost:5001

# Create a container (project)
connapse container create my-project --description "My knowledge base"

# Upload files
connapse upload ./documents --container my-project

# Search
connapse search "your query" --container my-project

# Update to latest release (--pre to include alpha/pre-release builds)
connapse update
connapse update --pre

Using with Claude (MCP)

Connapse includes a Model Context Protocol (MCP) server for integration with Claude and any MCP client.

Setup: Create an agent API key via the web UI (Settings → Agent API Keys) or CLI (connapse auth agent-key create), then add the config snippet for your client:

Claude Code (CLI)

claude mcp add connapse --transport streamable-http http://localhost:5001/mcp --header "X-Agent-Api-Key: YOUR_API_KEY"

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "connapse": {
      "transport": "streamable-http",
      "url": "http://localhost:5001/mcp",
      "headers": {
        "X-Agent-Api-Key": "YOUR_API_KEY"
      }
    }
  }
}

VS Code / Cursor

Add to your .vscode/settings.json (VS Code) or Cursor MCP config:

{
  "mcp": {
    "servers": {
      "connapse": {
        "transport": "streamable-http",
        "url": "http://localhost:5001/mcp",
        "headers": {
          "X-Agent-Api-Key": "${input:connapseApiKey}"
        }
      }
    }
  }
}

VS Code will prompt for the API key on first use.

The MCP server exposes 11 tools:

  • container_create — Create a new container for organizing files
  • container_list — List all containers with document counts
  • container_delete — Delete a container
  • container_stats — Get container statistics (documents, chunks, storage, embeddings)
  • upload_file — Upload a single file to a container
  • bulk_upload — Upload up to 100 files in one operation
  • list_files — List files and folders at a path
  • get_document — Retrieve full parsed text content of a document
  • delete_file — Delete a single file from a container
  • bulk_delete — Delete up to 100 files in one operation
  • search_knowledge — Semantic, keyword, or hybrid search within a container

Full reference: See docs/mcp-tools.md for parameter tables, return formats, error cases, and usage examples.

Write guards: Amazon S3 and Azure Blob Storage containers are read-only (synced from source). Filesystem containers respect per-container permission flags. Upload and delete tools will return an error for containers that block writes.
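Under the hood, an MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request over the Streamable HTTP endpoint. A sketch of that envelope — the argument names (`query`, `container`) are illustrative assumptions; the authoritative parameter tables are in docs/mcp-tools.md:

```python
import json

# Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
# Tool name comes from the table above; argument names are assumptions.
def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

payload = build_tool_call(
    "search_knowledge",
    {"query": "rate limiting strategies", "container": "project-research"},
)
```

A client would POST this body to http://localhost:5001/mcp with the X-Agent-Api-Key header set, as in the client configs above.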

Example prompts — what to ask your agent

  • "Create a container called 'project-research' for my architecture notes"
  • "Upload all the PDFs in my downloads folder to the project-research container"
  • "Search my project-research container for information about rate limiting strategies"
  • "List all files in the /notes/ folder of my project-research container"
  • "Get the full text of distributed-systems-notes.md from project-research"
  • "Delete meeting-2026-03-14.md from project-research and upload this updated version"
  • "Delete all files in the /drafts/ folder of project-research"
  • "How many documents and chunks are in my project-research container?"

Troubleshooting

Connection refused on localhost:5001 — Docker not running or port conflict. Check docker compose ps and docker compose logs web.

401 Unauthorized / API key not working — Verify the key in Settings → Agent API Keys. Keys are shown once at creation.

Tools not appearing in Claude — Restart your MCP client after config changes. Verify endpoint with curl http://localhost:5001/mcp.

Uploads failing or timing out — Check file type is in the allowlist. Max file size depends on server config.

Search returns no results — Documents need time to embed after upload. Check container stats for embedding progress.


🚀 Features

  • 🗂️ Container-Isolated Knowledge — Each project gets its own vector index, storage connector, and search configuration. No cross-contamination between projects, teams, or clients.
  • 🔍 Hybrid Search — Vector similarity + keyword full-text with configurable fusion (convex combination, DBSF, AutoCut). Get results that pure vector search misses.
  • 🧠 Multi-Provider AI — Swap between Ollama, OpenAI, Azure OpenAI, and Anthropic for both embeddings and LLM — at runtime, per container, without restarting.
  • 🔌 Index Your Existing Storage — Connect MinIO, local filesystem (live file watching), Amazon S3 (IAM auth), or Azure Blob Storage (managed identity). Your files stay where they are.
  • 🤖 4 Access Surfaces — Web UI, REST API, CLI (native binaries), and MCP server (11 tools for Claude). Built for humans, scripts, and AI agents equally.
  • 🔐 Enterprise Auth — Multi-tier RBAC (Cookie + OAuth 2.1 + PAT + JWT) with AWS IAM Identity Center and Azure AD identity linking. Cloud permissions are the source of truth.
  • 🐳 One-Command Deploy — Docker Compose with PostgreSQL + pgvector, MinIO, and optional Ollama. Structured audit logging and rate limiting built in.
More features:
  • 📄 Multi-Format Ingestion: PDF, Office documents, Markdown, plain text — parsed, chunked, and embedded automatically
  • ⚡ Real-Time Processing: Background ingestion with live progress updates via SignalR
  • 🎛️ Runtime Configuration: Change chunking strategy, embedding model, and search settings per container without restart
  • ☁️ Cloud Identity Linking: AWS IAM Identity Center (device auth flow) + Azure AD (OAuth2+PKCE) with IAM-derived scope enforcement
  • 👥 Invite-Only Access: Admin-controlled user registration with four roles (Admin / Editor / Viewer / Agent)
  • 🤖 Agent Management: Dedicated agent entities with API key lifecycle, scoped permissions, and audit trails
  • 📋 Audit Logging: Structured audit trail for uploads, deletes, container operations, and auth events
  • 📦 CLI Distribution: Native self-contained binaries (Windows/Linux/macOS) and .NET global tool via NuGet
  • 🔄 Cross-Model Search: Switch embedding models mid-project — automatic Semantic→Hybrid fallback for legacy vectors
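The convex-combination fusion mentioned under Hybrid Search can be sketched as follows. This is an illustrative reimplementation, not Connapse's code: scores from each retriever are min-max normalized, then blended with a weight alpha (function names and the default weight are ours):

```python
# Convex-combination fusion of vector-similarity and keyword (BM25-style)
# scores. Illustrative only — not Connapse's actual implementation.
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize scores into [0, 1] so the two retrievers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def convex_fusion(vector: dict[str, float], keyword: dict[str, float],
                  alpha: float = 0.7) -> list[tuple[str, float]]:
    """Blend normalized scores: alpha weights the vector side, 1-alpha the keyword side."""
    v, k = normalize(vector), normalize(keyword)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0)
             for d in set(v) | set(k)}
    return sorted(fused.items(), key=lambda t: t[1], reverse=True)
```

Documents found by only one retriever still enter the fused ranking, which is how hybrid search recovers results that pure vector similarity misses.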

🎯 Who Is Connapse For?

  • AI agent developers who need a knowledge backend their agents can both query and build — upload research, curate a corpus, and search it via MCP or REST API
  • .NET / Azure teams who want a RAG platform that fits their existing stack and cloud identity
  • Enterprise teams who need project-isolated knowledge bases with proper RBAC and audit trails
  • Anyone tired of re-uploading files — point Connapse at your existing Amazon S3/Azure Blob Storage/filesystem storage

⚠️ Security Status (v0.3.x)

This project is in active development (v0.3.2) and approaching production-readiness.

v0.3.x adds cloud connector architecture with IAM-based access control, multi-provider embeddings and LLM support, cloud identity linking (AWS SSO + Azure AD), and rate limiting.

  • Authentication and authorization (v0.2.0)
  • Role-based access control (Admin / Editor / Viewer / Agent)
  • Audit logging
  • Cloud identity linking — AWS IAM Identity Center + Azure AD OAuth2+PKCE (v0.3.0)
  • IAM-derived scope enforcement — cloud permissions are source of truth (v0.3.0)
  • Rate limiting — built-in ASP.NET Core middleware with per-user and per-IP policies (v0.3.2)
  • ⚠️ Set a strong Identity__Jwt__Secret in production — see deployment guide

See SECURITY.md for the full security policy.


🏗️ Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                         Access Surfaces                              │
│  Web UI (Blazor)  │  REST API  │  CLI  │  MCP Server                │
└─────────────┬────────────────────────────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────────────────────────────┐
│                       Core Services Layer                            │
│  Document Store  │  Vector Store  │  Search  │  Ingestion           │
└─────────────┬────────────────────────────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────────────────────────────┐
│                        Connectors Layer                              │
│  MinIO  │  Filesystem  │  Amazon S3  │  Azure Blob Storage          │
└─────────────┬────────────────────────────────────────────────────────┘
              │
┌─────────────▼────────────────────────────────────────────────────────┐
│                        Infrastructure                                │
│  PostgreSQL+pgvector  │  MinIO (S3)  │  Ollama (optional)           │
└──────────────────────────────────────────────────────────────────────┘

Data Flow: Upload → Search

[Upload] → [Parse] → [Chunk] → [Embed] → [Store] → [Searchable]
              ↓
         [Metadata]
              ↓
        [Document Store]

Target: < 30 seconds from upload to searchable.
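The stages above can be sketched end to end. Everything here is a stub for illustration — Connapse's real parsers, chunking strategies, and embedding providers are pluggable, and none of these function names exist in its API:

```python
# Stage-by-stage sketch of the upload -> searchable data flow. All stubs;
# only the ordering of stages reflects the diagram above.
def parse(raw: bytes) -> str:
    # Real parsing handles PDF/Office/Markdown; reduced to a decode here.
    return raw.decode("utf-8", errors="replace")

def chunk(text: str, size: int = 200) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunks: list[str]) -> list[list[float]]:
    # Real embeddings come from Ollama / OpenAI / Azure OpenAI.
    return [[float(len(c))] for c in chunks]

def ingest(raw: bytes, store: list) -> int:
    """Parse, chunk, embed, and persist; returns the number of chunks stored."""
    chunks = chunk(parse(raw))
    store.extend(zip(chunks, embed(chunks)))  # stored rows are now searchable
    return len(chunks)
```

In Connapse this pipeline runs in the background after upload, with progress streamed to the UI via SignalR.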

Key Technologies:

  • Database: PostgreSQL 17 + pgvector for vector embeddings
  • Object Storage: MinIO (S3-compatible) for original files
  • Backend: ASP.NET Core 10 Minimal APIs
  • Frontend: Blazor Server (interactive mode)
  • Embeddings: Ollama (default), OpenAI, Azure OpenAI (configurable)
  • LLM: Ollama, OpenAI, Azure OpenAI, Anthropic (configurable)
  • Search: Hybrid vector + keyword with convex combination fusion
  • Connectors: MinIO, Filesystem, Amazon S3, Azure Blob Storage

📚 Documentation

  • docs/mcp-tools.md — full MCP tool reference (parameters, return formats, error cases, examples)
  • SECURITY.md — security policy and reporting
  • CONTRIBUTING.md — contribution guidelines and code conventions

🗺️ Roadmap

Connapse is pre-1.0. Major design work is tracked in Discussions.

v0.1.0 — Foundation (Complete)

  • ✅ Document ingestion pipeline (PDF, Office, Markdown, text)
  • ✅ Hybrid search (vector + keyword with convex combination fusion)
  • ✅ Container-based file browser with folders
  • ✅ Web UI, REST API, CLI, MCP server

v0.2.0 — Security & Auth (Complete)

  • ✅ Three-tier auth: Cookie + Personal Access Tokens + JWT (HS256)
  • ✅ Role-based access control (Admin / Editor / Viewer / Agent)
  • ✅ Invite-only user registration (admin-controlled)
  • ✅ First-class agent entities with API key lifecycle
  • ✅ Agent management UI + PAT management UI
  • ✅ Audit logging (uploads, deletes, container operations)
  • ✅ CLI auth commands (auth login, auth whoami, auth pat)
  • ✅ GitHub Actions release pipeline (native binaries + NuGet tool)
  • ✅ 256 passing tests (unit + integration)

v0.3.0 — Connector Architecture (Complete)

  • ✅ 4 connector types: MinIO, Filesystem (FileSystemWatcher), Amazon S3 (IAM-only), Azure Blob Storage (managed identity)
  • ✅ Per-container settings overrides (chunking, embedding, search, upload)
  • ✅ Cloud identity linking: AWS IAM Identity Center (device auth flow) + Azure AD (OAuth2+PKCE)
  • ✅ IAM-derived scope enforcement — cloud permissions are the source of truth
  • ✅ Multi-provider embeddings: Ollama, OpenAI, Azure OpenAI
  • ✅ Multi-provider LLM: Ollama, OpenAI, Azure OpenAI, Anthropic
  • ✅ Multi-dimension vector support with partial IVFFlat indexes per model
  • ✅ Cross-model search: automatic Semantic→Hybrid fallback for legacy vectors
  • ✅ Background sync: FileSystemWatcher for local, 5-min polling for cloud containers
  • ✅ Connection testing for all providers (Amazon S3, Azure Blob Storage, MinIO, LLM, embeddings, AWS SSO, Azure AD)
  • ✅ 457 passing tests (unit + integration)

v0.3.2 — Hardening & Polish (Complete)

  • ✅ Input validation hardening: filename length, path depth, control characters, search params, agent fields
  • ✅ Security fixes: empty API key auth bypass, path traversal, security headers middleware
  • ✅ Unified upload pipeline (IUploadService) shared by API and MCP
  • ✅ File type allowlist for uploads
  • ✅ Rate limiting middleware (per-user and per-IP)
  • ✅ Bulk MCP tools: bulk_upload and bulk_delete
  • ✅ CLI improvements: files commands, container stats, --pre updates, --help flags
  • ✅ Self-hosted fonts (no CDN dependencies)
  • ✅ Docker release package on ghcr.io

Future

  • v0.4.0: Communication connectors (Slack, Discord)
  • v0.5.0: Knowledge platform connectors (Notion, Confluence, GitHub)
  • v1.0.0: Production-ready stable release

❓ FAQ

Does Connapse require internet access? — No. Use Ollama for fully offline embeddings and search.

How many documents can it handle? — Thousands per container. Built on PostgreSQL + pgvector.

Which MCP clients work with Connapse? — Any client supporting Streamable HTTP transport — Claude Desktop, Claude Code, VS Code, Cursor, and others.

Is my data private? — Fully self-hosted. With Ollama, nothing leaves your machine. Cloud providers (OpenAI, Azure) are optional.

What embedding providers are supported? — Ollama (local), OpenAI, and Azure OpenAI. Switch at runtime without re-deploying.


🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick contribution checklist:

  • Fork the repo and create a feature branch
  • Follow code conventions in CONTRIBUTING.md
  • Write tests for new features (xUnit + FluentAssertions)
  • Ensure all tests pass: dotnet test
  • Update documentation if needed
  • Submit a pull request

Good first issues: Check issues labeled good-first-issue


📄 License

This project is licensed under the MIT License - see LICENSE for details.

You are free to:

  • ✅ Use commercially
  • ✅ Modify
  • ✅ Distribute
  • ✅ Sublicense
  • ✅ Use privately

The only requirement is to include the copyright notice and license in any substantial portions of the software.


💬 Support & Community

  • GitHub Issues — bug reports and feature requests
  • GitHub Discussions — questions and pre-1.0 design discussions

🙏 Acknowledgments

Built with PostgreSQL + pgvector, MinIO, ASP.NET Core / Blazor, and Ollama.


⭐ If you find this project useful, please star the repository to show your support!
