Awesome AI Agents

A curated list of AI agent frameworks, tools, platforms, research papers, and resources.

AI agents are autonomous systems that use LLMs to reason, plan, and take actions. This list tracks the rapidly evolving ecosystem.

Contributing: PRs welcome! Read the contribution guidelines first.

Frameworks & Libraries
Platforms & Low-Code
Agent Infrastructure
Evaluation & Testing
Safety & Governance
Research Papers
Tutorials & Courses
Use Cases & Case Studies
Community

Frameworks & Libraries

Multi-Agent Orchestration

AG2 — Community-led fork of AutoGen maintained by its original creators. Multi-agent framework with improved APIs.
Agent Swarm — Multi-agent orchestration for AI coding assistants (Claude Code, Codex, Gemini CLI). Lead/worker coordination with Docker isolation, compounding memory, and Slack/GitHub integration.
AgentField — Open-source control plane that makes AI agents callable as microservices. Routing, coordination, memory, async execution, and cryptographic audit trails. Supports Python, Go, and TypeScript.
AgentScope — Alibaba's production-ready agent framework with essential abstractions, built-in fine-tuning support, and a visual drag-and-drop interface.
Agno — Python framework and runtime for agentic software. Build and manage multi-agent systems at scale.
AutoGen — Microsoft's multi-agent conversation framework. Supports complex agent topologies.
CAMEL — Communicative agents for role-playing and multi-agent cooperation. Bills itself as the first LLM multi-agent framework.
CopilotKit — Frontend stack for building agent-powered apps with Generative UI. React, Angular, and mobile support; creators of the AG-UI Protocol (adopted by Google, LangChain, AWS, Microsoft, Mastra, and PydanticAI). MIT.
CrewAI — Role-based multi-agent framework. Agents with roles, goals, and backstories.
DeerFlow — ByteDance's open-source long-horizon SuperAgent harness. Orchestrates sub-agents, sandboxes, memory, tools, and skills for tasks spanning minutes to hours. Hit #1 GitHub Trending with v2.0 (Feb 2026).
dimos — Agentic operating system for physical space. Build multi-agent systems that control humanoids, quadrupeds, drones, and other hardware via natural language.
Google Agent Development Kit (ADK) — Google's open-source, code-first Python framework for building multi-agent systems with A2A support.
Harmonist — Portable AI agent orchestration with mechanical protocol enforcement. 186 agents, zero runtime dependencies.
LangGraph — Stateful agent workflows as graphs. Part of the LangChain ecosystem.
Mastra — TypeScript-first AI agent framework with workflows, RAG, and integrations.
MetaGPT — Multi-agent framework that mimics a software company with roles (PM, architect, engineer).
Microsoft Agent Framework — Framework for building, orchestrating and deploying multi-agent workflows (Python + .NET).
MiroFish — Concise and universal swarm intelligence engine for forecasting and prediction. Upload seed material, describe goals in natural language, get a detailed prediction report and an interactive simulation.
OpenAI Agents SDK — OpenAI's production framework for multi-agent orchestration with handoffs and guardrails.
Ruflo — Agent orchestration platform optimized for Claude. Features self-learning swarms, distributed intelligence, RAG integration, and native Claude Code/Codex integration. Formerly claude-flow.
Semantic Kernel — Microsoft's SDK for AI orchestration. Plugins, planners, and memory.
Strands Agents — AWS's model-driven, open-source SDK for building production AI agents in Python and TypeScript. Any model, any cloud, with MCP support and 23M+ monthly PyPI downloads. Apache-2.0.
Swarm — OpenAI's lightweight multi-agent framework (educational).

Single Agent

Claude Agent SDK — Anthropic's production SDK for building AI agents powered by Claude. Stateful sessions, tool execution, sandboxing, and streaming. Available in Python and TypeScript.
GenericAgent — Self-evolving agent that grows its own skill tree from ~3K lines of seed code. 9 atomic tools for full system control (browser, terminal, filesystem, screen vision) with automatic skill crystallization.
Haystack — End-to-end NLP framework with agent pipelines.
Instructor — Structured output from LLMs. Essential for reliable tool use.
LangChain — The most popular LLM application framework. Agents, chains, tools.
LlamaIndex — Data framework for LLM apps. Strong RAG and data agent support.
PydanticAI — GenAI agent framework, the Pydantic way. Type-safe and production-ready.
smolagents — Hugging Face's lightweight agent library. ~1,000 lines of focused code, easy to understand and extend.

Code Agents

Aider — AI pair programming in the terminal.
Claude Code — Anthropic's agentic coding tool. Terminal-based, strong at complex refactors and multi-file changes.
Codex — OpenAI's cloud-based coding agent. Runs tasks in sandboxed environments, integrates with GitHub.
Cursor — AI-first code editor with agent capabilities.
Devin — Cognition's autonomous software engineer. Full environment with browser, editor, and terminal.
Gemini CLI — Open-source AI agent bringing Gemini directly into your terminal.
GitHub Copilot — AI pair programmer with agent mode for multi-file edits, terminal commands, and autonomous task execution.
Goose — Block's open-source extensible AI agent for full-cycle development. Desktop app, CLI, and API with native MCP support, 70+ extensions, and LLM-agnostic design. Now under the Linux Foundation's Agentic AI Foundation (AAIF).
Kiro — AWS's spec-driven AI coding IDE. Three-phase Specify, Plan, Execute workflow.
mini-coding-agent — Sebastian Raschka's minimal, readable Python coding agent harness. Explains the core components of coding agents in a small, hackable codebase.
OpenClacky — Token-efficient open-source AI coding agent with prompt caching, 16 core tools, and skill extensions. MIT.
OpenCode — Open-source terminal-native AI coding agent by SST. 160K+ stars, 7.5M monthly developers. Works in terminal, IDE, or desktop with any LLM.
Open SWE — LangChain's open-source async cloud coding agent. Connects to GitHub repos, delegates tasks from issues via Slack or Linear.
OpenHands — AI software development agent (formerly OpenDevin).
OpenHands Software Agent SDK — Modular Python SDK for building code agents. Local or ephemeral workspaces, composable tools, powers OpenHands CLI and Cloud.
SWE-agent — Princeton's software engineering agent.
Windsurf — AI-native IDE by Codeium with agentic Cascade flows.

Personal AI Agents

CoPaw — Alibaba's open-source personal AI agent workstation. Supports multi-channel workflows, MCP skills, local/cloud LLMs, and persistent memory.
Hermes Agent — Nous Research's open-source self-improving personal AI agent. Closed learning loop, multi-platform gateway (Telegram, Discord, Slack, WhatsApp, Signal), MCP integration, and cron scheduling.
Mercury Agent — Soul-driven personal AI agent with permission-hardened tools, token budgets, and multi-channel access (CLI or Telegram). MIT.
nanobot — HKU Data Science Lab's ultra-lightweight personal AI agent. Keeps the core small and readable while shipping a full workbench: WebUI, sustained goals, MCP, image generation, multi-channel (Telegram, Signal, Matrix), and multi-provider model routing. Released v0.2.1 (June 2026).
OpenClaw — Open-source personal AI agent with tool use, browser control, messaging integration, and persistent memory.
OpenHuman — Local-first personal AI agent with 118 OAuth integrations, hierarchical memory tree, and TokenJuice compression. Runs entirely on-device.
QwenPaw — Alibaba's Qwen-powered personal AI agent workstation. Local or cloud deployment, multi-agent collaboration with sub-agent spawning, extensible skill system, and broad channel support (DingTalk, Feishu, WeChat, Discord, Telegram). MIT.
Trustclaw — ComposioHQ's self-hostable personal AI agent with vector memory, native Composio tool integrations, and a Telegram front-end. MIT.

Browser Agents

Browser Use — Control browsers with AI agents. Most popular browser automation framework.
Playwright MCP — Microsoft's MCP server for browser automation with Playwright.
Stagehand — AI-powered browser automation framework by Browserbase.
UI-TARS Desktop — ByteDance's multimodal AI agent stack for desktop automation.

Research Agents

autoresearch — Andrej Karpathy's open-source framework in which AI agents autonomously run single-GPU model training experiments overnight.
GPT Researcher — Autonomous agent for deep research on any topic using any LLM.
Perplexica — Open-source AI-powered answering engine (Perplexity alternative).

Platforms & Low-Code

Activepieces — Open-source AI workflow automation with 400+ MCP servers for agents.
Amazon Bedrock Agents — AWS managed agent service.
AnythingLLM — All-in-one desktop & Docker AI app with built-in RAG, agents, and MCP.
Anthropic Claude + Tool Use — Claude's function calling and agent capabilities.
Azure AI Foundry — Full-stack AI platform with agent capabilities.
Claude Managed Agents — Anthropic's hosted agent execution environment (public beta, April 2026). Stateful sessions, built-in sandboxing, and tool execution without managing your own infrastructure.
Composio — 1000+ toolkits, auth management, and sandboxed workbench for AI agents.
Dify — Open-source LLMOps platform with visual agent builder.
Google Vertex AI Agent Builder — Google Cloud's agent development platform.
MaxKB — Open-source platform for building enterprise-grade agents.
Microsoft Copilot Studio — Low-code agent builder. Integrates with M365, Dynamics, Power Platform.
n8n — Workflow automation with native AI agent capabilities and MCP support.
OpenAI Assistants API — OpenAI's managed agent platform with tools and retrieval. Deprecated in favor of the Responses API.
Relevance AI — No-code AI agent platform.
Trigger.dev — Build and deploy fully-managed AI agents and workflows.

Agent Infrastructure

Tool Protocols

Agent2Agent Protocol (A2A) — Google's open protocol for agent-to-agent communication and discovery. Linux Foundation project.
Context7 — MCP server for up-to-date code documentation for LLMs.
GitHub MCP Server — GitHub's official MCP server for AI agents.
Model Context Protocol (MCP) — Anthropic's standard for connecting AI to tools and data.
OpenAI Function Calling — De facto standard for LLM tool use.

Agent Skills & Tools

codegraph — Pre-indexed code knowledge graph for coding agents (Claude Code, Codex, Cursor, Gemini CLI). Fewer tokens, fewer tool calls, 100% local.
dotnet/skills — Microsoft .NET team's curated skills and custom agents for AI coding agents. Core .NET, EF data access, diagnostics, MSBuild, and NuGet plugins.
ECC — Agent harness performance system for Claude Code, Cursor, Codex, OpenCode, Gemini, and Zed. Ships 63 specialized agents, 251 skills, continuous learning with session memory hooks, and AgentShield security hardening. MIT.
Google Agents CLI — CLI and skill pack that turns any coding assistant (Claude Code, Codex, Gemini CLI, Cursor) into an expert at creating, evaluating, and deploying AI agents on Google Cloud.
NotFair — Open-source Claude Code agent skills for SEO and paid ads, connecting to live data via Google Ads MCP, Meta Ads MCP, Google Search Console MCP, and Google Analytics (GA4) MCP.
olcli — Overleaf CLI for AI coding agents. Sync, pull, push, and compile LaTeX projects from the command line.
PowerSkills — PowerShell automation toolkit for AI agents. Structured JSON control over Windows — Outlook, Edge browser, desktop, and system operations.
Superpowers — Agentic skills framework and software development methodology for coding agents. Enforces design-before-code, tests-before-features workflows. Works with Claude Code, Codex, Gemini CLI, OpenCode, Cursor, and GitHub Copilot.
Xquik — Agent skill and MCP server for X data workflows.

Memory & State

Headroom — Context compression layer for AI agents. Claims 60–95% reductions in tool outputs, logs, RAG chunks, and files before they reach the LLM, without loss of answer quality. Library, proxy, and MCP server modes; reversible compression; supports Claude Code, Codex, Cursor, and Aider.
Hindsight — Memory layer for AI agents that learns over time, with persistent, personalized recall.
Letta — Stateful agents with long-term memory (formerly MemGPT).
Mem0 — Universal memory layer for AI agents. Persistent, contextual.
Memori — Agent-native memory infrastructure. LLM-agnostic layer that turns agent execution and conversation into structured, persistent state for production systems.
ReMe — Alibaba's memory management kit for agents (formerly MemoryScope). File-based and vector-based memory with a dynamic procedural memory framework.
Zep — Long-term memory for AI assistants.

Monitoring & Observability

Arize Phoenix — ML & LLM observability.
Future AGI — Open-source, end-to-end, self-hostable platform for evaluating, observing, and improving LLM and AI agent apps. Tracing, evals, simulations, datasets, gateway, and guardrails in one stack.
Helicone — LLM observability and cost tracking.
Langfuse — Open-source LLM observability. Traces, evals, prompt management.
LangSmith — LangChain's debugging and monitoring platform.

Data Extraction

Crawl4AI — Open-source LLM-friendly web crawler. High-performance async crawling.
Firecrawl — Turn entire websites into LLM-ready markdown or structured data.

Vector Databases

Azure AI Search — Enterprise search with vector + hybrid capabilities.
ChromaDB — Lightweight embedding database.
Pinecone — Managed vector database.
Qdrant — High-performance vector search.
Weaviate — Open-source vector database.

Sandboxing & Execution

CubeSandbox — Tencent Cloud's instant, concurrent, secure, and lightweight Rust-based sandbox for AI agents. Sub-second cold start with strong isolation for tool execution and code interpreters.
Daytona — Secure and elastic infrastructure for running AI-generated code.
E2B — Cloud sandboxes for AI agents. Secure code execution environments.
forkd — fork() for AI agent microVMs. Spawn 100 children in ~100ms from a warm parent, BRANCH a live VM in ~150ms. KVM-isolated with snapshot copy-on-write. Apache-2.0.
GitHub Agentic Workflows — AI agents running within GitHub Actions. Markdown-based workflow definitions.
Mirage — Unified virtual filesystem for AI agents. Gives agents a consistent, sandboxed view across local, cloud, and ephemeral storage. Apache-2.0.
Moltworker — Cloudflare's open-source framework for deploying personal AI agents on Workers with sandboxed execution.
NemoClaw — NVIDIA's open-source reference stack for running always-on agents (OpenClaw, Hermes) more securely inside NVIDIA OpenShell sandboxes. Provides guided onboarding, hardened blueprints, routed inference, network policy, and lifecycle management via a single CLI. Announced at GTC Taipei (June 2026).
SmolVM — Open-source microVM sandbox infrastructure for code execution, browser use, and AI agents. macOS/Linux support, snapshotting, pause/resume, and persistent environments across turns. Apache-2.0.

Evaluation & Testing

AgentBench — Tsinghua's multi-dimensional agent benchmark.
AgentBoard — Multi-round agent evaluation platform.
GAIA — General AI Assistants benchmark from Meta AI, Hugging Face, and AutoGPT.
LangTest — Testing framework for delivering safe & effective language models.
RuLES — Benchmark for evaluating rule-following in language models.
SWE-bench — Benchmark for software engineering agents.
ToolBench — Benchmark for tool-use capabilities.
ToolEmu — LM-based emulation framework for identifying risks of agents with tool use (ICLR '24).
UQLM — Uncertainty quantification for LLMs. UQ-based hallucination detection.

Safety & Governance

Agent Governance Toolkit — Microsoft's runtime governance infrastructure for AI agents. Deterministic policy enforcement, zero-trust identity, execution sandboxing, and site reliability engineering (SRE). Covers all 10 risks in the OWASP Agentic Top 10 across Python, TypeScript, .NET, Rust, and Go.
Agentic Security — LLM vulnerability scanner and AI red teaming kit.
Anthropic Constitutional AI — Training method that aligns models through self-critique and AI feedback guided by a written constitution.
Azure AI Content Safety — Content moderation for AI outputs.
Deepsec — Vercel Labs' security harness that uses coding agents to find vulnerabilities in your codebase. Apache-2.0.
Guardrails AI — Validation framework for LLM outputs.
IronCurtain — Open-source security layer for autonomous AI agents. Runs agents in isolated VMs to prevent prompt injection and rogue behavior.
LangFair — Python library for LLM bias and fairness assessments.
LLM Guard — Security toolkit for LLM interactions.
NeMo Guardrails — NVIDIA's programmable guardrails.
PromptInject — Framework for quantitative analysis of LLM robustness to prompt attacks (Best Paper, NeurIPS ML Safety Workshop '22).
Rebuff — Prompt injection detection.
Safe RLHF — Constrained value alignment via safe reinforcement learning from human feedback.

Research Papers

Surveys & Overviews

The Rise and Potential of Large Language Model Based Agents (2023) — Comprehensive survey of LLM-based agents.
A Survey on Large Language Model based Autonomous Agents (2023) — Systematic review of agent architectures.
Agent AI: Surveying the Horizons of Multimodal Interaction (2024) — Microsoft Research survey on agent AI.

Agent Architectures

ReAct: Synergizing Reasoning and Acting (2023) — The foundational Reason + Act paradigm.
Toolformer (2023) — Teaching LLMs to use tools autonomously.
Voyager (2023) — Lifelong learning agent in Minecraft.
Generative Agents (2023) — Stanford's believable simulacra of human behavior.
Tree of Thoughts (2023) — Deliberate problem solving through exploration of reasoning paths.
Self-Refine (2023) — Iterative self-refinement with self-feedback.

Multi-Agent Systems

CAMEL (2023) — Communicative agents for role-playing.
MetaGPT (2023) — Multi-agent collaboration mimicking software companies.
ChatDev (2023) — Agents collaborating in a virtual software company.
PaperOrchestra (2026) — Google's multi-agent framework for automated AI research paper writing, converting unstructured pre-writing materials into submission-ready papers.

Safety & Evaluation

AgentBench (2023) — Evaluating LLMs as agents across 8 environments.
InjectAgent (2024) — Indirect prompt injection attacks on tool-integrated agents.
R-Judge (2024) — Benchmarking safety risk awareness for LLM agents.

Agent Training

Group-in-Group Policy Optimization for LLM Agent Training (2025) — RL-based training for LLM/VLM agents.

Tutorials & Courses

agent-rules-books — AGENTS.md rules and skills for AI coding agents (Codex, Cursor, Claude Code) inspired by Clean Code, Refactoring, DDD, Clean Architecture, and DDIA.
DeepLearning.AI: A2A Protocol — Short course on Google's Agent2Agent protocol.
DeepLearning.AI: Building Agentic RAG — Short course on agentic RAG patterns with LlamaIndex, taught by Jerry Liu.
Hugging Face: Building AI Agents — Open course on agent development.
LangChain Academy — Free courses on agents and RAG.
Microsoft: AI Agents for Beginners — 12 lessons to get started building AI agents.
Microsoft: AI Engineering Coach — Microsoft's open-source curriculum and tooling for "better agentic engineering": patterns, practices, and exercises for building production-quality AI agents.
Microsoft: Build AI Agents with Azure AI Foundry — Official Microsoft Learn path.
Microsoft: MCP for Beginners — Curriculum for Model Context Protocol with cross-language examples.
Prompt Engineering Guide — Comprehensive guides for prompt engineering, RAG, and AI agents.

Use Cases & Case Studies

Enterprise

IT Helpdesk Agents — Automated ticket resolution, password resets
Customer Service — Multi-turn conversation with CRM integration
Document Intelligence — Contract analysis, compliance checking
Data Analysis — Natural language to SQL, automated reporting

Research & Humanitarian

Disinformation Detection — Agents monitoring information ecosystems
Disaster Response — Coordinating information flows in crisis situations
Knowledge Management — Intelligent document retrieval for NGOs

Community

r/AI_Agents — Reddit community
AI Agents Discord — Active Discord server
awesome-ai-agent-papers — Curated collection of AI agent research papers released in 2026, covering engineering, memory, evaluation, workflows, and autonomous systems.
#AIAgents on X — Twitter/X hashtag

License

Disclaimer: This list aims to be vendor-neutral and community-driven. Inclusion does not imply endorsement by any employer or organization.
