awesome-proactive-agent


SUMMARY

A curated list of papers, benchmarks, project pages, and code for proactive agents.


Awesome Proactive Agents

A curated research map for proactive agents: AI systems that infer latent user needs, decide when to intervene, ask for missing context or consent, and initiate useful assistance before the user gives a complete, explicit command.

If this list is useful, a ⭐ helps others find it.



Scope

This list prioritizes papers where proactivity is a central research target. The list is broader than computer-use agents: it includes proactive dialogue, planning, recommendation, wearable assistance, GUI/mobile/OS agents, programming assistants, personalization, memory, benchmarks, optimization, and human factors.

Typical inclusion signals:

  • The agent predicts latent intent or missing context before the user gives a complete instruction.
  • The agent decides when to ask, suggest, remind, intervene, execute, or stay silent.
  • The paper evaluates proactive behavior, intervention timing, user control, consent, interruption cost, or personalization.
  • The benchmark or dataset makes proactivity the primary task rather than a side effect of general tool use.

Resource labels:

  • Paper: arXiv, ACL Anthology, DOI, OpenReview, ACM, Springer, or official proceedings page.
  • Website: project page, conference page, lab page, or documentation.
  • Code / Dataset: GitHub, released code, released benchmark, or released dataset.
  • Notes: a short English decision card covering why the paper matters, its proactivity signal, evaluation setup, limitations, and use cases.

Must Read

Selected starting points for understanding the field.

Date | Paper | Why read it first | Resources
2024-04 | Towards Human-centered Proactive Conversational Agents | Establishes the human-centered dimensions of proactive agents: intelligence, adaptivity, and civility. | arXiv, DOI
2024-10 | Proactive Agent | Canonical shift from reactive LLM agents to active assistance over event streams; introduces ProactiveBench. | arXiv, OpenReview, Star, Notes
2024-10 | Need Help? | Strong user-study reference for proactive IDE assistance and intervention timing. | arXiv, Notes
2025-05 | ContextAgent | Extends proactive agents to open-world sensory contexts and tool calling. | arXiv, Website, Star, Notes
2026-02 | ProAgentBench | Real workflow logs reveal why synthetic proactive data can overestimate performance. | arXiv, Code, Notes
2026-04 | KnowU-Bench | Closest benchmark to proactive, personalized, consent-aware mobile assistants. | arXiv, HF Paper, Star, Notes
2026-05 | π-Bench | Sharp long-horizon benchmark for hidden-intent resolution in personal assistant workflows. | arXiv, Website, Star, Dataset, Notes

Papers

Foundations, Surveys and Human Factors

Date | Title | Venue / Source | Tags | Resources
2024-04 | Towards Human-centered Proactive Conversational Agents | SIGIR 2024 | Definition · Human Factors · Dialogue | arXiv, DOI
2024-10 | Redefining Proactivity for Information Seeking Dialogue | SICON 2024 | Definition · Dialogue · Intent Inference | ACL
2025-01 | When AI-Based Agents Are Proactive: Implications for Competence and System Satisfaction in Human-AI Collaboration | BISE 2026 | Human Factors · Intervention Timing · Trust | DOI
2025-02 | Assistance or Disruption? Exploring and Evaluating the Design and Trade-offs of Proactive AI Programming Support | CHI 2025 | Human Factors · Intervention Timing · IDE | arXiv, DOI, Notes
2025-03 | Proactive Conversational AI: A Comprehensive Survey of Advancements and Opportunities | ACM TOIS 2025 | Survey · Definition · Dialogue | DOI
2026-01 | Developer Interaction Patterns with Proactive AI: A Five-Day Field Study | arXiv 2601 | Human Factors · Real-world Data · IDE | arXiv, Notes
2026-02 | From Fragmentation to Integration: Exploring the Design Space of AI Agents for Human-as-the-Unit Privacy Management | arXiv 2602 | Safety & Consent · Privacy · Human Factors | arXiv
2026-02 | Exploring The Impact of Proactive Generative AI Agent Roles in Time-Sensitive Collaborative Problem-Solving Tasks | arXiv 2602 | Human Factors · Collaboration · Intervention Timing | arXiv

Proactive Interaction and Planning

Date | Title | Venue / Source | Tags | Resources
2024-03 | ProMISe: A Proactive Multi-turn Dialogue Dataset for Information-seeking Intent Resolution | Findings of EACL 2024 | Clarification · Dialogue · Benchmark | ACL
2024-06 | Ask-before-Plan: Proactive Language Agents for Real-World Planning | arXiv 2406 | Clarification · Planning · Intent Inference | arXiv
2024-10 | Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance | ICLR 2025 | Intent Inference · Benchmark · Desktop | arXiv, OpenReview, Star, Notes
2025-01 | Proactive Conversational Agents with Inner Thoughts | CHI 2025 | Dialogue · Intent Inference · Intervention Timing | arXiv, Star, Notes
2025-01 | ProTOD: Proactive Task-oriented Dialogue System Based on LLMs | COLING 2025 | Dialogue · Planning · Tool Use | ACL
2025-07 | Tunable LLM-based Proactive Recommendation Agent | ACL 2025 | Recommendation · Personalization · Intent Inference | ACL
2025-09 | PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents | Findings of EMNLP 2025 | Dialogue · Memory · Simulation | arXiv
2025-10 | ProMediate: A Socio-cognitive Framework for Evaluating Proactive Agents in Multi-party Negotiation | arXiv 2510 | Dialogue · Collaboration · Benchmark | arXiv
2026-01 | Proactivity-driven Personalized Agents for Advancing Human Learning through Engagement, Reflection, and Self-Efficacy | arXiv 2601 | Personalization · Intent Inference · Education | arXiv
2026-01 | Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments | arXiv 2601 | Long-horizon · Intent Inference · Benchmark | arXiv, Notes
2026-05 | Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents | arXiv 2605 | Intent Inference · Memory · Benchmark | arXiv, Website, Star, Notes

GUI, Mobile, OS and Coding Agents

Date | Title | Venue / Source | Tags | Resources
2024-10 | Need Help? Designing Proactive AI Assistants for Programming | CHI 2025 | IDE · Intervention Timing · Human Factors | arXiv, Notes
2025-03 | CodingGenie: A Proactive LLM-Powered Programming Assistant | arXiv 2503 | IDE · Intent Inference · Tool Use | arXiv, Star, Notes
2025-07 | FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM Agents | ICLR 2026 | Mobile · Personalization · Benchmark | arXiv, Star, Notes
2025-08 | AppAgent-Pro: A Proactive GUI Agent System for Multidomain Information Integration and User Assistance | CIKM 2025 | GUI · Intent Inference · Tool Use | arXiv, Star, Notes
2025-09 | VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents | arXiv 2509 | OS · Safety & Consent · Clarification | arXiv, Star, Notes
2026-02 | ProAgentBench: Evaluating LLM Agents for Proactive Assistance with Real-World Data | arXiv 2602 | Real-world Data · Intervention Timing · Benchmark | arXiv, Code, Notes
2026-02 | ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices | arXiv 2602 | Mobile · Intent Inference · Benchmark | arXiv, Notes
2026-03 | PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents | arXiv 2603 | GUI · Intent Inference · Benchmark | arXiv, Website, Dataset, Notes
2026-04 | Help Without Being Asked: A Deployed Proactive Agent System for On-Call Support with Continuous Self-Improvement | arXiv 2604 | Real-world Data · Intervention Timing · Skill Learning | arXiv, Star, Notes
2026-04 | Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants | arXiv 2604 | Simulation · Intervention Timing · Benchmark | arXiv, Website, Star, Notes
2026-04 | KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation | arXiv 2604 | Mobile · Personalization · Safety & Consent | arXiv, HF Paper, Star, Notes
2026-05 | An Empirical Study of Proactive Coding Assistants in Real-World Software Development | arXiv 2605 | IDE · Real-world Data · Benchmark | arXiv, Notes
2026-05 | ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents | arXiv 2605 | GUI · Tool Use · Optimization | arXiv, Star, Notes
2026-04 | From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench | Interspeech 2026 | Multimodal / Wearable · Intervention Timing · Benchmark | arXiv, Notes
2026-06 | Perceive Before Reasoning: A Pre-Reasoning Perception Framework for Efficient and Reliable Proactive Mobile Agents | arXiv 2606 | Mobile · Intervention Timing · Tool Use | arXiv, Notes

Multimodal, Wearable and Embodied Agents

Date | Title | Venue / Source | Tags | Resources
2024-09 | AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environments | arXiv 2409 | Embodied · Collaboration · Planning | arXiv
2025-01 | YETI: Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks | arXiv 2501 | Multimodal / Wearable · Intervention Timing · Human Factors | arXiv, Website, Notes
2025-01 | AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses | CHI 2025 | Multimodal / Wearable · Intent Inference · Personalization | arXiv, DOI
2025-02 | Mirai: A Wearable Proactive AI Inner-Voice for Contextual Nudging | CHI EA 2025 | Multimodal / Wearable · Intervention Timing · Human Factors | arXiv, DOI
2025-05 | ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions | NeurIPS 2025 | Multimodal / Wearable · Personalization · Tool Use | arXiv, Website, Star, Notes
2025-06 | Proactive Assistant Dialogue Generation from Streaming Egocentric Videos | EMNLP 2025 | Multimodal / Wearable · Dialogue · Intervention Timing | arXiv
2025-12 | ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems | arXiv 2512 | Multimodal / Wearable · Sensing · Intervention Timing | arXiv, Video, Notes
2026-03 | ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models | ICLR 2026 | Multimodal / Wearable · Intervention Timing · Benchmark | arXiv, Star, Dataset, Notes
2026-05 | IPIBench: Evaluating Interactive Proactive Intelligence of MLLMs under Continuous Streams | arXiv 2605 | Multimodal / Wearable · Intervention Timing · Benchmark | arXiv, Notes
2026-05 | MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory | arXiv 2605 | Memory · Multimodal / Wearable · Benchmark | arXiv, Notes

Benchmarks, Personalization and Optimization

Date | Title | Venue / Source | Tags | Resources
2025-08 | ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents | arXiv 2508 | Benchmark · Dialogue · Intent Inference | arXiv, Star, Notes
2025-09 | ProPerSim: Developing Proactive and Personalized AI Assistants through User-Assistant Simulation | ICLR 2026 | Personalization · Simulation · Benchmark | arXiv
2025-10 | Beyond Reactivity: Measuring Proactive Problem Solving in LLM Agents | arXiv 2510 | Benchmark · Intent Inference · Tool Use | arXiv, Star, Notes
2025-11 | Training Proactive and Personalized LLM Agents | arXiv 2511 | Personalization · Optimization · Simulation | arXiv, Star, Notes
2026-02 | Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization | arXiv 2602 | Optimization · Human Factors · Safety & Consent | arXiv
2026-03 | ProEvent: An Event-centric Benchmark for Proactive Agents | OpenReview / ACL ARR 2026 | Benchmark · Long-horizon · Intervention Timing | OpenReview
2026-04 | SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization | arXiv 2604 | Optimization · Skill Learning · Memory | arXiv, Star, Notes
2026-05 | CogniFold: Always-On Proactive Memory via Cognitive Folding | arXiv 2605 | Memory · Intent Inference · Benchmark | arXiv, HF Paper, Star, Dataset, Notes
2026-05 | MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation | arXiv 2605 | Optimization · Skill Learning · Memory | arXiv, Notes
2026-05 | π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows | arXiv 2605 | Long-horizon · Intent Inference · Benchmark | arXiv, Website, Star, Dataset, Notes
2026-05 | VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions | arXiv 2605 | Long-horizon · Personalization · Memory | arXiv, HF Paper, Star, Notes
2026-06 | Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues | arXiv 2606 | Dialogue · Personalization · Benchmark | arXiv, Star, Notes
2026-06 | Communication Policy Evolution for Proactive LLM Agents | arXiv 2606 | Dialogue · Intervention Timing · Optimization | arXiv, Notes

Benchmarks

For a detailed comparison, see BENCHMARKS.md.

Date | Benchmark | Paper | Environment | What it tests | Resources
2024-03 | ProMISe | ProMISe | information-seeking dialogue | proactive clarification for intent resolution | ACL
2024-10 | RealHumanEval | Need Help? | programming tasks | proactive IDE assistance with human users | arXiv
2024-10 | ProactiveBench | Proactive Agent | desktop activity events | proactive task prediction and acceptance | arXiv, Star
2025-05 | ContextAgentBench | ContextAgent | wearable sensory contexts | proactive service prediction and tool calling | arXiv, Star
2025-07 | FingerTip 20K | FingerTip 20K | Android trajectories | proactive task suggestion and personalized execution | arXiv, Star
2025-08 | ProactiveEval | ProactiveEval | proactive dialogue | target planning and dialogue guidance across six domains | arXiv, Star
2025-10 | PROBE | Beyond Reactivity | web problem-solving tasks | bottleneck discovery and autonomous resolution | arXiv, Star
2025-11 | UserVille | Training Proactive and Personalized LLM Agents | SWE and deep-research user simulation | productivity, proactivity and personalization under vague prompts | arXiv, Star
2026-01 | ChronosBench | Long-term Task-oriented Agent | dynamic task environments | proactive long-term intent maintenance | arXiv, Notes
2026-02 | ProAgentBench | ProAgentBench | real workflow logs | when-to-assist and how-to-assist | arXiv, Code
2026-02 | ProactiveMobile | ProactiveMobile | mobile device context | latent intent to executable API sequence | arXiv
2026-03 | ProEvent | ProEvent | future event tracking | proactive event maintenance and reminders | OpenReview
2026-03 | PIRA-Bench | PIRA-Bench | continuous GUI screenshots | proactive GUI intent recommendation | arXiv, Website, Dataset
2026-03 | ProactiveBench (MLLM) | ProactiveBench / Trento | visual difficulty scenarios | MLLM proactive help-seeking from visual context | arXiv, Dataset
2026-05 | IPIBench | IPIBench | streaming video, multi-turn interactive | proactive monitoring, task management, reactive-proactive coordination | arXiv, Notes
2026-04 | ProVoice-Bench | ProVoice-Bench | voice interaction streams | proactive voice intervention timing, over-triggering, monitoring | arXiv, Notes
2026-04 | Pare-Bench | Pare | multi-app FSM environment | active user simulation, intervention timing, multi-app execution | arXiv, Website, Star
2026-04 | KnowU-Bench | KnowU-Bench | Android emulator | personalization, proactive tasks, consent and rejection handling | arXiv, Star
2026-05 | CogEval-Bench | CogniFold | streaming event memory | proactive concept emergence and cognitive-structure formation | arXiv, Dataset, Star
2026-05 | MemEye | MemEye | multimodal long-term memory | visual evidence granularity and temporal state reasoning | arXiv
2026-05 | ProCodeBench | Proactive Coding Assistants | real IDE traces | proactive coding intent prediction and sim-to-real evaluation | arXiv
2026-05 | π-Bench | π-Bench | persistent personal workspaces | proactive hidden-intent resolution and checklist completion in long-horizon workflows | arXiv, Website, Star, Dataset
2026-05 | ProActEval | Anticipate and Learn | proactive assistant scenarios | idle-time anticipation, evidence acquisition, user effort and hallucination reduction | arXiv, Website, Star
2026-05 | VitaBench 2.0 | VitaBench 2.0 | long-term user interaction sequences | preference extraction, memory use, updates, and proactive missing-information acquisition | arXiv, HF Paper, Star
2026-06 | Ψ-Bench | Ψ-Bench | persuasive dialogue | persona-sensitive influencing with simulated clients and user profiles | arXiv, Star

Tag Vocabulary

Tags are intentionally compact and reusable. They describe the paper's main contribution, not every detail.

Tag | Meaning
Definition | Defines or reframes proactive agents, proactive dialogue, or design-space boundaries.
Survey | Synthesizes a broad proactive-agent subfield or taxonomy.
Human Factors | Studies interruption, control, satisfaction, workload, adoption, or developer experience.
Trust | Focuses on competence perception, calibrated reliance, or trustworthy interaction.
Safety & Consent | Covers confirmation, autonomy boundaries, reversibility, rejection, or risk control.
Privacy | Centers privacy management, data minimization, or personal-context governance.
Intervention Timing | Focuses on when an agent should act, ask, suggest, or remain silent.
Intent Inference | Infers latent goals, hidden constraints, future tasks, or missing information.
Clarification | Proactively asks questions before planning, execution, or recommendation.
Dialogue | Proactive behavior in conversational, persuasive, or task-oriented interaction.
Planning | Proactive decomposition, task planning, scheduling, or future-state reasoning.
Tool Use | Tool calling, API execution, GUI operation, or action orchestration.
Recommendation | Proactive recommendation or suggestion ranking.
Collaboration | Multi-party or human-agent collaborative problem solving.
Education | Learning, tutoring, reflection, or student engagement contexts.
Long-horizon | Multi-session, dynamic, future-event, or long-running task maintenance.
Personalization | User preferences, personas, profiles, long-term user history, or user-specific adaptation.
Memory | Persistent memory, episodic memory, visual memory, skill memory, or cognitive memory structures.
Simulation | User simulation, environment simulation, synthetic users, or synthetic workflows.
Optimization | RL, reward modeling, multi-objective optimization, self-evolution, or behavior tuning.
Skill Learning | Skill creation, skill internalization, skill memory, or reusable procedure learning.
Benchmark | Introduces a dataset, evaluation suite, benchmark, simulator, or diagnostic protocol.
Real-world Data | Uses real user traces, field-study data, or deployment-like logs.
Desktop | Desktop activity streams, workstation context, or event logs.
GUI | Graphical interface agents, browser/app screens, or visual UI interaction.
Mobile | Mobile GUI, Android/iOS workflows, phone sensors, or mobile user context.
OS | Operating-system agents, cross-app workflows, or OS-level verification.
IDE | Programming assistants, code editors, or developer tooling.
Multimodal / Wearable | Video, audio, AR, smart glasses, egocentric streams, or open-world sensory context.
Sensing | Active context acquisition, sensor selection, or on-demand sensory capture.
Embodied | Robots, physical environments, or human-populated embodied settings.
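
Because the tags are meant to be reusable, a contributor may want to check a row's Tags cell against the vocabulary before opening a pull request. The following is a minimal sketch, not tooling shipped with this list: `TAG_VOCABULARY`, `parse_tags`, and `unknown_tags` are hypothetical names, and it assumes tags are separated by `·` as in the paper tables.

```python
# Hypothetical helper, not part of the repository.
# The set mirrors the Tag column of the vocabulary table above.
TAG_VOCABULARY = frozenset({
    "Definition", "Survey", "Human Factors", "Trust", "Safety & Consent",
    "Privacy", "Intervention Timing", "Intent Inference", "Clarification",
    "Dialogue", "Planning", "Tool Use", "Recommendation", "Collaboration",
    "Education", "Long-horizon", "Personalization", "Memory", "Simulation",
    "Optimization", "Skill Learning", "Benchmark", "Real-world Data",
    "Desktop", "GUI", "Mobile", "OS", "IDE", "Multimodal / Wearable",
    "Sensing", "Embodied",
})


def parse_tags(cell: str) -> list[str]:
    """Split a Tags cell such as 'Mobile · Personalization · Benchmark'."""
    return [tag.strip() for tag in cell.split("·") if tag.strip()]


def unknown_tags(cell: str) -> list[str]:
    """Return the tags in the cell that are not in the vocabulary."""
    return [tag for tag in parse_tags(cell) if tag not in TAG_VOCABULARY]


if __name__ == "__main__":
    print(unknown_tags("Mobile · Personalization · Benchmark"))
    print(unknown_tags("Mobile · Robotics"))
```

An empty result means every tag in the cell is already defined; anything returned is either a typo or a candidate for a new vocabulary entry.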

Contributing

Pull requests are welcome.

Before adding a paper, check that it satisfies at least one of:

  • It predicts latent user intent before the user gives a complete, explicit instruction.
  • It decides when to intervene, ask, suggest, execute, remind, or stay silent.
  • It evaluates proactive assistance, interruption cost, user control, consent, or personalization.
  • It contributes a benchmark or dataset where proactivity is the primary task.
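
The "at least one of" rule above can be written down as a small self-check. This is only an illustrative sketch: the `Candidate` class and its field names are assumptions made for this example, and whether a paper actually meets a criterion remains a human judgment.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Candidate:
    """Hypothetical self-assessment of a paper against the four criteria above."""
    predicts_latent_intent: bool = False
    decides_when_to_act: bool = False
    evaluates_proactive_assistance: bool = False
    proactivity_is_primary_benchmark_task: bool = False


def satisfied_criteria(candidate: Candidate) -> list[str]:
    """Names of the criteria the contributor marked as met."""
    return [f.name for f in fields(candidate) if getattr(candidate, f.name)]


def is_in_scope(candidate: Candidate) -> bool:
    """A paper qualifies when at least one criterion is met."""
    return bool(satisfied_criteria(candidate))


if __name__ == "__main__":
    paper = Candidate(decides_when_to_act=True)
    print(is_in_scope(paper), satisfied_criteria(paper))
```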

Suggested note template:

# Paper Title

## Why It Matters

...

## Proactivity Signal

...

## Evaluation Setup

...

## Key Limitations

...

## Use For

...
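
To start a new note from the template above, a short script can emit the skeleton. This is a sketch under assumptions: `NOTE_SECTIONS` and `render_note` are hypothetical names, and where the resulting text should be saved is left to the contributor, since the list does not state a notes directory here.

```python
# Hypothetical generator for the note template above.
NOTE_SECTIONS = (
    "Why It Matters",
    "Proactivity Signal",
    "Evaluation Setup",
    "Key Limitations",
    "Use For",
)


def render_note(title: str) -> str:
    """Return a note skeleton with '...' placeholders under each section."""
    lines = [f"# {title}", ""]
    for section in NOTE_SECTIONS:
        lines += [f"## {section}", "", "...", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_note("Paper Title"))
```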

Maintained by Low Entropy AI.
