agent - a zzfive Collection
AgentOhana: Design Unified Data and Training Pipeline for Effective
Agent Learning Paper
• 2402.15506
• Published Feb 23, 2024 • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web
Navigating Agent Paper
• 2404.03648
• Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation
with Multi Layered Thoughts Paper
• 2405.19893
• Published May 30, 2024 • 34
Parrot: Efficient Serving of LLM-based Applications with Semantic
Variable Paper
• 2405.19888
• Published May 30, 2024 • 7
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
Navigation via Multi-Agent Collaboration Paper
• 2406.01014
• Published Jun 3, 2024 • 33
AgentGym: Evolving Large Language Model-based Agents across Diverse
Environments Paper
• 2406.04151
• Published Jun 6, 2024 • 24
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World
Domains Paper
• 2406.12045
• Published Jun 17, 2024 • 9
Agentless: Demystifying LLM-based Software Engineering Agents Paper
• 2407.01489
• Published Jul 1, 2024 • 65
Internet of Agents: Weaving a Web of Heterogeneous Agents for
Collaborative Intelligence Paper
• 2407.07061
• Published Jul 9, 2024 • 28
Spider2-V: How Far Are Multimodal Agents From Automating Data Science
and Engineering Workflows? Paper
• 2407.10956
• Published Jul 15, 2024 • 7
Sibyl: Simple yet Effective Agent Framework for Complex Real-world
Reasoning Paper
• 2407.10718
• Published Jul 15, 2024 • 19
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation Paper
• 2407.14931
• Published Jul 20, 2024 • 22
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? Paper
• 2407.15711
• Published Jul 22, 2024 • 9
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis Paper
• 2407.13301
• Published Jul 18, 2024 • 55
OpenDevin: An Open Platform for AI Software Developers as Generalist
Agents Paper
• 2407.16741
• Published Jul 23, 2024 • 77
LAMBDA: A Large Model Based Data Agent Paper
• 2407.17535
• Published Jul 24, 2024 • 37
AppWorld: A Controllable World of Apps and People for Benchmarking
Interactive Coding Agents Paper
• 2407.18901
• Published Jul 26, 2024 • 35
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper
• 2407.20183
• Published Jul 29, 2024 • 43
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS Paper
• 2408.01584
• Published Aug 2, 2024 • 10
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in
Long-Horizon Tasks Paper
• 2408.03615
• Published Aug 7, 2024 • 31
CodexGraph: Bridging Large Language Models and Code Repositories via
Code Graph Databases Paper
• 2408.03910
• Published Aug 7, 2024 • 18
Automated Design of Agentic Systems Paper
• 2408.08435
• Published Aug 15, 2024 • 40
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized
Academic Assistance Paper
• 2409.04593
• Published Sep 6, 2024 • 26
Paper
• 2409.07429
• Published Sep 11, 2024 • 32
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research
Repositories Paper
• 2409.07440
• Published Sep 11, 2024 • 8
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
at Scale Paper
• 2409.16299
• Published Sep 9, 2024 • 11
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for
Superior Planning and Decision-Making Paper
• 2409.16686
• Published Sep 25, 2024 • 10
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise Paper
• 2410.03017
• Published Oct 3, 2024 • 29
Agent S: An Open Agentic Framework that Uses Computers Like a Human Paper
• 2410.08164
• Published Oct 10, 2024 • 26
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language
Models Paper
• 2410.11710
• Published Oct 15, 2024 • 20
Agent-as-a-Judge: Evaluate Agents with Agents Paper
• 2410.10934
• Published Oct 14, 2024 • 23
Revealing the Barriers of Language Agents in Planning Paper
• 2410.12409
• Published Oct 16, 2024 • 27
MobA: A Two-Level Agent System for Efficient Mobile Task Automation Paper
• 2410.13757
• Published Oct 17, 2024 • 32
Web Agents with World Models: Learning and Leveraging Environment
Dynamics in Web Navigation Paper
• 2410.13232
• Published Oct 17, 2024 • 44
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized
Generalist Computer Assistant Paper
• 2410.18603
• Published Oct 24, 2024 • 32
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science
Competitions Paper
• 2410.20424
• Published Oct 27, 2024 • 40
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World
Exploration, Feedback and Optimization Paper
• 2410.19609
• Published Oct 25, 2024 • 18
Teaching Embodied Reinforcement Learning Agents: Informativeness and
Diversity of Language Use Paper
• 2410.24218
• Published Oct 31, 2024 • 6
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Paper
• 2410.23218
• Published Oct 30, 2024 • 49
Adapting While Learning: Grounding LLMs for Scientific Problems with
Intelligent Tool Usage Adaptation Paper
• 2411.00412
• Published Nov 1, 2024 • 10
AndroidLab: Training and Systematic Benchmarking of Android Autonomous
Agents Paper
• 2410.24024
• Published Oct 31, 2024 • 49
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum
Reinforcement Learning Paper
• 2411.02337
• Published Nov 4, 2024 • 36
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large
Language Model Paper
• 2411.04496
• Published Nov 7, 2024 • 22
GazeGen: Gaze-Driven User Interaction for Visual Content Generation Paper
• 2411.04335
• Published Nov 7, 2024 • 15
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer
Use Paper
• 2411.10323
• Published Nov 15, 2024 • 34
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning
for Web Agents Paper
• 2411.06559
• Published Nov 10, 2024 • 16
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper
• 2411.13543
• Published Nov 20, 2024 • 19
SketchAgent: Language-Driven Sequential Sketch Generation Paper
• 2411.17673
• Published Nov 26, 2024 • 18
Interleaved Scene Graph for Interleaved Text-and-Image Generation
Assessment Paper
• 2411.17188
• Published Nov 26, 2024 • 20
Large Language Model-Brained GUI Agents: A Survey Paper
• 2411.18279
• Published Nov 27, 2024 • 30
MALT: Improving Reasoning with Multi-Agent LLM Training Paper
• 2412.01928
• Published Dec 2, 2024 • 46
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper
• 2412.04454
• Published Dec 5, 2024 • 71
Unraveling the Complexity of Memory in RL Agents: an Approach for
Classification and Evaluation Paper
• 2412.06531
• Published Dec 9, 2024 • 72
The BrowserGym Ecosystem for Web Agent Research Paper
• 2412.05467
• Published Dec 6, 2024 • 24
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web
Tutorials Paper
• 2412.09605
• Published Dec 12, 2024 • 30
Large Action Models: From Inception to Implementation Paper
• 2412.10047
• Published Dec 13, 2024 • 36
Evaluation Agent: Efficient and Promptable Evaluation Framework for
Visual Generative Models Paper
• 2412.09645
• Published Dec 10, 2024 • 36
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation
Model Internet Agents Paper
• 2412.13194
• Published Dec 17, 2024 • 12
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World
Tasks Paper
• 2412.14161
• Published Dec 18, 2024 • 51
Paper
• 2412.13501
• Published Dec 18, 2024 • 30
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital
World Paper
• 2412.17589
• Published Dec 23, 2024 • 14
Agent-SafetyBench: Evaluating the Safety of LLM Agents Paper
• 2412.14470
• Published Dec 19, 2024 • 13
Training Software Engineering Agents and Verifiers with SWE-Gym Paper
• 2412.21139
• Published Dec 30, 2024 • 26
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse
Task Synthesis Paper
• 2412.19723
• Published Dec 27, 2024 • 87
A3: Android Agent Arena for Mobile GUI Agents Paper
• 2501.01149
• Published Jan 2, 2025 • 22
Agent Laboratory: Using LLM Agents as Research Assistants Paper
• 2501.04227
• Published Jan 8, 2025 • 95
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper
• 2501.05366
• Published Jan 9, 2025 • 104
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning
and Reflection Paper
• 2501.04575
• Published Jan 8, 2025 • 25
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper
• 2501.10120
• Published Jan 17, 2025 • 55
Agent-R: Training Language Model Agents to Reflect via Iterative
Self-Training Paper
• 2501.11425
• Published Jan 20, 2025 • 109
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper
• 2501.12326
• Published Jan 21, 2025 • 64
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper
• 2501.11733
• Published Jan 20, 2025 • 28
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in
Virtual 3D Spaces Paper
• 2501.12909
• Published Jan 22, 2025 • 74
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI
Systems Paper
• 2501.11067
• Published Jan 19, 2025 • 13
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web
Navigation Paper
• 2501.16609
• Published Jan 28, 2025 • 7
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search Paper
• 2502.02584
• Published Feb 4, 2025 • 16
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models
Beneficial? Paper
• 2502.00674
• Published Feb 2, 2025 • 13
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents Paper
• 2502.05957
• Published Feb 9, 2025 • 16
InSTA: Towards Internet-Scale Training For Agents Paper
• 2502.06776
• Published Feb 10, 2025 • 9
Hephaestus: Improving Fundamental Agent Capabilities of Large Language
Models through Continual Pre-Training Paper
• 2502.06589
• Published Feb 10, 2025 • 21
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language
Models for Vision-Driven Embodied Agents Paper
• 2502.09560
• Published Feb 13, 2025 • 35
OctoTools: An Agentic Framework with Extensible Tools for Complex
Reasoning Paper
• 2502.11271
• Published Feb 16, 2025 • 19
Autellix: An Efficient Serving Engine for LLM Agents as General Programs Paper
• 2502.13965
• Published Feb 19, 2025 • 19
TAG: A Decentralized Framework for Multi-Agent Hierarchical
Reinforcement Learning Paper
• 2502.15425
• Published Feb 21, 2025 • 9
Self-Taught Agentic Long Context Understanding Paper
• 2502.15920
• Published Feb 21, 2025 • 3
WebGames: Challenging General-Purpose Web-Browsing AI Agents Paper
• 2502.18356
• Published Feb 25, 2025 • 14
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic
Iterative Reasoning Agents Paper
• 2502.18017
• Published Feb 25, 2025 • 21
PodAgent: A Comprehensive Framework for Podcast Generation Paper
• 2503.00455
• Published Mar 1, 2025 • 6
MPO: Boosting LLM Agents with Meta Plan Optimization Paper
• 2503.02682
• Published Mar 4, 2025 • 29
Agent models: Internalizing Chain-of-Action Generation into Reasoning
models Paper
• 2503.06580
• Published Mar 9, 2025 • 20
API Agents vs. GUI Agents: Divergence and Convergence Paper
• 2503.11069
• Published Mar 14, 2025 • 36
STEVE: A Step Verification Pipeline for Computer-use Agent Training Paper
• 2503.12532
• Published Mar 16, 2025 • 17
Survey on Evaluation of LLM-based Agents Paper
• 2503.16416
• Published Mar 20, 2025 • 96
Verbal Process Supervision Elicits Better Coding Agents Paper
• 2503.18494
• Published Mar 24, 2025 • 2
Large Language Model Agent: A Survey on Methodology, Applications and
Challenges Paper
• 2503.21460
• Published Mar 27, 2025 • 83
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement
Learning Paper
• 2503.21620
• Published Mar 27, 2025 • 62
Classical Planning with LLM-Generated Heuristics: Challenging the State
of the Art with Python Code Paper
• 2503.18809
• Published Mar 24, 2025 • 9
Agent S2: A Compositional Generalist-Specialist Framework for Computer
Use Agents Paper
• 2504.00906
• Published Apr 1, 2025 • 27
Advances and Challenges in Foundation Agents: From Brain-Inspired
Intelligence to Evolutionary, Collaborative, and Safe Systems Paper
• 2504.01990
• Published Mar 31, 2025 • 305
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent
Trajectories Paper
• 2504.08942
• Published Apr 11, 2025 • 28
Breaking the Data Barrier -- Building GUI Agents Through Task
Generalization Paper
• 2504.10127
• Published Apr 14, 2025 • 17
SocioVerse: A World Model for Social Simulation Powered by LLM Agents
and A Pool of 10 Million Real-World Users Paper
• 2504.10157
• Published Apr 14, 2025 • 17
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via
Agentic Tree Search Paper
• 2504.08066
• Published Apr 10, 2025 • 22
Paper
• 2504.11442
• Published Apr 15, 2025 • 30
MLRC-Bench: Can Language Agents Solve Machine Learning Research
Challenges? Paper
• 2504.09702
• Published Apr 13, 2025 • 18
Exploring Expert Failures Improves LLM Agent Tuning Paper
• 2504.13145
• Published Apr 17, 2025 • 12
UFO2: The Desktop AgentOS Paper
• 2504.14603
• Published Apr 20, 2025 • 29
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to
Deliberative Reasoners Paper
• 2504.14239
• Published Apr 19, 2025 • 14
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making
Abilities Paper
• 2504.16078
• Published Apr 22, 2025 • 21
Paper2Code: Automating Code Generation from Scientific Papers in Machine
Learning Paper
• 2504.17192
• Published Apr 24, 2025 • 124
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and
Prospects Paper
• 2504.19838
• Published Apr 28, 2025 • 23
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory Paper
• 2504.19413
• Published Apr 28, 2025 • 52
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn
Reinforcement Learning Paper
• 2504.20073
• Published Apr 24, 2025 • 12
Agentic Reasoning and Tool Integration for LLMs via Reinforcement
Learning Paper
• 2505.01441
• Published Apr 28, 2025 • 39
Think on your Feet: Adaptive Thinking via Reinforcement Learning for
Social Agents Paper
• 2505.02156
• Published May 4, 2025 • 18
Multi-Agent System for Comprehensive Soccer Understanding Paper
• 2505.03735
• Published May 6, 2025 • 25
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents Paper
• 2505.03570
• Published May 6, 2025 • 8
LLM-Independent Adaptive RAG: Let the Question Speak for Itself Paper
• 2505.04253
• Published May 7, 2025 • 14
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and
Challenges Paper
• 2505.10468
• Published May 15, 2025 • 10
Creating General User Models from Computer Use Paper
• 2505.10831
• Published May 16, 2025 • 5
Visual Agentic Reinforcement Fine-Tuning Paper
• 2505.14246
• Published May 20, 2025 • 32
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop
System from Hypothesis to Verification Paper
• 2505.16938
• Published May 22, 2025 • 121
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper
• 2505.17612
• Published May 23, 2025 • 81
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based
Mobile GUI Agents Paper
• 2505.21496
• Published May 27, 2025 • 38
WebDancer: Towards Autonomous Information Seeking Agency Paper
• 2505.22648
• Published May 28, 2025 • 33
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and
Benchmarking Multimodal LLM Agents Paper
• 2505.24878
• Published May 30, 2025 • 23
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper
• 2506.03143
• Published Jun 3, 2025 • 54
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management
in LLM-based Agentic Multi-Agent Systems Paper
• 2506.04133
• Published Jun 4, 2025 • 3
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow
Development Paper
• 2506.05010
• Published Jun 5, 2025 • 80
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights Paper
• 2506.02865
• Published Jun 3, 2025 • 34
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at
Scale Paper
• 2506.04405
• Published Jun 4, 2025 • 7
Agents of Change: Self-Evolving LLM Agents for Strategic Planning Paper
• 2506.04651
• Published Jun 5, 2025 • 8
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents Paper
• 2506.11763
• Published Jun 13, 2025 • 74
Scaling Test-time Compute for LLM Agents Paper
• 2506.12928
• Published Jun 15, 2025 • 63
OAgents: An Empirical Study of Building Effective Agents Paper
• 2506.15741
• Published Jun 17, 2025 • 35
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via
Multi-Agent Multi-Turn Reinforcement Learning Paper
• 2506.24119
• Published Jun 30, 2025 • 51
WebSailor: Navigating Super-human Reasoning for Web Agent Paper
• 2507.02592
• Published Jul 3, 2025 • 126
PresentAgent: Multimodal Agent for Presentation Video Generation Paper
• 2507.04036
• Published Jul 5, 2025 • 11
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper
• 2507.06229
• Published Jul 8, 2025 • 77
MIRIX: Multi-Agent Memory System for LLM-Based Agents Paper
• 2507.07957
• Published Jul 10, 2025 • 80
GUI-G^2: Gaussian Reward Modeling for GUI Grounding Paper
• 2507.15846
• Published Jul 21, 2025 • 135
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models Paper
• 2507.12806
• Published Jul 17, 2025 • 21
LLM Economist: Large Population Models and Mechanism Design in
Multi-Agent Generative Simulacra Paper
• 2507.15815
• Published Jul 21, 2025 • 7
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI
Agents Paper
• 2507.19478
• Published Jul 25, 2025 • 33
A Survey of Self-Evolving Agents: On Path to Artificial Super
Intelligence Paper
• 2507.21046
• Published Jul 28, 2025 • 85
GenoMAS: A Multi-Agent Framework for Scientific Discovery via
Code-Driven Gene Expression Analysis Paper
• 2507.21035
• Published Jul 28, 2025 • 3
ScreenCoder: Advancing Visual-to-Code Generation for Front-End
Automation via Modular Multimodal Agents Paper
• 2507.22827
• Published Jul 30, 2025 • 101
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent
Foundation Models Training Paper
• 2508.00414
• Published Aug 1, 2025 • 94
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution Paper
• 2507.23348
• Published Jul 31, 2025 • 12
CellForge: Agentic Design of Virtual Cell Models Paper
• 2508.02276
• Published Aug 4, 2025 • 39
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong
Learning in Physical Embodied Systems Paper
• 2508.01415
• Published Aug 2, 2025 • 8
AgentTTS: Large Language Model Agent for Test-time Compute-optimal
Scaling Strategy in Complex Tasks Paper
• 2508.00890
• Published Jul 26, 2025 • 7
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper
• 2508.01780
• Published Aug 3, 2025 • 21
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and
Decision in Embodied Agents Paper
• 2508.02629
• Published Aug 4, 2025 • 6
Efficient Agents: Building Effective Agents While Reducing Cost Paper
• 2508.02694
• Published Jul 24, 2025 • 86
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from
Experience Paper
• 2508.04700
• Published Aug 6, 2025 • 52
Training Long-Context, Multi-Turn Software Engineering Agents with
Reinforcement Learning Paper
• 2508.03501
• Published Aug 5, 2025 • 59
Enhancing Vision-Language Model Training with Reinforcement Learning in
Synthetic Worlds for Real-World Success Paper
• 2508.04280
• Published Aug 6, 2025 • 35
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper
• 2508.03680
• Published Aug 5, 2025 • 140
Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web
Agents Paper
• 2508.01858
• Published Aug 3, 2025 • 20
CoAct-1: Computer-using Agents with Coding as Actions Paper
• 2508.03923
• Published Aug 5, 2025 • 13
OS Agents: A Survey on MLLM-based Agents for General Computing Devices
Use Paper
• 2508.04482
• Published Aug 6, 2025 • 9
WideSearch: Benchmarking Agentic Broad Info-Seeking Paper
• 2508.07999
• Published Aug 11, 2025 • 111
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm
Bridging Foundation Models and Lifelong Agentic Systems Paper
• 2508.07407
• Published Aug 10, 2025 • 99
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of
Deep-Research Agent Paper
• 2508.06600
• Published Aug 8, 2025 • 42
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper
• 2508.05748
• Published Aug 7, 2025 • 142
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale
Asynchronous RL Paper
• 2508.07976
• Published Aug 11, 2025 • 52
OpenCUA: Open Foundations for Computer-Use Agents Paper
• 2508.09123
• Published Aug 12, 2025 • 33
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with
Long-Term Memory Paper
• 2508.09736
• Published Aug 13, 2025 • 58
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust
GAIA Problem Solving Paper
• 2508.09889
• Published Aug 13, 2025 • 32
UI-Venus Technical Report: Building High-performance UI Agents with RFT Paper
• 2508.10833
• Published Aug 14, 2025 • 45
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent
Distillation and Agentic RL Paper
• 2508.13167
• Published Aug 6, 2025 • 129
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents Paper
• 2508.13186
• Published Aug 14, 2025 • 20
CAMAR: Continuous Actions Multi-Agent Routing Paper
• 2508.12845
• Published Aug 18, 2025 • 7
Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic
Thought Reward Paper
• 2508.12800
• Published Aug 18, 2025 • 6
MCP-Universe: Benchmarking Large Language Models with Real-World Model
Context Protocol Servers Paper
• 2508.14704
• Published Aug 20, 2025 • 43
Mobile-Agent-v3: Fundamental Agents for GUI Automation Paper
• 2508.15144
• Published Aug 21, 2025 • 65
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper
• 2508.16153
• Published Aug 22, 2025 • 162
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent
LLMs Paper
• 2508.17188
• Published Aug 24, 2025 • 17
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo Paper
• 2508.18370
• Published Aug 25, 2025 • 3
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks Paper
• 2508.15804
• Published Aug 14, 2025 • 15
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World
Tasks via MCP Servers Paper
• 2508.20453
• Published Aug 28, 2025 • 63
AWorld: Orchestrating the Training Recipe for Agentic AI Paper
• 2508.20404
• Published Aug 28, 2025 • 38
UItron: Foundational GUI Agent with Advanced Perception and Planning Paper
• 2508.21767
• Published Aug 29, 2025 • 12
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper
• 2509.02547
• Published Sep 2, 2025 • 238
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn
Reinforcement Learning Paper
• 2509.02544
• Published Sep 2, 2025 • 127
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making
through Multi-Turn Reinforcement Learning Paper
• 2509.08755
• Published Sep 10, 2025 • 57
MCP-AgentBench: Evaluating Real-World Language Agent Performance with
MCP-Mediated Tools Paper
• 2509.09734
• Published Sep 10, 2025 • 16
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading Paper
• 2509.09995
• Published Sep 12, 2025 • 16
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for
Open-Ended Deep Research Paper
• 2509.13312
• Published Sep 16, 2025 • 106
Scaling Agents via Continual Pre-training Paper
• 2509.13310
• Published Sep 16, 2025 • 117
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic
Data and Scalable Reinforcement Learning Paper
• 2509.13305
• Published Sep 16, 2025 • 91
Towards General Agentic Intelligence via Environment Scaling Paper
• 2509.13311
• Published Sep 16, 2025 • 72
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon
Agents Paper
• 2509.13309
• Published Sep 16, 2025 • 67
ReSum: Unlocking Long-Horizon Search Intelligence via Context
Summarization Paper
• 2509.13313
• Published Sep 16, 2025 • 80
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform
Data Paper
• 2509.15221
• Published Sep 18, 2025 • 111
Towards Human-like Multimodal Conversational Agent by Generating
Engaging Speech Paper
• 2509.14627
• Published Sep 18, 2025 • 1
LIMI: Less is More for Agency Paper
• 2509.17567
• Published Sep 22, 2025 • 104
ARE: Scaling Up Agent Environments and Evaluations Paper
• 2509.17158
• Published Sep 21, 2025 • 36
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering
Tasks? Paper
• 2509.16941
• Published Sep 21, 2025 • 21
Paper
• 2509.17336
• Published Sep 22, 2025 • 10
GEM: A Gym for Agentic LLMs Paper
• 2510.01051
• Published Oct 1, 2025 • 91
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel
Execution Paper
• 2509.25301
• Published Sep 29, 2025 • 20
JoyAgent-JDGenie: Technical Report on the GAIA Paper
• 2510.00510
• Published Oct 1, 2025 • 4
Multi-Agent Tool-Integrated Policy Optimization Paper
• 2510.04678
• Published Oct 6, 2025 • 31
Don't Just Fine-tune the Agent, Tune the Environment Paper
• 2510.10197
• Published Oct 11, 2025 • 30
AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement
Learning Framework for Stock Trading Paper
• 2510.14264
• Published Oct 16, 2025 • 10
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper
• 2510.16872
• Published Oct 19, 2025 • 112
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper
• 2510.21618
• Published Oct 24, 2025 • 103
Tongyi DeepResearch Technical Report Paper
• 2510.24701
• Published Oct 28, 2025 • 103
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper
• 2511.08892
• Published Nov 12, 2025 • 214
HaluMem: Evaluating Hallucinations in Memory Systems of Agents Paper
• 2511.03506
• Published Nov 5, 2025 • 95
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist Paper
• 2511.08521
• Published Nov 11, 2025 • 39
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning Paper
• 2511.14460
• Published Nov 18, 2025 • 21
General Agentic Memory Via Deep Research Paper
• 2511.18423
• Published Nov 23, 2025 • 170
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO Paper
• 2511.13288
• Published Nov 17, 2025 • 19
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper
• 2511.18538
• Published Nov 23, 2025 • 303
Deep Research: A Systematic Survey Paper
• 2512.02038
• Published Nov 24, 2025 • 73
PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design Paper
• 2512.04082
• Published Dec 3, 2025 • 14
Step-GUI Technical Report Paper
• 2512.15431
• Published Dec 17, 2025 • 133
Memory in the Age of AI Agents Paper
• 2512.13564
• Published Dec 15, 2025 • 157
Paper
• 2512.16301
• Published Dec 18, 2025 • 108
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper
• 2512.24618
• Published Dec 31, 2025 • 154
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents Paper
• 2512.22047
• Published Dec 26, 2025 • 30
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper
• 2512.24615
• Published Dec 31, 2025 • 119
Agentic Reasoning for Large Language Models Paper
• 2601.12538
• Published Jan 18 • 202
Kimi K2.5: Visual Agentic Intelligence Paper
• 2602.02276
• Published Feb 2 • 263