Online-Mind2Web Leaderboard
View and explore Mind2Web agent evaluation leaderboards
Natural language processing, language models, language agents
AgentCL: Toward Rigorous Evaluation of Continual Learning in Language Agents
QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks
Generate comprehensive answers via multi‑source web research
Display and submit travel planner evaluation results
Plan a travel itinerary with cost tracking