Paper page - ANCHOR: Branch-Point Data Generation for GUI Agents

https://huggingface.co/collections/yale-nlp/anchor

arxiv:2602.07153

ANCHOR: Branch-Point Data Generation for GUI Agents

Published on Feb 6
· Submitted by Jinbiao Wei on Feb 11
Authors: Jinbiao Wei, Yilun Zhao, Kangqi Ni, Arman Cohan

Abstract

End-to-end GUI agents for real desktop environments require large amounts of high-quality interaction data, yet collecting human demonstrations is expensive and existing synthetic pipelines often suffer from limited task diversity or noisy, goal-drifting trajectories. We present a trajectory expansion framework Anchor that bootstraps scalable desktop supervision from a small set of verified seed demonstrations. Starting from each seed, we identify branch points that correspond to meaningful state changes and propose new, state-grounded task variants conditioned on the current GUI context. An executing agent then follows the proposed instructions to generate new trajectories, while a verifier enforces task completion via state-aware checks and trajectory-level consistency. To improve supervision quality, we further apply task-conditioned step-level filtering to remove ungrounded actions and denoise post-branch segments to maintain coherent intent. Experiments on standard desktop benchmarks, OSWorld and WindowsAgentArena, show that models fine-tuned on our expanded corpus achieve consistent improvements over zero-shot agents and representative synthesis baselines, and generalize across applications and operating systems.

AI-generated summary

A trajectory expansion framework called Anchor bootstraps scalable desktop supervision from seed demonstrations by identifying branch points and generating new trajectories through state-grounded task variants.
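The expansion loop described above (detect branch points, propose a state-grounded variant, roll out an agent, verify) can be sketched in outline. This is a minimal, hypothetical illustration only: every name here (`Step`, `is_branch_point`, `propose_variant`, `verify`, `expand`) is an illustrative stand-in, not the authors' implementation, and the agent rollout and verifier are replaced by trivial placeholders.

```python
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    state_changed: bool  # did this action cause a meaningful GUI state change?

def is_branch_point(step: Step) -> bool:
    # Per the paper's description, branch points are steps that correspond
    # to meaningful state changes.
    return step.state_changed

def propose_variant(prefix: list[Step], idx: int) -> str:
    # Stand-in for the proposer that generates a state-grounded task variant
    # conditioned on the GUI context at the branch point.
    return f"variant task branching after step {idx}"

def verify(trajectory: list[Step]) -> bool:
    # Stand-in for the state-aware verifier; here we only require that the
    # trajectory ends in a state change (a trivial completion proxy).
    return bool(trajectory) and trajectory[-1].state_changed

def expand(seed: list[Step]) -> list[tuple[str, list[Step]]]:
    """Expand one verified seed trajectory into new (task, trajectory) pairs."""
    corpus = []
    for i, step in enumerate(seed):
        if not is_branch_point(step):
            continue
        task = propose_variant(seed[: i + 1], i)
        # Stand-in for the executing agent: reuse the shared prefix and append
        # a new post-branch segment (a real system would roll out the agent
        # in the live environment, then apply step-level filtering).
        new_traj = seed[: i + 1] + [Step("new_action", state_changed=True)]
        if verify(new_traj):
            corpus.append((task, new_traj))
    return corpus

seed = [Step("open_app", True), Step("scroll", False), Step("save_file", True)]
expanded = expand(seed)
```

With this toy seed, only the two state-changing steps yield branch points, so the seed expands into two verified (task, trajectory) pairs; the non-state-changing `scroll` step is skipped.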

Community


Librarian Bot (Bot)

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution (https://huggingface.co/papers/2601.20380) (2026)
* ShowUI-Aloha: Human-Taught GUI Agent (https://huggingface.co/papers/2601.07181) (2026)
* Learning with Challenges: Adaptive Difficulty-Aware Data Generation for Mobile GUI Agent Training (https://huggingface.co/papers/2601.22781) (2026)
* MagicGUI-RMS: A Multi-Agent Reward Model System for Self-Evolving GUI Agents via Automated Feedback Reflux (https://huggingface.co/papers/2601.13060) (2026)
* MAI-UI Technical Report: Real-World Centric Foundation GUI Agents (https://huggingface.co/papers/2512.22047) (2025)
* OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models (https://huggingface.co/papers/2512.16295) (2025)
* Trajectory2Task: Training Robust Tool-Calling Agents with Synthesized Yet Verifiable Data for Complex User Intents (https://huggingface.co/papers/2601.20144) (2026)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

