Paper page - WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Authors: Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yutao Zhu, Yongkang Wu, Ji-Rong Wen, Zhicheng Dou
Project page: https://foremost-beechnut-8ed.notion.site/WebThinker-Empowering-Large-Reasoning-Models-with-Deep-Research-Capability-d13158a27d924a4b9df7f9ab94066b64
arxiv:2504.21776

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Published on Apr 30, 2025 · Submitted by KABI on May 1, 2025

Abstract

WebThinker, an autonomous deep research agent, improves LRM performance on complex tasks by integrating a Deep Web Explorer for dynamic web information gathering and using RL-based training through Direct Preference Optimization.

AI-generated summary

Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders their ability to produce comprehensive research reports requiring synthesis of diverse web information. To address this, we propose WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate web pages, and draft research reports during the reasoning process. WebThinker integrates a Deep Web Explorer module, enabling LRMs to dynamically search, navigate, and extract information from the web when encountering knowledge gaps. It also employs an Autonomous Think-Search-and-Draft strategy, allowing the model to seamlessly interleave reasoning, information gathering, and report writing in real time. To further enhance research tool utilization, we introduce an RL-based training strategy via iterative online Direct Preference Optimization (DPO). Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems. Our approach enhances LRM reliability and applicability in complex scenarios, paving the way for more capable and versatile deep research systems. The code is available at https://github.com/RUC-NLPIR/WebThinker.
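The abstract mentions training via iterative online Direct Preference Optimization (DPO) but does not spell out the objective. As a rough illustration only, the sketch below computes the standard per-pair DPO loss from sequence log-probabilities. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example; whether WebThinker uses exactly this form, and how it builds preference pairs across iterations, is not stated on this page.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair (hypothetical helper).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # Implicit reward margin: how much more strongly the policy prefers the
    # chosen response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # Loss is -log(sigmoid(z)), computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


if __name__ == "__main__":
    # The policy already prefers the chosen response more than the reference does.
    print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
    # The policy prefers the rejected response, so the loss is larger.
    print(dpo_loss(-5.0, -1.0, -2.0, -2.0))
```

In an iterative online setup, pairs of this kind would presumably be re-sampled from the current policy between rounds; that outer loop is omitted here because the page gives no details about it.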

Community

Paper author Paper submitter
edited May 1, 2025

Introduction

We propose WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate web pages, and draft research reports during the reasoning process. WebThinker integrates a Deep Web Explorer module, enabling LRMs to dynamically search, navigate, and extract information from the web when encountering knowledge gaps. It also employs an Autonomous Think-Search-and-Draft strategy, allowing the model to seamlessly interleave reasoning, information gathering, and report writing in real time. To further enhance research tool utilization, we introduce an RL-based training strategy via iterative online Direct Preference Optimization (DPO). Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems. Our approach enhances LRM reliability and applicability in complex scenarios, paving the way for more capable and versatile deep research systems.
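To make the interleaving of reasoning, searching, and drafting more concrete, here is a minimal control-flow sketch. It is not the WebThinker implementation: `ResearchState`, `run_agent`, `toy_policy`, and `toy_search` are hypothetical names, the policy is a hand-written rule instead of an LRM, and the search is a dictionary lookup instead of real web access.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# An action is (kind, argument); kind is "search", "draft", or "finish".
Action = Tuple[str, str]


@dataclass
class ResearchState:
    question: str
    notes: List[str] = field(default_factory=list)   # evidence gathered so far
    report: List[str] = field(default_factory=list)  # drafted report sections


def run_agent(question: str,
              policy: Callable[[ResearchState], Action],
              search: Callable[[str], str],
              max_steps: int = 10) -> ResearchState:
    """Alternate between searching and drafting until the policy finishes."""
    state = ResearchState(question)
    for _ in range(max_steps):
        kind, arg = policy(state)
        if kind == "search":
            state.notes.append(search(arg))   # fill a knowledge gap
        elif kind == "draft":
            state.report.append(arg)          # write a report section
        elif kind == "finish":
            break
        else:
            raise ValueError(f"unknown action: {kind}")
    return state


def toy_search(query: str) -> str:
    # Stand-in for a web search tool (assumption: a fixed in-memory corpus).
    corpus = {"what is dpo": "DPO stands for Direct Preference Optimization."}
    return corpus.get(query.lower(), "no result")


def toy_policy(state: ResearchState) -> Action:
    # Stand-in for the reasoning model: search first, then draft, then stop.
    if not state.notes:
        return ("search", state.question)
    if not state.report:
        return ("draft", f"Finding: {state.notes[-1]}")
    return ("finish", "")


if __name__ == "__main__":
    final = run_agent("What is DPO", toy_policy, toy_search)
    print(final.report)
```

Keeping the policy and the search tool as plain callables is a design choice for this sketch: it lets the loop be exercised with stubs, while a real system would plug in a model call and a browsing tool.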

Our GitHub repo: https://github.com/RUC-NLPIR/WebThinker?tab=readme-ov-file

Demo: [media not preserved]

Main Result Overview: [image: image.png]

Our WebThinker Framework: [image: image.png]

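This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation (2025), https://huggingface.co/papers/2503.21729
* R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning (2025), https://huggingface.co/papers/2503.05592
* DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments (2025), https://huggingface.co/papers/2504.03160
* Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning (2025), https://huggingface.co/papers/2503.09516
* ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning (2025), https://huggingface.co/papers/2503.19470
* MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search (2025), https://huggingface.co/papers/2503.20757
* ReTool: Reinforcement Learning for Strategic Tool Use in LLMs (2025), https://huggingface.co/papers/2504.11536

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend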


find this to internet

What would be the best paid professions in the next 10 years?


Models citing this paper 4

Datasets citing this paper 1

Spaces citing this paper 0


Collections including this paper 17