Paper page - Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Comment by librarian-bot (2025-01-01):

This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* [Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS](https://huggingface.co/papers/2411.18478) (2024)
* [O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?](https://huggingface.co/papers/2411.16489) (2024)
* [Hint Marginalization for Improved Reasoning in Large Language Models](https://huggingface.co/papers/2412.13292) (2024)
* [AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning](https://huggingface.co/papers/2411.11930) (2024)
* [Patience Is The Key to Large Language Model Reasoning](https://huggingface.co/papers/2411.13082) (2024)
* [Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning](https://huggingface.co/papers/2412.09078) (2024)
* [C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness](https://huggingface.co/papers/2412.11664) (2024)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out [this Space](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers).

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Published on 2024-12-30 · Submitted to Daily Papers on 2024-12-31 by AK · 40 upvotes

Authors: Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
AI-generated summary

The paper addresses overthinking in models like OpenAI o1 by introducing efficiency metrics and strategies to reduce computational overhead without sacrificing accuracy.
The remarkable performance of models like OpenAI o1 can be attributed to their ability to emulate human-like long-time thinking during inference. These models employ extended chain-of-thought (CoT) processes, exploring multiple strategies to enhance their problem-solving capabilities. However, a critical question remains: how can computational resources be scaled intelligently and efficiently at test time? This paper presents the first comprehensive study of the prevalent issue of overthinking in these models, where excessive computational resources are allocated to simple problems with minimal benefit. We introduce novel efficiency metrics from both outcome and process perspectives to evaluate the rational use of computational resources by o1-like models. Using a self-training paradigm, we propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy. Experimental results show that our approach successfully reduces computational overhead while preserving model performance across test sets of varying difficulty, such as GSM8K, MATH500, GPQA, and AIME.
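To make the "outcome perspective" concrete, here is a minimal sketch of one plausible outcome-efficiency metric in the spirit the abstract describes: the fraction of generated tokens a model actually needed to reach its first correct answer, averaged over responses. The `responses` schema and the function name are hypothetical illustrations, not the paper's exact formulation.

```python
def outcome_efficiency(responses):
    """Average fraction of generated tokens needed to reach the first
    correct answer.

    Each response is a dict with:
      - "tokens_total": tokens in the full response
      - "tokens_to_first_correct": tokens generated up to the first
        correct solution, or None if no correct answer is produced

    Hypothetical schema for illustration only.
    """
    scores = []
    for r in responses:
        if r["tokens_to_first_correct"] is None:
            # An incorrect response gets zero credit: no amount of
            # extra "thinking" paid off.
            scores.append(0.0)
        else:
            scores.append(r["tokens_to_first_correct"] / r["tokens_total"])
    return sum(scores) / len(scores)


# A response that answers "2+3=?" correctly within 40 of 400 generated
# tokens is only 10% outcome-efficient -- the remaining 90% of the
# tokens re-derive an answer the model already had (overthinking).
data = [
    {"tokens_total": 400, "tokens_to_first_correct": 40},
    {"tokens_total": 120, "tokens_to_first_correct": 120},
    {"tokens_total": 300, "tokens_to_first_correct": None},
]
print(round(outcome_efficiency(data), 4))
```

Under this toy definition, a model that answers early and then stops scores near 1.0, while one that pads easy problems with redundant solution attempts scores low, which is the behavior the paper's mitigation strategies aim to reduce.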