Paper page - Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

arxiv:2511.06209

Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

Published on Nov 9, 2025 · Submitted by Tianyi Wu on Nov 11, 2025
Authors: Jingwei Ni, Ekaterina Fadeeva, Tianyi Wu, Mubashara Akhtar, Jiaheng Zhang, Elliott Ash, Markus Leippold, Timothy Baldwin, See-Kiong Ng, Artem Shelmanov, Mrinmaya Sachan

Abstract

AI-generated summary

Transformer-based uncertainty quantification heads improve step-level reasoning verification in LLMs by estimating uncertainty from internal states, offering a lightweight and scalable alternative to existing methods.

Solving complex tasks usually requires LLMs to generate long multi-step reasoning chains. Previous work has shown that verifying the correctness of individual reasoning steps can further improve the performance and efficiency of LLMs on such tasks and enhance solution interpretability. However, existing verification approaches, such as Process Reward Models (PRMs), are either computationally expensive, limited to specific domains, or require large-scale human or model-generated annotations. Thus, we propose a lightweight alternative for step-level reasoning verification based on data-driven uncertainty scores. We train transformer-based uncertainty quantification heads (UHeads) that use the internal states of a frozen LLM to estimate the uncertainty of its reasoning steps during generation. The approach is fully automatic: target labels are generated either by another larger LLM (e.g., DeepSeek R1) or in a self-supervised manner by the original model itself. UHeads are both effective and lightweight, containing less than 10M parameters. Across multiple domains, including mathematics, planning, and general knowledge question answering, they match or even surpass the performance of PRMs that are up to 810x larger. Our findings suggest that the internal states of LLMs encode their uncertainty and can serve as reliable signals for reasoning verification, offering a promising direction toward scalable and generalizable introspective LLMs.
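To make the architecture concrete, below is a minimal sketch of what a lightweight uncertainty head over frozen LLM hidden states could look like, based only on the abstract's description (a small transformer that reads the internal states for one reasoning step and outputs a step-level uncertainty score). The layer choices, pooling, and dimensions are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical UHead sketch: a small transformer over the frozen LLM's hidden
# states for the tokens of one reasoning step, producing a scalar uncertainty logit.
import torch
import torch.nn as nn

class UHead(nn.Module):
    def __init__(self, llm_hidden_size: int = 4096, d_model: int = 256,
                 n_layers: int = 2, n_heads: int = 4):
        super().__init__()
        # Project the large LLM hidden size down to a small working dimension.
        self.proj = nn.Linear(llm_hidden_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.scorer = nn.Linear(d_model, 1)  # one uncertainty logit per step

    def forward(self, step_hidden_states: torch.Tensor,
                attention_mask: torch.Tensor) -> torch.Tensor:
        # step_hidden_states: (batch, step_len, llm_hidden_size), taken from a
        # frozen LLM layer for the tokens of one reasoning step.
        # attention_mask: (batch, step_len), 1 for real tokens, 0 for padding.
        x = self.proj(step_hidden_states)
        x = self.encoder(x, src_key_padding_mask=~attention_mask.bool())
        # Mean-pool over non-padding tokens, then score the step.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.scorer(pooled).squeeze(-1)  # higher logit = more uncertain
```

With these (assumed) settings the head has only a few million parameters, which is consistent with the abstract's sub-10M-parameter claim; training would plausibly use a binary objective against automatically generated step-correctness labels, as the abstract describes.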

Community

Paper submitter

Solving complex tasks usually requires LLMs to generate long multi-step reasoning chains. Previous work has shown that verifying the correctness of individual reasoning steps can further improve the performance and efficiency of LLMs on such tasks and enhance solution interpretability. However, existing verification approaches, such as Process Reward Models (PRMs), are either computationally expensive, limited to specific domains, or require large-scale human or model-generated annotations. Thus, we propose a lightweight alternative for step-level reasoning verification based on data-driven uncertainty estimation. We train transformer-based uncertainty quantification heads (UHeads) that use the internal states of the frozen LLM to estimate the uncertainty of its reasoning steps during generation. The approach is fully automatic: target labels are generated either by another larger LLM (e.g., DeepSeek R1) or in a self-supervised manner by the original model itself. UHeads are both effective and lightweight, containing less than 10M parameters. Across multiple domains, including mathematics, planning, and general knowledge question answering, they match or even exceed the performance of PRMs that are up to 810× larger. Our findings suggest that the internal states of LLMs encode their uncertainty and can serve as reliable signals for reasoning verification, offering a promising direction towards scalable and generalizable introspective LLMs.
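As a rough illustration of how such step-level scores could be used at generation time, here is one plausible verification loop: sample several candidate continuations of the reasoning chain and keep the step the uncertainty head trusts most. This is a usage sketch, not the paper's method; `generate_candidate_steps` and `uhead_score` are hypothetical helpers standing in for the generator and the trained head.

```python
# Hypothetical best-of-N step selection guided by step-level uncertainty scores.
from typing import Callable, List

def select_next_step(context: str,
                     generate_candidate_steps: Callable[[str, int], List[str]],
                     uhead_score: Callable[[str, str], float],
                     n_candidates: int = 4) -> str:
    """Sample several candidate reasoning steps for the current context and
    return the one with the lowest estimated uncertainty."""
    candidates = generate_candidate_steps(context, n_candidates)
    # Lower UHead score = lower estimated uncertainty = more trusted step.
    return min(candidates, key=lambda step: uhead_score(context, step))
```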

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2511.06209 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2511.06209 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2511.06209 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.