Paper page - dVoting: Fast Voting for dLLMs
\n","updatedAt":"2026-02-14T01:41:09.508Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.6732098460197449},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[{"reaction":"๐","users":["FSCCS"],"count":1}],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2602.12153","authors":[{"_id":"698e9c44cace060ff123ae0f","user":{"_id":"67a4a26d5e65aa63c6d30e68","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/67a4a26d5e65aa63c6d30e68/GtodlJGw-_IL2DTXQTucz.jpeg","isPro":false,"fullname":"Sicheng Feng","user":"FSCCS","type":"user"},"name":"Sicheng Feng","status":"claimed_verified","statusLastChangedAt":"2026-02-13T09:36:23.653Z","hidden":false},{"_id":"698e9c44cace060ff123ae10","name":"Zigeng Chen","hidden":false},{"_id":"698e9c44cace060ff123ae11","name":"Xinyin Ma","hidden":false},{"_id":"698e9c44cace060ff123ae12","name":"Gongfan Fang","hidden":false},{"_id":"698e9c44cace060ff123ae13","name":"Xinchao Wang","hidden":false}],"publishedAt":"2026-02-12T16:35:05.000Z","submittedOnDailyAt":"2026-02-13T01:11:17.731Z","title":"dVoting: Fast Voting for dLLMs","submittedOnDailyBy":{"_id":"67a4a26d5e65aa63c6d30e68","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/67a4a26d5e65aa63c6d30e68/GtodlJGw-_IL2DTXQTucz.jpeg","isPro":false,"fullname":"Sicheng Feng","user":"FSCCS","type":"user"},"summary":"Diffusion Large Language Models (dLLMs) represent a new paradigm beyond autoregressive modeling, offering competitive performance while naturally enabling a flexible decoding process. Specifically, dLLMs can generate tokens at arbitrary positions in parallel, endowing them with significant potential for parallel test-time scaling, which was previously constrained by severe inefficiency in autoregressive modeling. In this work, we introduce dVoting, a fast voting technique that boosts reasoning capability without training, with only an acceptable extra computational overhead. dVoting is motivated by the observation that, across multiple samples for the same prompt, token predictions remain largely consistent, whereas performance is determined by a small subset of tokens exhibiting cross-sample variability. Leveraging the arbitrary-position generation capability of dLLMs, dVoting performs iterative refinement by sampling, identifying uncertain tokens via consistency analysis, regenerating them through voting, and repeating this process until convergence. Extensive evaluations demonstrate that dVoting consistently improves performance across various benchmarks. It achieves gains of 6.22%-7.66% on GSM8K, 4.40%-7.20% on MATH500, 3.16%-14.84% on ARC-C, and 4.83%-5.74% on MMLU. 
Our code is available at https://github.com/fscdc/dVoting","upvotes":20,"discussionId":"698e9c44cace060ff123ae14","projectPage":"https://fscdc.github.io/dVoting/","githubRepo":"https://github.com/fscdc/dVoting","githubRepoAddedBy":"user","ai_summary":"Diffusion large language models enable parallel token generation and efficient reasoning enhancement through a voting technique that identifies and refines uncertain predictions across multiple samples.","ai_keywords":["diffusion large language models","autoregressive modeling","parallel test-time scaling","token predictions","iterative refinement","consistency analysis","voting technique"],"githubStars":23,"organization":{"_id":"6508ab2b349930913196378b","name":"NationalUniversityofSingapore","fullname":"National University of Singapore","avatar":"https://cdn-uploads.huggingface.co/production/uploads/630ca0817dacb93b33506ce7/ZYUmpSMsa5Whihw3me2Bw.png"}},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"67a4a26d5e65aa63c6d30e68","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/67a4a26d5e65aa63c6d30e68/GtodlJGw-_IL2DTXQTucz.jpeg","isPro":false,"fullname":"Sicheng Feng","user":"FSCCS","type":"user"},{"_id":"640ebdfefdeaae139086f4d8","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/640ebdfefdeaae139086f4d8/2N94gbHubplYD8njmUTPf.jpeg","isPro":true,"fullname":"Zhenxiong Tan","user":"Yuanshi","type":"user"},{"_id":"634cfebc350bcee9bed20a4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/634cfebc350bcee9bed20a4d/fN47nN5rhw-HJaFLBZWQy.png","isPro":false,"fullname":"Xingyi Yang","user":"adamdad","type":"user"},{"_id":"646a1939c37ca1e12308fe81","avatarUrl":"/avatars/752e9d86018e7d33ad8bcd741203fd86.svg","isPro":false,"fullname":"Gongfan Fang","user":"Vinnnf","type":"user"},{"_id":"663a304eda13660b8474f524","avatarUrl":"/avatars/4e0866c15b7b8e584bd18c9a2ee08d27.svg","isPro":false,"fullname":"XuanCheng","user":"XuanCheg","type":"user"},{"_id":"689cb792f522165a63e55e4f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/689cb792f522165a63e55e4f/LIQv_bkx7rqZLax8CAuyV.jpeg","isPro":false,"fullname":"Haiquan Lu","user":"haiquanlu","type":"user"},{"_id":"655469586bc4180700cf7a34","avatarUrl":"/avatars/252392d0c45783d8f149feac7a6215ec.svg","isPro":false,"fullname":"Kejia Zhang","user":"KejiaRobust","type":"user"},{"_id":"668e740f1173ab43d9d9ed5e","avatarUrl":"/avatars/caa9b47c2a5f6d6d679759b8b234a0ab.svg","isPro":false,"fullname":"Zeqing Wang","user":"INV-WZQ","type":"user"},{"_id":"66dbea44946bce6c94afac80","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/66dbea44946bce6c94afac80/MWL4AJEqs8XUVyEAX3QqN.png","isPro":false,"fullname":"Haolei Bai","user":"DeadlyKitt3n","type":"user"},{"_id":"5df833bdda6d0311fd3d5403","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/5df833bdda6d0311fd3d5403/62OtGJEQXdOuhV9yCd4HS.png","isPro":false,"fullname":"Weihao Yu","user":"whyu","type":"user"},{"_id":"66aa39349238d9c3a1c7f9dc","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/66aa39349238d9c3a1c7f9dc/mj6r7uxEYXM502x296UMf.jpeg","isPro":false,"fullname":"Xin Jin","user":"Xin1118","type":"user"},{"_id":"66966286ad7167254c4bb5d6","avatarUrl":"/avatars/1a3136918a74d7ce778dcee0ca93c411.svg","isPro":false,"fullname":"Kele 
Shao","user":"cokeshao","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0,"organization":{"_id":"6508ab2b349930913196378b","name":"NationalUniversityofSingapore","fullname":"National University of Singapore","avatar":"https://cdn-uploads.huggingface.co/production/uploads/630ca0817dacb93b33506ce7/ZYUmpSMsa5Whihw3me2Bw.png"}}">
AI-generated summary
Diffusion large language models enable parallel token generation and efficient reasoning enhancement through a voting technique that identifies and refines uncertain predictions across multiple samples.

Abstract
Diffusion Large Language Models (dLLMs) represent a new paradigm beyond autoregressive modeling, offering competitive performance while naturally enabling a flexible decoding process. Specifically, dLLMs can generate tokens at arbitrary positions in parallel, endowing them with significant potential for parallel test-time scaling, which was previously constrained by the severe inefficiency of autoregressive modeling. In this work, we introduce dVoting, a fast voting technique that boosts reasoning capability without training, at only a modest extra computational cost. dVoting is motivated by the observation that, across multiple samples for the same prompt, token predictions remain largely consistent, whereas performance is determined by a small subset of tokens exhibiting cross-sample variability. Leveraging the arbitrary-position generation capability of dLLMs, dVoting performs iterative refinement: it samples multiple completions, identifies uncertain tokens via consistency analysis, regenerates them through voting, and repeats this process until convergence. Extensive evaluations demonstrate that dVoting consistently improves performance across various benchmarks. It achieves gains of 6.22%-7.66% on GSM8K, 4.40%-7.20% on MATH500, 3.16%-14.84% on ARC-C, and 4.83%-5.74% on MMLU. Our code is available at https://github.com/fscdc/dVoting
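To make the sample-vote-regenerate loop described in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical `model.sample_with_constraints(prompt, fixed)` interface standing in for a dLLM that can keep already-agreed tokens fixed and regenerate only the masked positions; the generation length, sample count, and convergence criterion (unanimous agreement) are illustrative assumptions.

```python
from collections import Counter

MASK = None  # sentinel for an "uncertain" position that should be regenerated


def dvoting(model, prompt, gen_len, num_samples=4, max_rounds=5):
    """Hypothetical sketch of the dVoting loop: sample, vote, remask, repeat."""
    fixed = [MASK] * gen_len  # agreed tokens so far; MASK entries are still uncertain

    for _ in range(max_rounds):
        # 1. Draw several samples in parallel, regenerating only the masked positions.
        #    `sample_with_constraints` is assumed to return `gen_len` tokens while
        #    leaving the non-MASK entries of `fixed` unchanged.
        samples = [model.sample_with_constraints(prompt, fixed)
                   for _ in range(num_samples)]

        # 2. Consistency analysis: per-position majority vote across samples.
        uncertain = 0
        for pos in range(gen_len):
            votes = Counter(sample[pos] for sample in samples)
            token, count = votes.most_common(1)[0]
            if count == num_samples:   # unanimous across samples -> accept and freeze
                fixed[pos] = token
            else:                      # cross-sample variability -> remask for next round
                fixed[pos] = MASK
                uncertain += 1

        # 3. Converged once every position is agreed upon.
        if uncertain == 0:
            return fixed

    # Fallback after max_rounds: fill remaining masks with the last round's majority vote.
    for pos in range(gen_len):
        if fixed[pos] is MASK:
            fixed[pos] = Counter(sample[pos] for sample in samples).most_common(1)[0][0]
    return fixed
```

Because most positions agree after the first round, later rounds only resample the small uncertain subset, which is where the method's speed advantage over naive repeated full-sequence sampling comes from.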