Papers
arxiv:2601.13697

Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning

Published on Jan 20 · Submitted by Litu Ou on Jan 21
Authors: Zhihang Yuan, Chengyu Yue, Long Huang, Litu Ou, Lei Shi

Abstract

GRADFILTERING is an uncertainty-aware data selection framework for instruction tuning that uses gradient signal-to-noise ratio to improve LLM adaptation efficiency and performance.

AI-generated summary

Instruction tuning is a standard paradigm for adapting large language models (LLMs), but modern instruction datasets are large, noisy, and redundant, making full-data fine-tuning costly and often unnecessary. Existing data selection methods either build expensive gradient datastores or assign static scores from a weak proxy, largely ignoring evolving uncertainty, and thus missing a key source of LLM interpretability. We propose GRADFILTERING, an objective-agnostic, uncertainty-aware data selection framework that utilizes a small GPT-2 proxy with a LoRA ensemble and aggregates per-example gradients into a Gradient Signal-to-Noise Ratio (G-SNR) utility. Our method matches or surpasses random subsets and strong baselines in most LLM-as-a-judge evaluations as well as in human assessment. Moreover, GRADFILTERING-selected subsets converge faster than competitive filters under the same compute budget, reflecting the benefit of uncertainty-aware scoring.
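
The abstract describes the core mechanism only at a high level: per-example gradients from a LoRA ensemble on a small GPT-2 proxy are aggregated into a Gradient Signal-to-Noise Ratio (G-SNR) utility that drives subset selection. As a rough illustration of that idea (not the paper's reference implementation), the sketch below assumes per-example gradients have already been extracted for each of K ensemble members, treats G-SNR as the per-parameter |mean| / std of those gradients averaged into a scalar, and keeps the highest-scoring fraction; the helper names g_snr and select_top_fraction, the ensemble size, and the gradient dimensionality are all hypothetical.

```python
# Illustrative sketch of a G-SNR-style utility for data selection.
# Assumption: all_grads[i] holds the (K, D) gradients of example i's loss
# w.r.t. the D (projected) LoRA parameters, one row per ensemble member.
import numpy as np


def g_snr(grads: np.ndarray, eps: float = 1e-8) -> float:
    """Scalar utility for one example: per-parameter |mean| / std across
    the K ensemble members, averaged over the D parameters."""
    signal = np.abs(grads.mean(axis=0))  # ensemble agreement on each coordinate
    noise = grads.std(axis=0) + eps      # spread across members = gradient uncertainty
    return float((signal / noise).mean())


def select_top_fraction(all_grads: np.ndarray, frac: float = 0.1) -> np.ndarray:
    """Return indices of the top-`frac` examples ranked by G-SNR."""
    scores = np.array([g_snr(g) for g in all_grads])
    k = max(1, int(frac * len(scores)))
    return np.argsort(scores)[::-1][:k]


# Toy usage: 1,000 examples, 4 LoRA ensemble members, 64-dim projected gradients.
rng = np.random.default_rng(0)
all_grads = rng.normal(size=(1000, 4, 64))
print(select_top_fraction(all_grads, frac=0.05)[:10])
```

Under this reading, an example scores highly when the ensemble members agree on its gradient direction relative to their spread; how the paper actually projects gradients, normalizes the ratio, and orients the selection follows its own definitions.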

Community


This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2601.13697 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2601.13697 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2601.13697 in a Space README.md to link it from this page.

Collections including this paper 1