Paper page - Stronger Models are NOT Stronger Teachers for Instruction Tuning
arxiv:2411.07133

Stronger Models are NOT Stronger Teachers for Instruction Tuning

Published on Nov 11, 2024 · Submitted by Bill Yuchen Lin on Nov 13, 2024 · #1 Paper of the day
Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran

Abstract

AI-generated summary: Larger models are not always better teachers for fine-tuning smaller models, a phenomenon the paper calls the Larger Models' Paradox. The paper introduces a new metric, Compatibility-Adjusted Reward (CAR), to measure the effectiveness of response generators.

Instruction tuning has been widely adopted to ensure large language models (LLMs) follow user instructions effectively. The resulting instruction-following capabilities of LLMs rely heavily on the instruction datasets used for tuning. Recently, synthetic instruction datasets have emerged as an economically viable way to provide LLMs with diverse and high-quality instructions. However, existing approaches typically assume that larger or stronger models are stronger teachers for instruction tuning, and hence simply adopt these models as response generators for the synthetic instructions. In this paper, we challenge this commonly adopted assumption. Our extensive experiments across five base models and twenty response generators reveal that larger and stronger models are not necessarily stronger teachers of smaller models. We refer to this phenomenon as the Larger Models' Paradox. We observe that existing metrics cannot precisely predict the effectiveness of response generators, since they ignore the compatibility between teachers and the base models being fine-tuned. We thus develop a novel metric, named Compatibility-Adjusted Reward (CAR), to measure the effectiveness of response generators. Our experiments across five base models demonstrate that CAR outperforms almost all baselines.
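
The abstract does not state the CAR formula, so the following is only a minimal sketch of the general idea: a response generator's reward score is discounted when its responses are a poor fit for the base model being fine-tuned. The functional form `reward / (1 + beta * loss)`, the use of base-model loss as the compatibility signal, the `beta` parameter, and every name and number below are assumptions made for illustration, not the paper's definition or results.

```python
from dataclasses import dataclass


@dataclass
class GeneratorStats:
    """Hypothetical summary statistics for one response generator (teacher)."""
    name: str
    mean_reward: float  # assumed: average reward-model score of its responses
    mean_loss: float    # assumed: average loss of the base model on its responses


def compatibility_adjusted_score(stats: GeneratorStats, beta: float = 1.0) -> float:
    """Discount reward by how hard the responses are for the base model to fit.

    Illustrative form only: reward / (1 + beta * loss). A higher loss means
    lower assumed compatibility, so the adjusted score shrinks.
    """
    if stats.mean_loss < 0 or beta < 0:
        raise ValueError("mean_loss and beta must be non-negative")
    return stats.mean_reward / (1.0 + beta * stats.mean_loss)


def rank_generators(candidates: list[GeneratorStats], beta: float = 1.0) -> list[GeneratorStats]:
    """Return candidates sorted from highest to lowest adjusted score."""
    return sorted(
        candidates,
        key=lambda s: compatibility_adjusted_score(s, beta),
        reverse=True,
    )


# Made-up numbers: the larger teacher has the higher raw reward, but its
# responses are harder for the (hypothetical) base model to fit.
large_teacher = GeneratorStats("large-teacher", mean_reward=0.90, mean_loss=1.0)
small_teacher = GeneratorStats("small-teacher", mean_reward=0.80, mean_loss=0.5)

ranking = rank_generators([large_teacher, small_teacher])
print([s.name for s in ranking])
```

With these invented inputs the smaller teacher ranks first once compatibility is taken into account, even though its raw reward is lower, which is the kind of reversal the Larger Models' Paradox describes.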

