Paper page - A Tale of Tails: Model Collapse as a Change of Scaling Laws
Authors: Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe
Published: February 10, 2024 · arXiv:2402.07043
AI-generated summary
Theoretical analysis and experiments explore how the integration of synthetic data into training datasets affects neural scaling laws and model performance.

Abstract
As AI model size grows, neural scaling laws have become a crucial tool to
predict the improvements of large models when increasing capacity and the size
of original (human or natural) training data. Yet, the widespread use of
popular models means that the ecosystem of online data and text will co-evolve
to progressively contain increased amounts of synthesized data. In this paper
we ask: How will the scaling laws change in the inevitable regime where
synthetic data makes its way into the training corpus? Will future models
still improve, or be doomed to degenerate up to total (model) collapse? We
develop a theoretical framework of model collapse through the lens of scaling
laws. We discover a wide range of decay phenomena, analyzing loss of scaling,
shifted scaling with number of generations, the "un-learning" of skills, and
grokking when mixing human and synthesized data. Our theory is validated by
large-scale experiments with a transformer on an arithmetic task and text
generation using the large language model Llama2.
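The abstract's central intuition — that synthetic data clips the tail of the data distribution, turning a power-law improvement into a loss plateau — can be illustrated with a toy simulation. This is a minimal sketch under assumed simplifications, not the paper's actual model: it assumes a Zipf-distributed token vocabulary and models "synthetic" training data as the same distribution with its tail truncated at an assumed rank `cutoff`, so that mass beyond the cutoff can never be learned.

```python
def zipf_probs(vocab, beta):
    """Zipf distribution over `vocab` ranks with exponent `beta` (assumed setup)."""
    w = [i ** (-beta) for i in range(1, vocab + 1)]
    s = sum(w)
    return [x / s for x in w]

def expected_loss(probs, n, cutoff=None):
    """Expected probability mass of tokens never seen in n i.i.d. samples.

    A `cutoff` rank models synthetic training data whose tail beyond that
    rank is missing: that mass is an irreducible floor, however large n is.
    """
    loss = 0.0
    for i, p in enumerate(probs):
        if cutoff is not None and i >= cutoff:
            loss += p                    # tail is absent from training data
        else:
            loss += p * (1 - p) ** n     # chance token i goes unseen in n draws
    return loss

probs = zipf_probs(1000, 1.5)

# Clean data: loss keeps falling as the sample size n grows.
clean = [expected_loss(probs, n) for n in (10**2, 10**3, 10**4, 10**5)]

# Tail-truncated ("synthetic") data: loss bottoms out at the lost tail mass.
floor = sum(probs[100:])
truncated = [expected_loss(probs, n, cutoff=100) for n in (10**2, 10**3, 10**4, 10**5)]
```

With clean data the loss sequence decreases monotonically in n, while the truncated sequence approaches but never drops below `floor` — a rough picture of scaling that stalls once the tail is gone.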