arxiv:2401.02415

LLaMA Pro: Progressive LLaMA with Block Expansion

Published on Jan 4, 2024 · Submitted by AK on Jan 5, 2024
#3 Paper of the day
Authors: Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ping Luo, Ying Shan

AI-generated summary

A new post-pretraining method using expanded Transformer blocks for Large Language Models improves knowledge without catastrophic forgetting, yielding LLaMA Pro-8.3B, which excels in general tasks, programming, and mathematics.

Abstract

Humans generally acquire new skills without compromising the old; however, the opposite holds for Large Language Models (LLMs), e.g., from LLaMA to CodeLLaMA. To this end, we propose a new post-pretraining method for LLMs with an expansion of Transformer blocks. We tune the expanded blocks using only the new corpus, efficiently and effectively improving the model's knowledge without catastrophic forgetting. In this paper, we experiment on a corpus of code and math, yielding LLaMA Pro-8.3B, a versatile foundation model initialized from LLaMA2-7B that excels in general tasks, programming, and mathematics. LLaMA Pro and its instruction-following counterpart (LLaMA Pro-Instruct) achieve advanced performance across various benchmarks, demonstrating superiority over existing open models in the LLaMA family and immense potential for reasoning and addressing diverse tasks as an intelligent agent. Our findings provide valuable insights into integrating natural and programming languages, laying a solid foundation for developing advanced language agents that operate effectively in various environments.
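
The abstract describes the method at a high level: copies of existing Transformer blocks are interleaved into the network, initialized so that they act as identity mappings (in the paper, by zeroing their output projections), and only these new blocks are tuned on the new corpus while the original weights stay frozen. As a rough illustration, not the authors' released code, the PyTorch sketch below shows what such an expansion could look like on the Hugging Face transformers LLaMA implementation; the expansion interval, the helper name expand_blocks, and other details are assumptions made for this sketch.

```python
# Illustrative sketch of block expansion, assuming the Hugging Face LLaMA implementation:
# decoder blocks live in model.model.layers, and each block's output projections are
# self_attn.o_proj and mlp.down_proj. The interval of 4 and the helper name are
# illustrative choices, not the authors' exact recipe.
import copy

import torch
from transformers import LlamaForCausalLM


def expand_blocks(model: LlamaForCausalLM, interval: int = 4) -> LlamaForCausalLM:
    """Insert one identity-initialized copy after every `interval` decoder blocks."""
    expanded, new_indices = [], []
    for i, layer in enumerate(model.model.layers):
        expanded.append(layer)
        if (i + 1) % interval == 0:
            new_layer = copy.deepcopy(layer)
            # Zero the output projections so the copied block contributes nothing at
            # initialization; with the residual connections it acts as an identity map.
            torch.nn.init.zeros_(new_layer.self_attn.o_proj.weight)
            torch.nn.init.zeros_(new_layer.mlp.down_proj.weight)
            new_indices.append(len(expanded))
            expanded.append(new_layer)

    model.model.layers = torch.nn.ModuleList(expanded)
    model.config.num_hidden_layers = len(expanded)
    # (Recent transformers versions may also require re-indexing each block's
    # self_attn.layer_idx so KV caching stays consistent during generation.)

    # Freeze the original network; only the inserted blocks are tuned on the new corpus.
    for p in model.parameters():
        p.requires_grad = False
    for idx in new_indices:
        for p in model.model.layers[idx].parameters():
            p.requires_grad = True
    return model


# Example: expand LLaMA2-7B (32 blocks) to 40 blocks before post-pretraining on code/math.
# model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
# model = expand_blocks(model, interval=4)
```

Because the inserted blocks start as identities, the expanded model reproduces the base model's outputs before any post-pretraining, which is what allows the new corpus to be absorbed without catastrophic forgetting of the original capabilities.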

Community

If the authors ever come on HuggingFace, consider adding arxiv:2308.06103 to the citation list.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

How LLaMA Pro Revolutionizes AI with Block Expansion

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix


Models citing this paper 83


Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2401.02415 in a dataset README.md to link it from this page.

Spaces citing this paper 2

Collections including this paper 18