Papers
arxiv:2502.07780

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Published on Feb 11, 2025 · Submitted by Tang on Feb 17, 2025
Authors: Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh

Abstract

AI-generated summary

A training-aware structured pruning method using evolutionary search achieves state-of-the-art performance in compressing large language models with reduced training data post-compression.

Large Language Models (LLMs) have achieved significant success across various NLP tasks. However, their massive computational costs limit their widespread use, particularly in real-time applications. Structured pruning offers an effective solution by compressing models and directly providing end-to-end speed improvements, regardless of the hardware environment. Meanwhile, different components of the model exhibit varying sensitivities towards pruning, calling for non-uniform model compression. However, a pruning method should not only identify a capable substructure, but also account for post-compression training. To this end, we propose DarwinLM, a method for training-aware structured pruning. DarwinLM builds upon an evolutionary search process, generating multiple offspring models in each generation through mutation, and selecting the fittest for survival. To assess the effect of post-training, we incorporate a lightweight, multistep training process within the offspring population, progressively increasing the number of tokens and eliminating poorly performing models in each selection stage. We validate our method through extensive experiments on Llama-2-7B, Llama-3.1-8B and Qwen-2.5-14B-Instruct, achieving state-of-the-art performance for structured pruning. For instance, DarwinLM surpasses ShearedLlama while requiring 5× less training data during post-compression training.
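As a rough illustration of the search loop described in the abstract, here is a minimal, self-contained Python sketch. It is not the authors' implementation (the real code and weights are at https://github.com/IST-DASLab/DarwinLM): the "model" here is just a per-layer sparsity assignment, and `mutate`, `fitness`, and the token schedule are toy stand-ins invented for illustration.

```python
import random

random.seed(0)

# Toy stand-ins: a "model" is a list of per-layer sparsity levels, and
# "fitness" is a made-up closed-form proxy for post-finetuning loss. The
# real method prunes attention/MLP structures and fine-tunes on text data;
# everything named below is hypothetical.

NUM_LAYERS = 8
LEVELS = [0.0, 0.25, 0.5, 0.75]  # discrete sparsity choices per layer


def mutate(model):
    """Offspring = parent with the sparsity levels of two random layers
    swapped, so the total amount of compression stays fixed."""
    child = list(model)
    i, j = random.sample(range(NUM_LAYERS), 2)
    child[i], child[j] = child[j], child[i]
    return child


def fitness(model, tokens_seen):
    """Toy proxy for loss after lightweight fine-tuning: deeper layers are
    (arbitrarily) more sensitive to pruning, and more training tokens
    recover part of the damage. Lower is better."""
    sensitivity = [1.0 + 0.2 * k for k in range(NUM_LAYERS)]
    damage = sum(s * level ** 2 for s, level in zip(sensitivity, model))
    return damage / (1.0 + tokens_seen / 1e5)


def darwin_search(parent, generations=10, offspring_per_gen=8,
                  token_schedule=(10_000, 50_000, 200_000)):
    """Evolutionary search with multistep, training-aware selection: each
    stage grants survivors a larger fine-tuning budget, then culls the
    worse half of the population."""
    for _ in range(generations):
        candidates = [mutate(parent) for _ in range(offspring_per_gen)]
        tokens_seen = 0
        for step in token_schedule:
            tokens_seen += step  # progressively more training tokens
            candidates.sort(key=lambda m: fitness(m, tokens_seen))
            candidates = candidates[: max(1, len(candidates) // 2)]
        if fitness(candidates[0], tokens_seen) <= fitness(parent, tokens_seen):
            parent = candidates[0]  # fittest offspring survives
    return parent


start = [random.choice(LEVELS) for _ in range(NUM_LAYERS)]
print(darwin_search(start))
```

The key idea this sketch tries to capture is the "training-aware" part: candidates are ranked by how well they perform after some post-compression training, not by their quality immediately after pruning, so the search favors substructures that respond well to fine-tuning.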

Community

Paper author · Paper submitter

First time submission.

cool paper! any reference code/implementation?

·
Paper author

Thank you for your interest in our work. We will release our code very soon! Will post an update here.

@Shengkun great work and a big congratulations on publishing your research paper! It's a really interesting paper. While waiting for the official release, I tried to implement the paper myself. It can be accessed here: https://github.com/llmsresearch/darwinlm

·
Paper author

Hi, our code and weights are all released at https://github.com/IST-DASLab/DarwinLM. Have a try and enjoy.

·
Paper author

Wow! Your implementation looks pretty good. You can also check our implementation and all weights: https://github.com/IST-DASLab/DarwinLM

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training (https://huggingface.co/papers/2502.03460) (2025)
* FASP: Fast and Accurate Structured Pruning of Large Language Models (https://huggingface.co/papers/2501.09412) (2025)
* ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning (https://huggingface.co/papers/2501.15316) (2025)
* Lightweight and Post-Training Structured Pruning for On-Device Large Language Models (https://huggingface.co/papers/2501.15255) (2025)
* Instruction-Following Pruning for Large Language Models (https://huggingface.co/papers/2501.02086) (2025)
* FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing (https://huggingface.co/papers/2501.14713) (2025)
* MultiPruner: Balanced Structure Removal in Foundation Models (https://huggingface.co/papers/2501.09949) (2025)

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 6

Browse 6 models citing this paper

Datasets citing this paper 0

No dataset links to this paper yet.

Cite arxiv.org/abs/2502.07780 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space links to this paper yet.

Cite arxiv.org/abs/2502.07780 in a Space README.md to link it from this page.

Collections including this paper 2