
arxiv:2501.17088

Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models

Published on Jan 28, 2025

Abstract

Mamba-Shedder improves the efficiency of SSM-based models like Mamba by reducing redundant components without significantly affecting performance.

AI-generated summary

Large pre-trained models have achieved outstanding results in sequence modeling. The Transformer block and its attention mechanism have been the main drivers of the success of these models. Recently, alternative architectures, such as Selective Structured State Space Models (SSMs), have been proposed to address the inefficiencies of Transformers. This paper explores the compression of SSM-based models, particularly Mamba and its hybrids. We study the sensitivity of these models to the removal of selected components at different granularities to reduce the model size and computational overhead, thus improving their efficiency while maintaining accuracy. The proposed solutions, collectively referred to as Mamba-Shedder, achieve a speedup of up to 1.4x during inference, demonstrating that model efficiency can be improved by eliminating several redundancies with minimal impact on the overall model performance. The code is available at https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning.
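The abstract describes a sensitivity study: selected components are removed at different granularities and the impact on accuracy is measured to find redundancies that can be eliminated. Below is a minimal sketch of one such block-removal scan, assuming a PyTorch Mamba-style model that exposes its residual blocks as `model.backbone.layers` and a user-supplied `evaluate` function returning perplexity on a calibration set (both names are hypothetical illustrations, not the released Mamba-Shedder code).

```python
# Minimal sketch (not the authors' implementation) of a block-removal
# sensitivity scan: temporarily skip one block at a time and measure
# the change in a held-out metric such as perplexity.
import torch


class SkipBlock(torch.nn.Module):
    """Identity stand-in: passes hidden states through unchanged,
    emulating removal of a residual block without rebuilding the model."""
    def forward(self, hidden_states, *args, **kwargs):
        return hidden_states


def sensitivity_scan(model, evaluate):
    # `evaluate(model)` is assumed to return perplexity on a calibration set.
    baseline = evaluate(model)
    scores = []
    layers = model.backbone.layers  # hypothetical attribute path (ModuleList)
    for i in range(len(layers)):
        original = layers[i]
        layers[i] = SkipBlock()                          # remove block i
        scores.append((i, evaluate(model) - baseline))   # perplexity increase
        layers[i] = original                             # restore before next trial
    # Blocks whose removal barely changes perplexity are pruning candidates.
    return sorted(scores, key=lambda x: x[1])
```

Blocks ranked lowest by this scan could then be dropped permanently, trading a small accuracy change for reduced model size and faster inference, in the spirit of the speedups reported in the abstract.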

Community

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 3

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2501.17088 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2501.17088 in a Space README.md to link it from this page.

Collections including this paper 1