arxiv:2502.04411

Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing

Published on Feb 6, 2025 · Submitted by Xiang Liu on Feb 13, 2025
Authors: Kunfeng Lai, Zhenheng Tang, Xinglin Pan, Peijie Dong, Xiang Liu, Haolan Chen, Li Shen, Bo Li, Xiaowen Chu

AI-generated summary

A method for merging Large Language Models leverages layer-wise averaging and task-level expert routing to improve performance and reduce system cost.

Abstract

Model merging aggregates Large Language Models (LLMs) fine-tuned on different tasks into a stronger one. However, parameter conflicts between models lead to performance degradation when averaging. While model routing addresses this issue by selecting individual models during inference, it imposes excessive storage and compute costs and fails to leverage the common knowledge shared across models. In this work, we observe that different layers exhibit varying levels of parameter conflict. Building on this insight, we average layers with minimal parameter conflicts and use a novel task-level expert routing for layers with significant conflicts. To further reduce storage costs, inspired by task arithmetic sparsity, we decouple multiple fine-tuned experts into a dense expert and several sparse experts. To handle out-of-distribution samples, we select and merge appropriate experts based on the task uncertainty of the input data. We conduct extensive experiments on both LLaMA and Qwen at varying parameter scales and evaluate on real-world reasoning tasks. Results demonstrate that our method consistently achieves significant performance improvements while requiring lower system cost than existing methods.
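The abstract outlines four mechanisms: per-layer conflict estimation, a choice between averaging and task-level expert routing, decomposition of each expert into a shared dense part plus a sparse task vector, and uncertainty-weighted expert combination at inference. The sketch below shows one way these pieces could fit together; the sign-disagreement conflict metric, the threshold, the sparsity ratio, and all function names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the paper's code) of the ideas described in the abstract.
# Inputs are state_dict-like mappings from layer name to weight tensor.
import torch

def layer_conflict(base_layer, expert_layers):
    """Proxy for parameter conflict in one layer: fraction of weights whose
    task-vector signs disagree across experts (an assumed metric)."""
    deltas = [e - base_layer for e in expert_layers]          # task vectors per expert
    signs = torch.stack([torch.sign(d) for d in deltas])      # [num_experts, *shape]
    agree = (signs == signs[0]).all(dim=0)                    # weights where all experts agree
    return 1.0 - agree.float().mean().item()

def sparsify(task_vector, keep_ratio=0.1):
    """Task-arithmetic-style sparsity: keep only the top fraction of entries by magnitude."""
    k = max(1, int(task_vector.numel() * keep_ratio))
    flat = task_vector.abs().flatten()
    threshold = flat.kthvalue(flat.numel() - k + 1).values    # k-th largest magnitude
    return torch.where(task_vector.abs() >= threshold, task_vector,
                       torch.zeros_like(task_vector))

def build_merged_model(base, experts, conflict_threshold=0.5, keep_ratio=0.1):
    """Average low-conflict layers; for high-conflict layers, store one shared dense
    copy plus sparse per-expert deltas to be routed at inference time."""
    merged, routed = {}, {}
    for name, base_layer in base.items():
        expert_layers = [e[name] for e in experts]
        if layer_conflict(base_layer, expert_layers) < conflict_threshold:
            merged[name] = torch.stack(expert_layers).mean(dim=0)   # low conflict: average
        else:
            merged[name] = base_layer                                # dense part shared by all tasks
            routed[name] = [sparsify(e - base_layer, keep_ratio) for e in expert_layers]
    return merged, routed

def apply_routing(name, merged, routed, task_weights):
    """Combine the shared dense layer with sparse expert deltas, weighted by per-task
    probabilities (e.g. derived from the input's estimated task uncertainty)."""
    layer = merged[name]
    for w, delta in zip(task_weights, routed.get(name, [])):
        layer = layer + w * delta
    return layer
```

In this sketch, the storage savings the abstract claims would come from the high-conflict layers: instead of one dense copy per expert, only a single dense layer plus sparse deltas is kept, and the uncertainty-derived `task_weights` decide how much each expert's delta contributes for a given input.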

Community


This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2502.04411 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2502.04411 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2502.04411 in a Space README.md to link it from this page.

Collections including this paper 2