Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Proceedings of Machine Learning Research | Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop Held in Vancouver, British Columbia, Canada on 14 December 2024 Published as Volume 262 by the Proceedings of Machine Learning Research on 10 December 2024. Volume Edited by: Mehdi Rezagholizadeh Peyman Passban Soheila Samiee Vahid Partovi Nia Yu Cheng Yue Deng Qun Liu Boxing Chen Series Editors: Neil D. Lawrence
[go: Go Back, main page]

[edit]

Volume 262: NeurIPS Efficient Natural Language and Speech Processing Workshop, 14 December 2024, Vancouver, British Columbia, Canada

[edit]

Editors: Mehdi Rezagholizadeh, Peyman Passban, Soheila Samiee, Vahid Partovi Nia, Yu Cheng, Yue Deng, Qun Liu, Boxing Chen

[bib][citeproc]

Training

Scaling Smart: Accelerating Large Language Model Pre-Training with Small Model Initialization

Mohammad Samragh, Seyed Iman Mirzadeh, Keivan Alizadeh-Vahid, Fartash Faghri, Minsik Cho, Moin Nabi, Devang Naik, Mehrdad Farajtabar; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:1-13

Computational Bottlenecks of Training Small-scale Large Language Models

Saleh Ashkboos, Seyed Iman Mirzadeh, Keivan Alizadeh-Vahid, Mohammad Hossein Sekhavat, Moin Nabi, Mehrdad Farajtabar, Fartash Faghri; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:14-21

QuAILoRA: Quantization-Aware Initialization for LoRA

Neal G Lawton, Aishwarya Padmakumar, Judith Gaspers, Jack FitzGerald, Anoop Kumar, Greg Ver Steeg, Aram Galstyan; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:22-33

SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings

Mohammad Ali Sadraei Javaheri, Ehsaneddin Asgari, Alice C. McHardy, Hamid R. Rabiee; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:34-46

RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection

Ali Saheb Pasand, Pouya Bashivan; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:47-54

Efficient Alignment of Large Language Models via Data Sampling

Amrit Khera, Rajat Ghosh, Debojyoti Dutta; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:55-72

KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation

Rambod Azimi, Rishav Rishav, Marek Teichmann, Samira Ebrahimi Kahou; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:73-80

Model Design \& Architecture

Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts

Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Therien, Sambit Sahu, Stephen Rawls, Supriyo Chakraborty, Tom Goldstein; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:81-101

VL-Mamba: Exploring State Space Models for Multimodal Learning

Yanyuan Qiao, Zheng Yu, Zijia Zhao, Sihan Chen, Mingzhen Sun, Longteng Guo, Qi Wu, Jing Liu; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:102-113

MisD-MoE: A Multimodal Misinformation Detection Framework with Adaptive Feature Selection

Moyang Liu, Kaiying Yan, Yukun Liu, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:114-122

Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Vicky Zayats, Peter Chen, Melissa Ferrari, Dirk Padfield; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:123-135

Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?

Habib Hajimolahoseini, Walid Ahmed, Shuangyue Wen, Yang Liu; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:136-144

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Youngseog Chung, Dhruv Malik, Jeff Schneider, Yuanzhi Li, Aarti Singh; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:145-164

Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning

Soumajyoti Sarkar, Leonard Lausen, Volkan Cevher, Thomas Brox, Sheng Zha, George Karypis; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:165-181

StructMoE: Structured Mixture of Experts Using Low Rank Experts

Zain Sarwar, Ashwinee Panda, Benjamin Thérien, Stephen Rawls, Anirban Das, Kartik Balasubramaniam, Berkcan Kapusuzoglu, Shixiong Zhang, Sambit Sahu, Milind Naphade, Supriyo Chakraborty; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:182-193

Sparse Upcycling: Inference Inefficient Finetuning

Sasha Doubov, Nikhil Sardana, Vitaliy Chiley; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:194-205

Model Efficiency \& Compression

Post-Training Statistical Calibration for Higher Activation Sparsity

Vui Seng Chua, Yujie Pan, Nilesh Jain; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:206-221

Accelerating the Low-Rank Decomposed Models

Habib Hajimolahoseini, Walid Ahmed, Shuangyue Wen, Yang Liu; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:222-231

The EarlyBird Gets the WORM: Heuristically Accelerating EarlyBird Convergence

Adithya G Vasudev; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:232-240

Post Training Quantization of Large Language Models with Microscaling Formats

Sayeh Sharify, Utkarsh Saxena, Zifei Xu, Wanzin Yazar, Ilya Soloveychik, Xin Wang; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:241-258

EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models

Hossein Rajabzadeh, Aref Jafari, Aman Sharma, Benyamin Jami, Hyock Ju Hj Kwon, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:259-269

Scaling laws for post-training quantized large language models

Zifei Xu, Alexander Y Lan, Wanzin Yazar, Tristan Webb, Sayeh Sharify, Xin Wang; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:270-285

Partially Shared Query-Key for Lightweight Language Models

Kai Yang, Vahid Partovi Nia, Boxing Chen, Masoud Asgharian; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:286-291

Inference

Snakes and Ladders: Accelerating SSM Inference with Speculative Decoding

Yangchao Wu, Yonatan Dukler, Matthew Trager, Alessandro Achille, Wei Xia, Stefano Soatto; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:292-304

GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference

Hao Kang, Qingru Zhang, Souvik Kundu, Geonhwa Jeong, Zaoxing Liu, Tushar Krishna, Tuo Zhao; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:305-321

The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation

Lawrence Stewart, Matthew Trager, Sujan Gonugondla, Stefano Soatto; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:322-335

Distributed Speculative Inference of Large Language Models is Provably Faster

Nadav Timor, Jonathan Mamou, Oren Pereg, Moshe Berchansky, Daniel Korat, Moshe Wasserblat, Tomer Galanti, Michal Gordon, David Harel; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:336-354

AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability

Sudhanshu Agrawal, Wonseok Jeon, Mingu Lee; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:355-369

Inference-Friendly Models With MixAttention

Shashank Rajput, Ying Sheng, Sean Owen, Vitaliy Chiley; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:370-381

Improving Multi-candidate Speculative Decoding

XiaoFan Lu, Yixiao Zeng, Marco Levorato, FeiYang Ma, ZiXu Yu; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:382-394

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Nikhil Bhendawade, Irina Belousova, Qichen Fu, Henry Mason, Mohammad Rastegari, Mahyar Najibi; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:395-413

Hysteresis Activation Function for Efficient Inference

Moshe Kimhi, Idan Kashani, Chaim Baskin, Avi Mendelson; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:414-422

Efficiently Dispatching Flash Attention For Partially Filled Attention Masks

Agniv Sharma, Jonas A. Geiping; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:423-442

Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models

Keivan Alizadeh-Vahid, Seyed Iman Mirzadeh, Hooman Shahrkokhi, Dmitry Belenko, Frank Sun, Minsik Cho, Mohammad Hossein Sekhavat, Moin Nabi, Mehrdad Farajtabar; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:443-455

Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models

Jonathan Mamou, Oren Pereg, Daniel Korat, Moshe Berchansky, Nadav Timor, Moshe Wasserblat, Roy Schwartz; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:456-467

CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios

Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:468-484

Residual vector quantization for KV cache compression in large language model

Ankur Kumar; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:485-490

Benchmark \& Evaluation

Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training

Michael Pieler, Marco Bellagente, Hannah Teufel, Duy Phung, Nathan Cooper, Jonathan Tow, Paulo Rocha, Reshinth Adithyan, Zaid Alyafeai, Nikhil Pinnaparaju, Maksym Zhuravinskyi, Carlos Riquelme; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:491-511

ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain

Ali Shiraee Kasmaee, Mohammad Khodadad, Mohammad Arshi Saloot, Nick Sherck, Stephen Dokas, Hamidreza Mahyar, Soheila Samiee; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:512-531

On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning

Anton F Thielmann, Soheila Samiee; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:532-539

Applications

Text Summarization With Graph Attention Networks

Mohammadreza Ardestani, Yllias Chali; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:540-553

Less is Enough: Adapting Pre-trained Vision Transformers for Audio-Visual Speaker Verification

Gnana Praveen Rajasekhar, Jahangir Alam; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:554-563

Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task

Abderrahim Fathan, Xiaolin Zhu, Jahangir Alam; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:564-575

Mai Ho‘omāuna i ka ‘Ai: Language Models Improve Automatic Speech Recognition in Hawaiian

Kaavya D Chaparala, Guido Zarrella, Bruce Torres Fischer, Larry Kimura, Oiwi Parker Jones; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:576-583

Lightweight Neural Networks for Speech Emotion Recognition using Layer-wise Adaptive Quantization

Tushar Shinde, Ritika Jain, Avinash Kumar Sharma; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:584-595

OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Zexin Chen, Chengxi Li, Xiangyu Xie, Parijat Dube; Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:596-610

subscribe via RSS