arxiv:2502.12346

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models

Published on Feb 17, 2025

Abstract

The Quantized Zeroth-Order (QuZO) framework fine-tunes low-precision language models using optimized stochastic rounding, reducing memory costs while maintaining or improving performance.

Large Language Models (LLMs) are often quantized to lower precision to reduce memory cost and latency in inference. However, quantization often degrades model performance, so fine-tuning is required for various downstream tasks. Traditional fine-tuning methods such as stochastic gradient descent and Adam optimization require backpropagation, which is error-prone in low-precision settings. To overcome these limitations, we propose the Quantized Zeroth-Order (QuZO) framework, specifically designed for fine-tuning LLMs through low-precision (e.g., 4- or 8-bit) forward passes. Our method avoids the error-prone low-precision straight-through estimator and uses optimized stochastic rounding to mitigate the increased bias. QuZO simplifies the training process while achieving results comparable to first-order methods in FP8 and superior accuracy in INT8 and INT4 training. Experiments demonstrate that low-bit QuZO training achieves performance comparable to MeZO optimization on GLUE, Multi-Choice, and Generation tasks, while reducing memory cost by 2.94 times in LLaMA2-7B fine-tuning compared to quantized first-order methods.
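
The two ingredients named in the abstract, stochastic rounding and gradient estimation from forward passes alone, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`stochastic_round`, `zo_gradient`, `toy_quantized_zo_sgd`) are hypothetical, a toy quadratic loss stands in for an LLM forward pass, and only plain stochastic rounding is shown rather than the "optimized" variant the paper refers to. The gradient estimate is the generic two-point zeroth-order estimator, assumed here for illustration.

```python
import numpy as np


def stochastic_round(x, scale, rng):
    """Round x onto the grid {k * scale}, rounding up with probability equal
    to the fractional part, so the result equals x in expectation."""
    scaled = np.asarray(x, dtype=np.float64) / scale
    lower = np.floor(scaled)
    round_up = rng.random(scaled.shape) < (scaled - lower)
    return (lower + round_up) * scale


def zo_gradient(loss_fn, params, eps, rng):
    """Two-point zeroth-order gradient estimate.

    Uses two forward evaluations of loss_fn along a random direction u;
    no backpropagation is involved.
    """
    u = rng.standard_normal(params.shape)
    diff = loss_fn(params + eps * u) - loss_fn(params - eps * u)
    return diff / (2.0 * eps) * u


def toy_quantized_zo_sgd(steps=2000, lr=0.02, eps=1e-3, scale=1 / 64, seed=0):
    """Minimize a toy quadratic while keeping the weights on a coarse grid.

    Assumption: the quadratic is a stand-in for a model's forward loss.
    Returns (initial_loss, final_loss).
    """
    rng = np.random.default_rng(seed)
    target = np.array([0.5, -0.25, 1.0])

    def loss_fn(w):
        return float(np.sum((w - target) ** 2))

    w = np.zeros(3)
    initial = loss_fn(w)
    for _ in range(steps):
        g = zo_gradient(loss_fn, w, eps, rng)
        # Re-quantize after every update; stochastic rounding keeps small
        # updates from being systematically rounded away.
        w = stochastic_round(w - lr * g, scale, rng)
    return initial, loss_fn(w)


if __name__ == "__main__":
    before, after = toy_quantized_zo_sgd()
    print(f"loss before: {before:.4f}, after: {after:.4f}")
```

Round-to-nearest would discard any update smaller than half a grid step, which is why an unbiased rounding rule matters when weights stay in low precision throughout training.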


