Effective Quantization for Diffusion Models on CPUs
Published on Nov 2, 2023 (arXiv:2311.16133)
Authors: Hanwen Chang, Haihao Shen, Yiyang Cai, Xinyu Ye, Zhenzhong Xu, Wenhua Cheng, Kaokao Lv, Weiwei Zhang, Yintong Lu, Heng Guo
Code: https://github.com/intel/intel-extension-for-transformers
AI-generated summary

A novel quantization method using quantization-aware training and distillation improves inference efficiency on CPUs for diffusion models without significant quality loss.
Abstract

Diffusion models have gained popularity for generating images from textual descriptions. Nonetheless, their substantial computational requirements remain a significant challenge and make image generation time-consuming. Quantization, a technique for compressing deep learning models to improve efficiency, is difficult to apply to diffusion models: they are notably more sensitive to quantization than other model types, and naive quantization can degrade image quality. In this paper, we introduce a novel approach to quantizing diffusion models that leverages both quantization-aware training and distillation. Our results show that the quantized models maintain high image quality while delivering efficient inference on CPUs.
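To make the recipe concrete, below is a minimal, hypothetical PyTorch sketch of quantization-aware training combined with distillation: a frozen full-precision teacher supervises a fake-quantized student, which is then converted to INT8 for CPU inference. The `TinyDenoiser` module, the MSE distillation loss, and the training loop are illustrative placeholders, not the paper's actual models or training recipe (the released code lives in the intel-extension-for-transformers repository linked above).

```python
# Illustrative sketch only: QAT + distillation for a toy "denoiser".
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert,
)

class TinyDenoiser(nn.Module):
    """Toy stand-in for a diffusion denoiser; the real UNet is far larger."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.quant = QuantStub()       # tensors enter the quantized region here
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.dequant = DeQuantStub()   # and leave it here

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv1(x))
        x = self.conv2(x)
        return self.dequant(x)

# Frozen full-precision teacher supervises a fake-quantized student.
teacher = TinyDenoiser().eval()
for p in teacher.parameters():
    p.requires_grad_(False)

student = TinyDenoiser().train()
student.qconfig = get_default_qat_qconfig("fbgemm")  # x86 CPU backend
prepare_qat(student, inplace=True)  # insert fake-quant observers for QAT

opt = torch.optim.Adam(student.parameters(), lr=1e-4)
for _ in range(10):                          # placeholder training loop
    x = torch.randn(4, 32, 16, 16)           # dummy noisy latents
    with torch.no_grad():
        target = teacher(x)                  # teacher's full-precision output
    loss = F.mse_loss(student(x), target)    # simple MSE distillation loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# Convert fake-quant modules to real INT8 kernels for CPU inference.
student.eval()
int8_student = convert(student)
out = int8_student(torch.randn(1, 32, 16, 16))  # runs with INT8 ops on CPU
```

PyTorch's `fbgemm` quantization backend targets x86 CPUs, which matches the paper's CPU-inference setting; a real diffusion setup would of course condition the denoiser on timesteps and distill on actual latents rather than random tensors.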