Conditional Diffusion Distillation
Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar
Published on October 2, 2023 (arXiv: 2310.01407)
Project page: https://fast-codi.github.io/
Code: https://github.com/fast-codi/CoDi
AI-generated summary
A novel single-stage distillation method for generative diffusion models reduces sampling time while maintaining performance across tasks like super-resolution and image editing.
Abstract
Generative diffusion models provide strong priors for text-to-image
generation and thereby serve as a foundation for conditional generation tasks
such as image editing, restoration, and super-resolution. However, one major
limitation of diffusion models is their slow sampling time. To address this
challenge, we present a novel conditional distillation method designed to
supplement the diffusion priors with the help of image conditions, allowing for
conditional sampling with very few steps. We directly distill the unconditional
pre-training in a single stage through joint-learning, largely simplifying the
previous two-stage procedures that involve both distillation and conditional
finetuning separately. Furthermore, our method enables a new
parameter-efficient distillation mechanism that distills each task with only a
small number of additional parameters combined with the shared frozen
unconditional backbone. Experiments across multiple tasks including
super-resolution, image editing, and depth-to-image generation demonstrate that
our method outperforms existing distillation techniques for the same sampling
time. Notably, our method is the first distillation strategy that can match the
performance of the much slower fine-tuned conditional diffusion models.
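The parameter-efficient mechanism mentioned in the abstract keeps the pre-trained unconditional backbone frozen and trains only a small per-task adapter that injects the image condition. The sketch below is illustrative only: the module names (ConditionAdapter, build_student), the placeholder backbone, and the single input-residual injection point are assumptions, not the paper's actual architecture.

```python
# Illustrative setup: frozen unconditional backbone + small trainable adapter.
import torch
import torch.nn as nn

class ConditionAdapter(nn.Module):
    """Small trainable module mapping an image condition to a residual feature."""
    def __init__(self, cond_channels: int, latent_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cond_channels, 64, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(64, latent_channels, 3, padding=1),
        )

    def forward(self, cond: torch.Tensor) -> torch.Tensor:
        return self.net(cond)

def build_student(backbone: nn.Module, adapter: ConditionAdapter):
    # Freeze the shared unconditional backbone; only the adapter is trained,
    # so each new task adds a small number of parameters.
    for p in backbone.parameters():
        p.requires_grad_(False)
    return list(adapter.parameters())

if __name__ == "__main__":
    backbone = nn.Conv2d(4, 4, 3, padding=1)   # stand-in for a pre-trained diffusion U-Net
    adapter = ConditionAdapter(cond_channels=3, latent_channels=4)
    opt = torch.optim.AdamW(build_student(backbone, adapter), lr=1e-4)
    z_t = torch.randn(2, 4, 32, 32)            # noisy latent
    cond = torch.randn(2, 3, 32, 32)           # image condition (e.g. a low-resolution input)
    eps_pred = backbone(z_t + adapter(cond))   # condition injected as an input residual
    print(eps_pred.shape)                      # torch.Size([2, 4, 32, 32])
```

In practice such adapters would likely be inserted at several backbone layers; a single input residual keeps the sketch short.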
The paper presents a novel conditional distillation method to distill an unconditional diffusion model into a conditional one for faster sampling while maintaining high image quality.
Insights
A new single-stage distillation approach can distill unconditional diffusion models into conditional ones, simplifying previous two-stage procedures.
Jointly optimizing for noise-prediction consistency and conditional signal prediction enables replicating the diffusion priors with very few sampling steps (see the sketch after this list).
The proposed PREv predictor for z_hat uses the original noise and improves over DDIM sampling.
The conditional guidance loss (denoted dx) is important for avoiding bad local minima during distillation.
The method enables parameter-efficient distillation by freezing most parameters and only training task-specific adapters.
The distilled model matches the performance of much slower fine-tuned conditional diffusion models.
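As a concrete but simplified picture of the joint objective referenced above, the sketch below combines a consistency term that matches the student's noise prediction to a frozen unconditional teacher with a guidance term on the predicted clean signal x_hat. All names (student, teacher, adapter, alphas, sigmas) and the unit loss weights are assumptions for illustration; the paper's exact losses differ.

```python
# Schematic joint distillation step: noise-prediction consistency against a frozen
# teacher plus a conditional guidance loss on the predicted clean signal x_hat.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_step(student, teacher, adapter, z0, cond, alphas, sigmas, opt):
    """One illustrative training step; student/teacher/adapter are callables."""
    b = z0.shape[0]
    t = torch.randint(1, len(alphas), (b,), device=z0.device)
    eps = torch.randn_like(z0)
    a_t = alphas[t].view(b, 1, 1, 1)
    s_t = sigmas[t].view(b, 1, 1, 1)
    z_t = a_t * z0 + s_t * eps                        # forward-diffused latent

    eps_student = student(z_t + adapter(cond), t)     # conditional student prediction
    x_hat = (z_t - s_t * eps_student) / a_t           # predicted clean signal

    with torch.no_grad():                             # frozen unconditional teacher
        eps_teacher = teacher(z_t, t)

    loss = F.mse_loss(eps_student, eps_teacher) + F.mse_loss(x_hat, z0)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    class ToyDenoiser(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(4, 4, 3, padding=1)
        def forward(self, z, t):
            return self.conv(z)

    student, teacher = ToyDenoiser(), ToyDenoiser()
    adapter = nn.Conv2d(3, 4, 1)                      # tiny stand-in adapter
    opt = torch.optim.AdamW(list(student.parameters()) + list(adapter.parameters()), lr=1e-4)
    alphas = torch.linspace(1.0, 0.1, 10)
    sigmas = (1.0 - alphas**2).sqrt()
    z0 = torch.randn(2, 4, 32, 32)                    # target clean latent
    cond = torch.randn(2, 3, 32, 32)                  # image condition
    print(distillation_step(student, teacher, adapter, z0, cond, alphas, sigmas, opt))
```

The second MSE term stands in for the conditional guidance loss mentioned above. Note that in consistency-style distillation the consistency term is typically defined between the model's own predictions at adjacent steps; matching a frozen teacher at the same step here is a simplification.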
Results
The proposed conditional diffusion distillation method achieves state-of-the-art image quality with 4 sampling steps, outperforming previous distillation techniques and matching fine-tuned conditional diffusion models that require 50x more sampling steps.