Paper page - InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

https://huggingface.co/spaces/ameerazam08/InstantStyle-GPU-Demo








GitHub Page: https://github.com/InstantStyle/InstantStyle

Project Page: https://instantstyle.github.io/

arxiv:2404.02733

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation

Published on Apr 3, 2024 · Submitted by AK on Apr 4, 2024
Authors: Haofan Wang, Qixun Wang, Xu Bai, Zekui Qin, Anthony Chen

Abstract

InstantStyle addresses style consistency and detail retention in image generation by decoupling style and content in feature space and injecting reference features only into style-specific blocks, improving stylization without per-image weight tuning.

AI-generated summary

Tuning-free diffusion-based models have demonstrated significant potential in the realm of image personalization and customization. However, despite this notable progress, current models continue to grapple with several complex challenges in producing style-consistent image generation. Firstly, the concept of style is inherently underdetermined, encompassing a multitude of elements such as color, material, atmosphere, design, and structure, among others. Secondly, inversion-based methods are prone to style degradation, often resulting in the loss of fine-grained details. Lastly, adapter-based approaches frequently require meticulous weight tuning for each reference image to achieve a balance between style intensity and text controllability. In this paper, we commence by examining several compelling yet frequently overlooked observations. We then proceed to introduce InstantStyle, a framework designed to address these issues through the implementation of two key strategies: 1) A straightforward mechanism that decouples style and content from reference images within the feature space, predicated on the assumption that features within the same space can be either added to or subtracted from one another. 2) The injection of reference image features exclusively into style-specific blocks, thereby preventing style leaks and eschewing the need for cumbersome weight tuning, which often characterizes more parameter-heavy designs. Our work demonstrates superior visual stylization outcomes, striking an optimal balance between the intensity of style and the controllability of textual elements. Our codes will be available at https://github.com/InstantStyle/InstantStyle.
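The abstract's first strategy, decoupling style and content by arithmetic in a shared feature space, can be sketched with toy vectors. The snippet below is a minimal illustration of that assumption, not the authors' implementation; the random vectors are hypothetical stand-ins for real CLIP image and text embeddings.

```python
import numpy as np

def decouple_style(image_embed: np.ndarray, content_embed: np.ndarray) -> np.ndarray:
    """Isolate a style-dominated feature from a reference-image embedding.

    Assumes both vectors live in the same feature space (e.g. CLIP), so the
    content described by text can simply be subtracted from the image
    embedding -- the core assumption stated in the abstract.
    """
    return image_embed - content_embed

# Toy 768-dim vectors standing in for CLIP embeddings.
rng = np.random.default_rng(0)
image_embed = rng.normal(size=768)    # embedding of the reference image
content_embed = rng.normal(size=768)  # embedding of a text description of its content
style_embed = decouple_style(image_embed, content_embed)

# Sanity check: adding the content back recovers the image embedding.
assert np.allclose(style_embed + content_embed, image_embed)
```

For the second strategy, recent versions of `diffusers` expose per-block IP-Adapter scales via `set_ip_adapter_scale`, which allows reference features to be injected only into style-specific attention blocks; consult the library documentation for the exact dictionary format, as it may change between releases.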

Community

@ameerazam08 this is really nice. Do you think this technique can work for transferring hairstyle from one picture to another?

Not sure about diffusion models; this is not the same as e4e or the Maintain the Gap paper (GAN-based), which are mostly used for style transfer @Owos


Thank you for the response!!! @ameerazam08



Models citing this paper 0

No model linking this paper


Datasets citing this paper 0

No dataset linking this paper


Spaces citing this paper 17

Collections including this paper 18