Editing Implicit Assumptions in Text-to-Image Diffusion Models
\n","updatedAt":"2023-11-03T09:48:29.317Z","author":{"_id":"653d8a1b9e84d1e8b66806b0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/noauth/uUtMvS2jwWWFFKcp_YXjI.png","fullname":"Alvarado OM","name":"Oscaralvaros","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.28746289014816284},"editors":["Oscaralvaros"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/noauth/uUtMvS2jwWWFFKcp_YXjI.png"],"reactions":[],"isReport":false}},{"id":"656cb62190d556ffa60f60f9","author":{"_id":"655fc76adef5905d38b2cc3f","avatarUrl":"/avatars/9c65221195ec17d2d047d9954e7c8170.svg","fullname":"Antonio Roberto","name":"robertanto","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2023-12-03T17:08:49.000Z","type":"comment","data":{"edited":true,"hidden":true,"hiddenBy":"","latest":{"raw":"This comment has been hidden","html":"This comment has been hidden","updatedAt":"2023-12-03T17:09:15.070Z","author":{"_id":"655fc76adef5905d38b2cc3f","avatarUrl":"/avatars/9c65221195ec17d2d047d9954e7c8170.svg","fullname":"Antonio Roberto","name":"robertanto","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":1,"editors":[],"editorAvatarUrls":[],"reactions":[]}}],"primaryEmailConfirmed":false,"paper":{"id":"2303.08084","authors":[{"_id":"64c87c248b1d0044b90d6037","user":{"_id":"60f82853c53e95176a7c6d45","avatarUrl":"/avatars/16f6ea944a014af6ebe60499f3460784.svg","isPro":false,"fullname":"Hadas Orgad","user":"hadasor","type":"user"},"name":"Hadas Orgad","status":"claimed_verified","statusLastChangedAt":"2023-09-19T15:54:21.918Z","hidden":false},{"_id":"64c87c248b1d0044b90d6038","name":"Bahjat Kawar","hidden":false},{"_id":"64c87c248b1d0044b90d6039","name":"Yonatan Belinkov","hidden":false}],"publishedAt":"2023-03-14T17:14:21.000Z","title":"Editing Implicit Assumptions in Text-to-Image Diffusion Models","summary":"Text-to-image diffusion models often make implicit assumptions about the\nworld when generating images. While some assumptions are useful (e.g., the sky\nis blue), they can also be outdated, incorrect, or reflective of social biases\npresent in the training data. Thus, there is a need to control these\nassumptions without requiring explicit user input or costly re-training. In\nthis work, we aim to edit a given implicit assumption in a pre-trained\ndiffusion model. Our Text-to-Image Model Editing method, TIME for short,\nreceives a pair of inputs: a \"source\" under-specified prompt for which the\nmodel makes an implicit assumption (e.g., \"a pack of roses\"), and a\n\"destination\" prompt that describes the same setting, but with a specified\ndesired attribute (e.g., \"a pack of blue roses\"). TIME then updates the model's\ncross-attention layers, as these layers assign visual meaning to textual\ntokens. We edit the projection matrices in these layers such that the source\nprompt is projected close to the destination prompt. Our method is highly\nefficient, as it modifies a mere 2.2% of the model's parameters in under one\nsecond. To evaluate model editing approaches, we introduce TIMED (TIME\nDataset), containing 147 source and destination prompt pairs from various\ndomains. 
Our experiments (using Stable Diffusion) show that TIME is successful\nin model editing, generalizes well for related prompts unseen during editing,\nand imposes minimal effect on unrelated generations.","upvotes":2,"discussionId":"64c87c268b1d0044b90d607d","githubRepo":"https://github.com/bahjat-kawar/time-diffusion","githubRepoAddedBy":"auto","ai_summary":"A method updates pre-trained text-to-image diffusion models to modify implicit assumptions without retraining by editing cross-attention layers.","ai_keywords":["text-to-image diffusion models","implicit assumptions","cross-attention layers","projection matrices","Stable Diffusion","TIMED (TIME Dataset)"],"githubStars":87},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"653d8a1b9e84d1e8b66806b0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/noauth/uUtMvS2jwWWFFKcp_YXjI.png","isPro":false,"fullname":"Alvarado OM","user":"Oscaralvaros","type":"user"},{"_id":"6538119803519fddb4a17e10","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6538119803519fddb4a17e10/ffJMkdx-rM7VvLTCM6ri_.jpeg","isPro":false,"fullname":"samusenps","user":"samusenps","type":"user"}],"acceptLanguages":["*"]}">
AI-generated summary
A method that updates pre-trained text-to-image diffusion models by editing their cross-attention layers, modifying implicit assumptions without retraining.

Abstract
Text-to-image diffusion models often make implicit assumptions about the
world when generating images. While some assumptions are useful (e.g., the sky
is blue), they can also be outdated, incorrect, or reflective of social biases
present in the training data. Thus, there is a need to control these
assumptions without requiring explicit user input or costly re-training. In
this work, we aim to edit a given implicit assumption in a pre-trained
diffusion model. Our Text-to-Image Model Editing method, TIME for short,
receives a pair of inputs: a "source" under-specified prompt for which the
model makes an implicit assumption (e.g., "a pack of roses"), and a
"destination" prompt that describes the same setting, but with a specified
desired attribute (e.g., "a pack of blue roses"). TIME then updates the model's
cross-attention layers, as these layers assign visual meaning to textual
tokens. We edit the projection matrices in these layers such that the source
prompt is projected close to the destination prompt. Our method is highly
efficient, as it modifies a mere 2.2% of the model's parameters in under one
second. To evaluate model editing approaches, we introduce TIMED (TIME
Dataset), containing 147 source and destination prompt pairs from various
domains. Our experiments (using Stable Diffusion) show that TIME is successful
in model editing, generalizes well for related prompts unseen during editing,
and imposes minimal effect on unrelated generations.
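
As a rough sketch of the editing step described above (not the authors' official implementation; see the linked repository for that), the snippet below assumes each cross-attention key/value projection is edited by solving a small ridge-regression problem in closed form: the new matrix should map the source-prompt token embeddings close to where the original matrix maps the aligned destination-prompt embeddings, while staying near the original weights. The function name `edit_projection_matrix` and the regularization strength `lam` are illustrative assumptions, not names from the paper or its code.

```python
import torch

def edit_projection_matrix(W_old: torch.Tensor,
                           src_emb: torch.Tensor,
                           dst_emb: torch.Tensor,
                           lam: float = 0.1) -> torch.Tensor:
    """Closed-form edit of one cross-attention projection matrix (key or value).

    W_old:   (d_out, d_in) original projection matrix
    src_emb: (n, d_in) token embeddings of the source prompt
    dst_emb: (n, d_in) token embeddings of the destination prompt,
             aligned token-by-token with the source prompt
    lam:     regularization strength keeping the edit close to W_old
             (assumed value, not taken from the paper)
    """
    d_in = W_old.shape[1]
    # Targets: where the ORIGINAL matrix sends the destination tokens.
    targets = dst_emb @ W_old.T                                    # (n, d_out)
    # Minimize  sum_i ||W c_i - W_old c*_i||^2 + lam * ||W - W_old||_F^2
    # over W; the minimizer has the ridge-regression closed form below.
    A = lam * W_old + targets.T @ src_emb                          # (d_out, d_in)
    B = lam * torch.eye(d_in, device=W_old.device,
                        dtype=W_old.dtype) + src_emb.T @ src_emb   # (d_in, d_in)
    return A @ torch.linalg.inv(B)
```

Applying such an update to the key and value projections of every cross-attention layer touches only a small fraction of the model's weights and requires no gradient steps, which is consistent with the efficiency reported in the abstract (about 2.2% of the parameters, edited in under one second).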