Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Paper page - When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models
[go: Go Back, main page]

https://csu-jpg.github.io/vja.github.io/

\n","updatedAt":"2026-02-12T02:57:39.589Z","author":{"_id":"62333a88fd7bb4a39b92d387","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62333a88fd7bb4a39b92d387/e21AhpcXq37Ak_7rZ-Ca9.png","fullname":"Alex Jinpeng Wang","name":"Awiny","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":11,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5248260498046875},"editors":["Awiny"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/62333a88fd7bb4a39b92d387/e21AhpcXq37Ak_7rZ-Ca9.png"],"reactions":[],"isReport":false}},{"id":"698e80b2a669e919b82dbb48","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false},"createdAt":"2026-02-13T01:38:58.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"This is an automated message from the [Librarian Bot](https://huggingface.co/librarian-bots). I found the following papers similar to this paper. \n\nThe following papers were recommended by the Semantic Scholar API \n\n* [How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing](https://huggingface.co/papers/2602.01851) (2026)\n* [Text is All You Need for Vision-Language Model Jailbreaking](https://huggingface.co/papers/2602.00420) (2026)\n* [Jailbreaks on Vision Language Model via Multimodal Reasoning](https://huggingface.co/papers/2601.22398) (2026)\n* [Lingua-SafetyBench: A Benchmark for Safety Evaluation of Multilingual Vision-Language Models](https://huggingface.co/papers/2601.22737) (2026)\n* [RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing](https://huggingface.co/papers/2512.16864) (2025)\n* [Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity](https://huggingface.co/papers/2512.14320) (2025)\n* [Empowering Reliable Visual-Centric Instruction Following in MLLMs](https://huggingface.co/papers/2601.03198) (2026)\n\n\n Please give a thumbs up to this comment if you found it helpful!\n\n If you want recommendations for any Paper on Hugging Face checkout [this](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers) Space\n\n You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`","html":"

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

\n

The following papers were recommended by the Semantic Scholar API

\n\n

Please give a thumbs up to this comment if you found it helpful!

\n

If you want recommendations for any Paper on Hugging Face checkout this Space

\n

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: \n\n@librarian-bot\n\t recommend

\n","updatedAt":"2026-02-13T01:38:58.725Z","author":{"_id":"63d3e0e8ff1384ce6c5dd17d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg","fullname":"Librarian Bot (Bot)","name":"librarian-bot","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":318,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7269775867462158},"editors":["librarian-bot"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/1674830754237-63d3e0e8ff1384ce6c5dd17d.jpeg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2602.10179","authors":[{"_id":"698d419065c0d15a6d1620ff","name":"Jiacheng Hou","hidden":false},{"_id":"698d419065c0d15a6d162100","name":"Yining Sun","hidden":false},{"_id":"698d419065c0d15a6d162101","name":"Ruochong Jin","hidden":false},{"_id":"698d419065c0d15a6d162102","name":"Haochen Han","hidden":false},{"_id":"698d419065c0d15a6d162103","name":"Fangming Liu","hidden":false},{"_id":"698d419065c0d15a6d162104","name":"Wai Kin Victor Chan","hidden":false},{"_id":"698d419065c0d15a6d162105","name":"Alex Jinpeng Wang","hidden":false}],"publishedAt":"2026-02-10T18:59:55.000Z","submittedOnDailyAt":"2026-02-12T00:27:39.577Z","title":"When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models","submittedOnDailyBy":{"_id":"62333a88fd7bb4a39b92d387","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62333a88fd7bb4a39b92d387/e21AhpcXq37Ak_7rZ-Ca9.png","isPro":false,"fullname":"Alex Jinpeng Wang","user":"Awiny","type":"user"},"summary":"Recent advances in large image editing models have shifted the paradigm from text-driven instructions to vision-prompt editing, where user intent is inferred directly from visual inputs such as marks, arrows, and visual-text prompts. While this paradigm greatly expands usability, it also introduces a critical and underexplored safety risk: the attack surface itself becomes visual. In this work, we propose Vision-Centric Jailbreak Attack (VJA), the first visual-to-visual jailbreak attack that conveys malicious instructions purely through visual inputs. To systematically study this emerging threat, we introduce IESBench, a safety-oriented benchmark for image editing models. Extensive experiments on IESBench demonstrate that VJA effectively compromises state-of-the-art commercial models, achieving attack success rates of up to 80.9% on Nano Banana Pro and 70.1% on GPT-Image-1.5. To mitigate this vulnerability, we propose a training-free defense based on introspective multimodal reasoning, which substantially improves the safety of poorly aligned models to a level comparable with commercial systems, without auxiliary guard models and with negligible computational overhead. Our findings expose new vulnerabilities, provide both a benchmark and practical defense to advance safe and trustworthy modern image editing systems. Warning: This paper contains offensive images created by large image editing models.","upvotes":6,"discussionId":"698d419165c0d15a6d162106","ai_summary":"Visual-to-visual jailbreak attacks compromise image editing models through malicious visual inputs, necessitating new safety benchmarks and defense mechanisms.","ai_keywords":["Vision-Centric Jailbreak Attack","image editing models","visual-to-visual attack","IESBench","introspective multimodal reasoning","safety-oriented benchmark","attack success rate"],"organization":{"_id":"67ab7720792eebb05080c926","name":"CSU-JPG","fullname":"Jinpeng Group","avatar":"https://cdn-uploads.huggingface.co/production/uploads/62333a88fd7bb4a39b92d387/MHfLrhVz0KqH6ydx1UrOc.jpeg"}},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"62333a88fd7bb4a39b92d387","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/62333a88fd7bb4a39b92d387/e21AhpcXq37Ak_7rZ-Ca9.png","isPro":false,"fullname":"Alex Jinpeng Wang","user":"Awiny","type":"user"},{"_id":"67adce7bae805666dbf03fde","avatarUrl":"/avatars/91db46c6363fbee26e9c7e088e23bb83.svg","isPro":false,"fullname":"Haochen Han","user":"hhc2077","type":"user"},{"_id":"68636f086da1f3f0ef1f473d","avatarUrl":"/avatars/d43bfd84e0f2791c5f10694f90d81774.svg","isPro":false,"fullname":"JayceonHo","user":"JayceonHo","type":"user"},{"_id":"68fc3ddcdc9e5cbf49cbc716","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/68fc3ddcdc9e5cbf49cbc716/gzvksq-XgWnekB6Xl25pw.jpeg","isPro":false,"fullname":"EasonYe","user":"EasonUwU","type":"user"},{"_id":"690c6c1791fe8ea7642d0393","avatarUrl":"/avatars/b142fdc3c30346df7194a4cfd2e2ab7f.svg","isPro":false,"fullname":"Sun Yining","user":"THUOTAKU","type":"user"},{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","isPro":true,"fullname":"taesiri","user":"taesiri","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":0,"organization":{"_id":"67ab7720792eebb05080c926","name":"CSU-JPG","fullname":"Jinpeng Group","avatar":"https://cdn-uploads.huggingface.co/production/uploads/62333a88fd7bb4a39b92d387/MHfLrhVz0KqH6ydx1UrOc.jpeg"}}">
Papers
arxiv:2602.10179

When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models

Published on Feb 10
· Submitted by
Alex Jinpeng Wang
on Feb 12
Authors:
,
,
,
,
,
,

Abstract

Visual-to-visual jailbreak attacks compromise image editing models through malicious visual inputs, necessitating new safety benchmarks and defense mechanisms.

AI-generated summary

Recent advances in large image editing models have shifted the paradigm from text-driven instructions to vision-prompt editing, where user intent is inferred directly from visual inputs such as marks, arrows, and visual-text prompts. While this paradigm greatly expands usability, it also introduces a critical and underexplored safety risk: the attack surface itself becomes visual. In this work, we propose Vision-Centric Jailbreak Attack (VJA), the first visual-to-visual jailbreak attack that conveys malicious instructions purely through visual inputs. To systematically study this emerging threat, we introduce IESBench, a safety-oriented benchmark for image editing models. Extensive experiments on IESBench demonstrate that VJA effectively compromises state-of-the-art commercial models, achieving attack success rates of up to 80.9% on Nano Banana Pro and 70.1% on GPT-Image-1.5. To mitigate this vulnerability, we propose a training-free defense based on introspective multimodal reasoning, which substantially improves the safety of poorly aligned models to a level comparable with commercial systems, without auxiliary guard models and with negligible computational overhead. Our findings expose new vulnerabilities, provide both a benchmark and practical defense to advance safe and trustworthy modern image editing systems. Warning: This paper contains offensive images created by large image editing models.

Community

Paper submitter

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2602.10179 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2602.10179 in a Space README.md to link it from this page.

Collections including this paper 1