Optimization-Free Image Immunization Against Diffusion-Based Editing

1 University of Illinois Urbana-Champaign, 2 The University of Texas at Austin,
3 Bogazici University, 4 University of North Carolina at Chapel Hill
*Equal contribution

Work done during an internship at UT Austin and UIUC
[Figure: Immunization Examples]

DiffVax is an optimization-free image immunization approach designed to protect images and videos from diffusion-based editing. It is robust across diverse in-the-wild content, protecting both (a) unseen images and (b) unseen videos, and it prevents edits from various editing methods, including inpainting (illustrated with a human in the left column and a non-human foreground object in the right column) and instruction-based edits (right column).

Abstract

Current image immunization defenses against diffusion-based editing embed imperceptible noise in target images to disrupt editing models. However, these methods scale poorly: they require time-consuming re-optimization for each image, taking hours even for small batches. To address this, we introduce DiffVax, a scalable, lightweight, and optimization-free framework for image immunization, specifically designed to prevent diffusion-based editing. Our approach generalizes effectively to unseen content, reducing computational costs and cutting immunization time from days to milliseconds, a 250,000$\times$ speedup. This is achieved through a training loss that enforces both the failure of editing attempts and the imperceptibility of the perturbations. Extensive qualitative and quantitative results demonstrate that our model is scalable, optimization-free, adaptable to various diffusion-based editing tools, robust against counter-attacks, and, for the first time, effective at protecting video content from editing. Our code and qualitative results are provided in the supplementary.

Method

[Figure: model overview]

Our process begins with the immunizer model $f(\cdot;\theta)$, which generates imperceptible noise $\epsilon_{im}$ for the original image $I$. This noise is applied to the masked region $M$ of the image, resulting in the immunized image $I_{im}$. The immunized image is then processed by a diffusion-based editing model $SD(\cdot)$, using a text prompt $P$ and the complementary mask $\sim M$, to edit the background. Training minimizes two loss terms, $\mathcal{L}_{noise}$ and $\mathcal{L}_{edit}$, which penalize the magnitude of the applied noise and the success of the edit, respectively. During training, the immunizer learns to generalize across diverse images, ensuring editing attempts fail while preserving visual fidelity. This end-to-end framework enables robust, scalable immunization against diffusion-based editing for both images and videos.
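To make the objective described above concrete, here is a minimal sketch in plain Python on a flattened toy image. It is an illustration under stated assumptions, not the authors' implementation: the L1 forms chosen for $\mathcal{L}_{noise}$ and $\mathcal{L}_{edit}$, the weight `alpha`, and the stand-in `toy_editor` are all hypothetical. In the actual pipeline the noise would come from the immunizer $f(\cdot;\theta)$, the edit from $SD(\cdot)$ with a prompt $P$, and gradients would flow back to $\theta$; the sketch only evaluates the loss for a fixed noise vector.

```python
from typing import Callable, List

Pixels = List[float]  # flattened image, values in [0, 1]
Mask = List[int]      # 1 inside the protected region M, 0 elsewhere


def apply_immunization(image: Pixels, noise: Pixels, mask: Mask) -> Pixels:
    """I_im: add noise only where mask == 1, then clip to [0, 1]."""
    return [min(1.0, max(0.0, p + n * m)) for p, n, m in zip(image, noise, mask)]


def noise_loss(image: Pixels, immunized: Pixels, mask: Mask) -> float:
    """Assumed form of L_noise: mean absolute perturbation inside M."""
    area = max(1, sum(mask))
    return sum(abs(q - p) * m for p, q, m in zip(image, immunized, mask)) / area


def edit_loss(edited: Pixels, mask: Mask) -> float:
    """Assumed form of L_edit: mean absolute intensity of the edited output
    inside the complementary mask ~M (lower means the edit produced less)."""
    inverse = [1 - m for m in mask]
    area = max(1, sum(inverse))
    return sum(abs(e) * w for e, w in zip(edited, inverse)) / area


def total_loss(
    image: Pixels,
    noise: Pixels,
    mask: Mask,
    editor: Callable[[Pixels, Mask], Pixels],
    alpha: float = 1.0,  # hypothetical weight balancing the two terms
) -> float:
    """alpha * L_noise + L_edit for one image and one candidate noise."""
    immunized = apply_immunization(image, noise, mask)
    edited = editor(immunized, [1 - m for m in mask])  # edit happens in ~M
    return alpha * noise_loss(image, immunized, mask) + edit_loss(edited, mask)


def toy_editor(immunized: Pixels, edit_mask: Mask) -> Pixels:
    """Hypothetical stand-in for SD(.): paints the edit region mid-gray."""
    return [0.5 if w else p for p, w in zip(immunized, edit_mask)]


image = [0.2, 0.4, 0.6, 0.8]
mask = [1, 1, 0, 0]              # first two pixels form the protected region M
noise = [0.05, -0.05, 0.3, 0.3]  # entries outside M are ignored by the mask
print(total_loss(image, noise, mask, toy_editor))
```

Because the immunizer is trained once against this kind of objective, protecting a new image at inference time would only require a forward pass to produce the noise, which is what "optimization-free" refers to in the text.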

DiffVax Immunization Results

Columns: Prompt, Original Image, Edited Image, Edited Immunized Image (images omitted)

Prompts shown:
"in a prison"
"Geoffrey Hinton at a political protest"
"an eagle sitting on a table in a library"
"add sunglasses"
"standing in an abandoned carnival"
"watching a theater performance"
"in front of a hotdog stand"
"under a turbulent sky with lightning"
"in a garage"
"in a church with wooden pews"

Columns: Original Video, Edited Video, Immunized Edited Video (videos omitted)

Prompt shown:
"in snowstorm"

Comparisons

Columns: Prompt, Original Image, Edited Image, Random Noise Edited Image, PhotoGuard-E [1] Edited Image, PhotoGuard-D [1] Edited Image, DiffVax (Ours) Edited Image (images omitted)

Prompts shown:
"standing in a warehouse with a lot of shelves"
"in a betting shop"
"working out in a gymnasium"

Robustness Against Counter Attacks

Columns: Original Image, Edited PhotoGuard-D, Edited Attacked PhotoGuard-D, Edited DiffVax (Ours), Edited Attacked DiffVax (images omitted)

Denoiser Attack
Prompt: "a person in a cinema"

JPEG Compression
Prompt: "in a health and wellness center"

BibTeX

@article{ozden2024optimization,
      title={Optimization-Free Image Immunization Against Diffusion-Based Editing},
      author={Ozden, Tarik Can and Kara, Ozgur and Akcin, Oguzhan and Zaman, Kerem and Srivastava, Shashank and Chinchali, Sandeep P and Rehg, James M},
      journal={arXiv preprint arXiv:2411.17957},
      year={2024}
}

[1] Hadi Salman, Alaa Khaddaj, Guillaume Leclerc, Andrew Ilyas, and Aleksander Madry. Raising the Cost of Malicious AI-Powered Image Editing. In International Conference on Machine Learning (ICML), pages 29894-29918, 2023.