MMA-Diffusion: MultiModal Attack on Diffusion Models

Yang, Yijun; Gao, Ruiyuan; Wang, Xiaosen; Ho, Tsung-Yi; Xu, Nan; Xu, Qiang

Yijun Yang, Ruiyuan Gao, Xiaosen Wang, Tsung-Yi Ho, Nan Xu, Qiang Xu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7737-7746

Abstract

In recent years Text-to-Image (T2I) models have seen remarkable advancements gaining widespread adoption. However this progress has inadvertently opened avenues for potential misuse particularly in generating inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion a framework that presents a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches MMA-Diffusion leverages both textual and visual modalities to bypass safeguards like prompt filters and post-hoc safety checkers thus exposing and highlighting the vulnerabilities in existing defense mechanisms. Our codes are available at https://github.com/cure-lab/MMA-Diffusion.

Related Material

[pdf] [supp]

[bibtex]

@InProceedings{Yang_2024_CVPR, author = {Yang, Yijun and Gao, Ruiyuan and Wang, Xiaosen and Ho, Tsung-Yi and Xu, Nan and Xu, Qiang}, title = {MMA-Diffusion: MultiModal Attack on Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {7737-7746} }