AdvDiffuser: Natural Adversarial Example Synthesis with Diffusion Models

Chen, Xinquan; Gao, Xitong; Zhao, Juanjuan; Ye, Kejiang; Xu, Cheng-Zhong

Xinquan Chen, Xitong Gao, Juanjuan Zhao, Kejiang Ye, Cheng-Zhong Xu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 4562-4572

Abstract

Previous work on adversarial examples typically involves a fixed norm perturbation budget, which fails to capture the way humans perceive perturbations. Recent work has shifted towards investigating natural unrestricted adversarial examples (UAEs) that breaks l_p perturbation bounds but nonetheless remain semantically plausible. Current methods use GAN or VAE to generate UAEs by perturbing latent codes. However, this leads to loss of high-level information, resulting in low-quality and unnatural UAEs. In light of this, we propose AddDiffuser, a new method for synthesizing natural UAEs using diffusion models. It can generate UAEs from scratch or conditionally based on reference images. To generate natural UAEs, we perturb predicted images to steer their latent code towards the adversarial sample space of a particular classifier. In addition, we propose adversarial inpainting based on class activation mapping to retain the salient regions of the image while perturbing less important areas. Our method achieves impressive results on CIFAR-10, CelebA and ImageNet, and we demonstrate that it can defeat the most robust models on the RobustBench leaderboard with near 100% success rates. Furthermore, The synthesized UAEs are not only more natural but also stronger compared to the current state-of-the-art attacks. Specifically, compared with GA-attack, the UAEs generated with AdvDiffuser exhibit 6xsmaller LPIPS perturbations, 2 ~ 3 xsmaller FID scores and 0.28 higher in SSIM metrics, making them perceptually stealthier. Lastly, it is capable of generating an unlimited number of natural adversarial examples. For more please visit our project page: Link to follow.

Related Material

[pdf]

[bibtex]

@InProceedings{Chen_2023_ICCV, author = {Chen, Xinquan and Gao, Xitong and Zhao, Juanjuan and Ye, Kejiang and Xu, Cheng-Zhong}, title = {AdvDiffuser: Natural Adversarial Example Synthesis with Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {4562-4572} }