Improving Transferable Targeted Adversarial Attacks with Model Self-Enhancement

Wu, Han; Ou, Guanyan; Wu, Weibin; Zheng, Zibin

Han Wu, Guanyan Ou, Weibin Wu, Zibin Zheng; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 24615-24624

Abstract

Various transfer attack methods have been proposed to evaluate the robustness of deep neural networks (DNNs). Although manifesting remarkable performance in generating untargeted adversarial perturbations existing proposals still fail to achieve high targeted transferability. In this work we discover that the adversarial perturbations' overfitting towards source models of mediocre generalization capability can hurt their targeted transferability. To address this issue we focus on enhancing the source model's generalization capability to improve its ability to conduct transferable targeted adversarial attacks. In pursuit of this goal we propose a novel model self-enhancement method that incorporates two major components: Sharpness-Aware Self-Distillation (SASD) and Weight Scaling (WS). Specifically SASD distills a fine-tuned auxiliary model which mirrors the source model's structure into the source model while flattening the source model's loss landscape. WS obtains an approximate ensemble of numerous pruned models to perform model augmentation which can be conveniently synergized with SASD to elevate the source model's generalization capability and thus improve the resultant targeted perturbations' transferability. Extensive experiments corroborate the effectiveness of the proposed method. Notably under the black-box setting our approach can outperform the state-of-the-art baselines by a significant margin of 12.2% on average in terms of the obtained targeted transferability. Code is available at https://github.com/g4alllf/SASD.

Related Material

[pdf]

[bibtex]

@InProceedings{Wu_2024_CVPR, author = {Wu, Han and Ou, Guanyan and Wu, Weibin and Zheng, Zibin}, title = {Improving Transferable Targeted Adversarial Attacks with Model Self-Enhancement}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {24615-24624} }