Improving the Transferability of Adversarial Samples With Adversarial Transformations
Although deep neural networks (DNNs) have achieved tremendous performance in diverse vision challenges, they are surprisingly susceptible to adversarial examples, which are born of intentionally perturbing benign samples in a human-imperceptible fashion. It thus poses security concerns on the deployment of DNNs in practice, particularly in safety- and security-sensitive domains. To investigate the robustness of DNNs, transfer-based attacks have attracted a growing interest recently due to their high practical applicability, where attackers craft adversarial samples with local models and employ the resultant samples to attack a remote black-box model. However, existing transfer-based attacks frequently suffer from low success rates due to overfitting to the adopted local model. To boost the transferability of adversarial samples, we propose to improve the robustness of synthesized adversarial samples via adversarial transformations. Specifically, we employ an adversarial transformation network to model the most harmful distortions that can destroy adversarial noises and require the synthesized adversarial samples to become resistant to such adversarial transformations. Extensive experiments on the ImageNet benchmark showcase the superiority of our method to state-of-the-art baselines in attacking both undefended and defended models.