- [pdf] [supp] [arXiv]
Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
Model robustness against adversarial examples of single perturbation type such as the Lp-norm has been widely studied, yet its generalization to more realistic scenarios involving multiple semantic perturbations and their composition remains largely unexplored. In this paper, we first propose a novel method for generating composite adversarial examples. Our method can find the optimal attack composition by utilizing component-wise projected gradient descent and automatic attack-order scheduling. We then propose generalized adversarial training (GAT) to extend model robustness from Lp-ball to composite semantic perturbations, such as the combination of Hue, Saturation, Brightness, Contrast, and Rotation. Results obtained using ImageNet and CIFAR-10 datasets indicate that GAT can be robust not only to all the tested types of a single attack, but also to any combination of such attacks. GAT also outperforms baseline L-infinity-norm bounded adversarial training approaches by a significant margin.