QuantAttack: Exploiting Quantization Techniques to Attack Vision Transformers

Baras, Amit; Zolfi, Alon; Elovici, Yuval; Shabtai, Asaf

Amit Baras, Alon Zolfi, Yuval Elovici, Asaf Shabtai; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 6730-6740

Abstract

In recent years, there has been a significant trend in deep neural networks (DNNs), particularly transformer-based models, of developing ever-larger and more capable models. While they demonstrate state-of-the-art performance, their growing scale requires increased computational resources (e.g., GPUs with greater memory capacity). To address this problem, quantization techniques (i.e., low-bit-precision representation and matrix multiplication) have been proposed. Most quantization techniques employ a static strategy in which the model parameters are quantized, either during training or inference, without considering the test-time sample. In contrast, dynamic quantization techniques, which have become increasingly popular, adapt during inference based on the input provided, while maintaining full-precision performance. However, their dynamic behavior and average-case performance assumption make them vulnerable to a novel threat vector: adversarial attacks that target the model's efficiency and availability. In this paper, we present QuantAttack, a novel attack that targets the availability of quantized vision transformers, slowing down the inference and increasing memory usage and energy consumption. The source code is available online.
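
To make the idea of input-dependent cost concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name a specific quantization scheme, so the sketch assumes one common family of dynamic techniques: mixed-precision decomposition, in which feature columns containing large-magnitude "outlier" values are multiplied in full precision while the rest go through int8. The function names, the threshold of 6.0, and the toy inputs are all hypothetical.

```python
def absmax_quantize(vec):
    """Scale a vector into the int8 range [-127, 127]; return (ints, scale)."""
    peak = max((abs(v) for v in vec), default=0.0)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(v / scale) for v in vec], scale


def mixed_precision_matmul(x, w, threshold=6.0):
    """Multiply x (n x k) by w (k x m) with an input-dependent precision split.

    Assumption: a feature column of x is an "outlier" column if any entry
    has magnitude >= threshold. Outlier columns use the float path; all
    other columns use the int8 path. Returns (result, outlier_column_count).
    """
    n, k, m = len(x), len(w), len(w[0])
    outliers = [j for j in range(k)
                if any(abs(x[i][j]) >= threshold for i in range(n))]
    regular = [j for j in range(k) if j not in outliers]

    # Quantize each row of x and each column of w over the regular features.
    xq = [absmax_quantize([row[j] for j in regular]) for row in x]
    wq = [absmax_quantize([w[j][c] for j in regular]) for c in range(m)]

    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        xi, sx = xq[i]
        for c in range(m):
            wc, sw = wq[c]
            acc = sum(a * b for a, b in zip(xi, wc))  # integer accumulation
            out[i][c] = acc * sx * sw                 # dequantize int8 part
            # Full-precision path: its cost grows with the outlier count.
            out[i][c] += sum(x[i][j] * w[j][c] for j in outliers)
    return out, len(outliers)


# Toy weights and inputs (hypothetical values).
w = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
benign = [[1.0, -2.0, 0.5]]    # no entry reaches the threshold
crafted = [[1.0, -9.0, 7.5]]   # two features pushed past the threshold

benign_out, benign_outliers = mixed_precision_matmul(benign, w)
crafted_out, crafted_outliers = mixed_precision_matmul(crafted, w)
print("outlier columns (benign): ", benign_outliers)
print("outlier columns (crafted):", crafted_outliers)
```

In this toy setting the benign input stays entirely on the int8 path, whereas the crafted input sends two of its three feature columns through the full-precision path. Under the stated assumption, an input that maximizes the number of such columns pushes the layer toward its worst-case cost, which is the kind of efficiency and availability threat the abstract describes.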

Related Material

[bibtex]

@InProceedings{Baras_2025_WACV,
    author    = {Baras, Amit and Zolfi, Alon and Elovici, Yuval and Shabtai, Asaf},
    title     = {QuantAttack: Exploiting Quantization Techniques to Attack Vision Transformers},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {6730-6740}
}