-
[pdf]
[bibtex]@InProceedings{Wang_2025_CVPR, author = {Wang, Jialai and Wu, Yuxiao and Xu, Weiye and Huang, Yating and Zhang, Chao and Li, Zongpeng and Xu, Mingwei and Liang, Zhenkai}, title = {Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation}, booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)}, month = {June}, year = {2025}, pages = {20103-20112} }
Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation
Abstract
Vision Transformers (ViTs) have experienced significant progress and are quantized for deployment in resource-constrained applications. Quantized models are vulnerable to targeted bit-flip attacks (BFAs). A targeted BFA prepares a trigger and a corresponding Trojan/backdoor, inserting the latter (with RowHammer bit flipping) into a victim model, to mislead its classification on samples containing the trigger. Existing targeted BFAs on quantized ViTs are limited in that: (1) they require numerous bit-flips, and (2) the separation between flipped bits is below 4 KB, making attacks infeasible with RowHammer in real-world scenarios. We propose a new and practical targeted attack Flip-S against quantized ViTs. The core insight is that in quantized models, a scale factor change ripples through a batch of model weights. Consequently, flipping bits in scale factors, rather than solely in model weights, enables more cost-effective attacks. We design a Scale-Factor-Search (SFS) algorithm to identify critical bits in scale factors for flipping, and adopt a mutual exclusion strategy to guarantee a 4 KB separation between flips. We evaluate Flip-S on CIFAR-10 and ImageNet datasets across five ViT architectures and two quantization levels. Results show that Flip-S achieves attack success rate (ASR) exceeding 90.0% on all models with 50 bits flipped, outperforming baselines with ASR typically below 80.0%. Furthermore, compared to the SOTA, Flip-S reduces the number of required bit-flips by 8x-20xwhile reaching equal or higher ASR. Our source code is publicly available.
Related Material