FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
Abstract
Post-training quantization (PTQ) has stood out as a cost-effective and promising model compression approach over recent years, as it eliminates the need for retraining on the entire dataset. Unfortunately, most existing PTQ methods for Vision Transformers (ViTs) exhibit a notable drop in accuracy, especially in low-bit cases. To tackle these challenges, we analyze the extensively utilized Hessian-guided quantization loss, and uncover certain limitations within the approximated pre-activation Hessian. Following the block-by-block reconstruction paradigm of PTQ, we first derive a quantization loss based on the Fisher Information Matrix (FIM). Due to the large scale of the complete FIM, we establish the relationship between KL divergence and FIM in the PTQ scenario to enable fast computation of the quantization loss during reconstruction. Subsequently, we develop a Diagonal Plus Low-Rank (DPLR) estimation on FIM to achieve a more nuanced quantization loss. Our extensive experiments, conducted across various vision tasks with distinct representative ViT-based architectures on public benchmark datasets, demonstrate that our method outperforms the state-of-the-art approaches, especially in the case of low-bit quantization. The source code is available at https://github.com/ShiheWang/FIMA-Q.
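To make the diagonal-plus-low-rank (DPLR) idea concrete, the following is a minimal PyTorch sketch of a quantization loss built from a DPLR approximation of the empirical Fisher Information Matrix. The function name, the choice of rank, and the use of per-sample gradients of the block output are illustrative assumptions for this sketch, not the paper's actual formulation; the authors' implementation is in the linked repository.

import torch

def dplr_fisher_loss(delta, grads, rank=4, eps=1e-6):
    """Illustrative DPLR Fisher quantization loss (not the FIMA-Q implementation).

    delta: (B, D) difference between quantized and full-precision block outputs.
    grads: (B, D) per-sample gradients of the task loss w.r.t. the block output,
           giving an empirical Fisher F ~= (1/B) G^T G.
    """
    B, _ = grads.shape
    # Low-rank part of the Fisher via truncated SVD: G ~= U diag(S) V^T,
    # so F ~= V diag(S^2 / B) V^T with V of shape (D, rank).
    _, S, V = torch.svd_lowrank(grads, q=rank)
    scale = S.pow(2) / B                                   # (rank,)
    # Diagonal correction so diag(D + low-rank) matches the empirical Fisher diagonal.
    fisher_diag = grads.pow(2).mean(dim=0)                 # (D,)
    lowrank_diag = (V.pow(2) * scale).sum(dim=1)           # (D,)
    diag = (fisher_diag - lowrank_diag).clamp_min(eps)     # (D,)
    # Quadratic form delta^T (diag + V diag(scale) V^T) delta, averaged over the batch.
    quad_diag = (delta.pow(2) * diag).sum(dim=1)           # (B,)
    proj = delta @ V                                       # (B, rank)
    quad_lowrank = (proj.pow(2) * scale).sum(dim=1)        # (B,)
    return (quad_diag + quad_lowrank).mean()

# Toy usage with random tensors standing in for a block's outputs and gradients.
torch.manual_seed(0)
fp_out = torch.randn(32, 128)
q_out = fp_out + 0.01 * torch.randn(32, 128)
g = torch.randn(32, 128)
loss = dplr_fisher_loss(q_out - fp_out, g)

In a block-by-block PTQ reconstruction loop, a loss of this form would replace the plain mean-squared output error, weighting output perturbations by the (approximate) curvature of the task loss rather than treating all output dimensions equally.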
Related Material

@InProceedings{Wu_2025_CVPR,
    author    = {Wu, Zhuguanyu and Wang, Shihe and Zhang, Jiayi and Chen, Jiaxin and Wang, Yunhong},
    title     = {FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {14891-14900}
}