TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7362-7371

Abstract


The diffusion model, a prevalent framework for image generation, encounters significant challenges to broad applicability due to its extended inference times and substantial memory requirements. Efficient post-training quantization (PTQ) is pivotal for addressing these issues in traditional models. Unlike traditional models, diffusion models heavily depend on the time-step t to achieve satisfactory multi-round denoising. Usually, t from the finite set {1, ..., T} is encoded into a temporal feature by a few modules that are entirely independent of the sampling data. However, existing PTQ methods do not optimize these modules separately. They adopt inappropriate reconstruction targets and complex calibration methods, resulting in severe disturbance of the temporal feature and denoising trajectory, as well as low compression efficiency. To solve these problems, we propose a Temporal Feature Maintenance Quantization (TFMQ) framework built upon a Temporal Information Block, which is related only to the time-step t and unrelated to the sampling data. Powered by this pioneering block design, we devise temporal information aware reconstruction (TIAR) and finite set calibration (FSC) to align the full-precision temporal features within a limited time. Equipped with this framework, we can maintain most of the temporal information and ensure end-to-end generation quality. Extensive experiments on various datasets and diffusion models demonstrate our state-of-the-art results. Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization. Additionally, our method incurs almost no extra computational cost and accelerates quantization time by 2.0x on LSUN-Bedrooms 256x256 compared to previous works. Our code is publicly available at https://github.com/ModelTC/TFMQ-DM.
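To make the quantization terms in the abstract concrete, the following is a minimal, hypothetical sketch, not the authors' implementation: it applies symmetric uniform 4-bit weight quantization to a toy time-embedding layer and exploits the "finite set" observation that the time-step t ranges only over {1, ..., T}, so calibration error can be measured on every temporal feature exactly rather than on sampled image data. All sizes, names, and the quantizer choice here are illustrative assumptions.

```python
import math
import random

T, DIM = 10, 8  # illustrative number of time-steps and embedding dimension

random.seed(0)
# Toy full-precision weight of a time-embedding linear layer (hypothetical).
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]

def sinusoidal_embed(t, dim=DIM):
    """Standard sinusoidal time-step embedding, as used in diffusion U-Nets."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

def quantize(weights, bits=4):
    """Symmetric uniform quantizer: snap each weight to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for row in weights for w in row) / qmax
    return [[max(-qmax, min(qmax, round(w / scale))) * scale for w in row]
            for row in weights]

def matvec(mat, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in mat]

Wq = quantize(W, bits=4)

# Finite-set calibration idea: since t only takes T distinct values, we can
# evaluate the quantization error on *all* temporal features, not a sample.
mse = sum(
    sum((a - b) ** 2
        for a, b in zip(matvec(W, sinusoidal_embed(t)),
                        matvec(Wq, sinusoidal_embed(t)))) / DIM
    for t in range(1, T + 1)
) / T
print(mse)
```

In a real pipeline the quantizer parameters (here just a single scale) would be optimized against this finite set of temporal features, which is what makes calibrating the time-embedding modules cheap compared to data-driven reconstruction.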

Related Material


[bibtex]
@InProceedings{Huang_2024_CVPR,
    author    = {Huang, Yushi and Gong, Ruihao and Liu, Jing and Chen, Tianlong and Liu, Xianglong},
    title     = {TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {7362-7371}
}