One-step Diffusion with Distribution Matching Distillation

Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman, Taesung Park; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 6613-6623

Abstract


Diffusion models generate high-quality images but require dozens of forward passes. We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on image quality. We enforce that the one-step image generator matches the diffusion model at the distribution level by minimizing an approximate KL divergence whose gradient can be expressed as the difference between two score functions: one of the target distribution and the other of the synthetic distribution produced by our one-step generator. The score functions are parameterized as two diffusion models trained separately on each distribution. Combined with a simple regression loss matching the large-scale structure of the multi-step diffusion outputs, our method outperforms all published few-step diffusion approaches, reaching 2.62 FID on ImageNet 64x64 and 11.49 FID on zero-shot COCO-30k, comparable to Stable Diffusion but orders of magnitude faster. Utilizing FP16 inference, our model can generate images at 20 FPS on modern hardware.
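The core idea above, that the KL gradient is the difference of two score functions, can be illustrated with a minimal toy sketch (not the paper's implementation): in one dimension, with the real distribution fixed at N(0, 1) and the generator's current samples modeled as N(mu, 1), both scores are known in closed form, and repeatedly pushing samples along the negative score difference drives them toward the real distribution. The function names and the closed-form Gaussian stand-ins are assumptions for illustration only.

```python
import numpy as np

def score_gaussian(x, mean):
    """Score of N(mean, 1): d/dx log p(x) = -(x - mean)."""
    return -(x - mean)

rng = np.random.default_rng(0)
# Generator output starts off-target, clustered around 3 instead of 0.
samples = rng.normal(loc=3.0, scale=1.0, size=4096)

lr = 0.5
for _ in range(20):
    mu_fake = samples.mean()  # stand-in for the "fake" score model's estimate
    # Gradient of the (approximate) KL w.r.t. each sample:
    # score of the synthetic distribution minus score of the target.
    grad = score_gaussian(samples, mu_fake) - score_gaussian(samples, 0.0)
    samples = samples - lr * grad  # step samples toward the real distribution

print(round(float(samples.mean()), 3))  # sample mean driven toward the real mean 0
```

In the paper this same difference is evaluated with two learned diffusion models rather than closed-form Gaussians, and the gradient is backpropagated into the one-step generator's weights instead of moving samples directly.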

Related Material


[bibtex]
@InProceedings{Yin_2024_CVPR,
  author    = {Yin, Tianwei and Gharbi, Micha\"el and Zhang, Richard and Shechtman, Eli and Durand, Fr\'edo and Freeman, William T. and Park, Taesung},
  title     = {One-step Diffusion with Distribution Matching Distillation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {6613-6623}
}