Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis

Lu, Yanzuo; Ren, Yuxi; Xia, Xin; Lin, Shanchuan; Wang, Xing; Xiao, Xuefeng; Ma, Andy J.; Xie, Xiaohua; Lai, Jian-Huang

Yanzuo Lu, Yuxi Ren, Xin Xia, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Andy J. Ma, Xiaohua Xie, Jian-Huang Lai; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 16818-16829

Abstract

Distribution Matching Distillation (DMD) is a promising score distillation technique that compresses pre-trained teacher diffusion models into efficient one-step or multi-step student generators. Nevertheless, its reliance on reverse Kullback-Leibler (KL) divergence minimization potentially induces mode collapse (or mode-seeking) in certain applications. To circumvent this inherent drawback, we propose Adversarial Distribution Matching (ADM), a novel framework that leverages diffusion-based discriminators to align the latent predictions between real and fake score estimators for score distillation in an adversarial manner. In the context of extremely challenging one-step distillation, we further improve the pre-trained generator by adversarial distillation with hybrid discriminators in both latent and pixel spaces. Different from the mean squared error used in DMD2 pre-training, our method incorporates a distributional loss on ODE pairs collected from the teacher model, thus providing a better initialization for score distillation fine-tuning in the next stage. By combining the adversarial distillation pre-training with ADM fine-tuning into a unified pipeline termed DMDX, our proposed method achieves superior one-step performance on SDXL compared to DMD2 while consuming less GPU time. Additional experiments that apply multi-step ADM distillation on SD3-Medium, SD3.5-Large, and CogVideoX set a new benchmark towards efficient image and video synthesis.
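The mode-seeking behaviour attributed to reverse KL above can be illustrated with a small numerical toy. The sketch below is not the paper's method: the bimodal target, the two candidate approximations, the grid, and all function names are assumptions chosen purely for illustration. It compares the reverse KL, KL(q || p), and the forward KL, KL(p || q), for a candidate that covers a single mode and a candidate that spreads over both.

```python
import math


def gaussian(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def normalize(weights):
    """Turn non-negative weights into a discrete probability distribution."""
    total = sum(weights)
    return [w / total for w in weights]


def kl(a, b):
    """Discrete KL(a || b); terms with a_i == 0 contribute nothing."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)


# Illustrative grid on [-8, 8] with step 0.01 (an assumption, not from the paper).
grid = [-8 + 0.01 * i for i in range(1601)]

# Toy bimodal "teacher" distribution with modes at -3 and +3.
p = normalize([0.5 * gaussian(x, -3, 0.5) + 0.5 * gaussian(x, 3, 0.5) for x in grid])

# Two toy "student" candidates: one sits on a single mode, one covers both.
q_single = normalize([gaussian(x, 3, 0.5) for x in grid])
q_broad = normalize([gaussian(x, 0, 3) for x in grid])

reverse_single = kl(q_single, p)  # reverse KL: expectation taken under the student
reverse_broad = kl(q_broad, p)
forward_single = kl(p, q_single)  # forward KL: expectation taken under the teacher
forward_broad = kl(p, q_broad)

print(f"reverse KL: single-mode={reverse_single:.3f}, broad={reverse_broad:.3f}")
print(f"forward KL: single-mode={forward_single:.3f}, broad={forward_broad:.3f}")
```

In this toy setting the reverse KL is lower for the single-mode candidate, while the forward KL is lower for the broad one, which is the sense in which minimizing reverse KL tends to be mode-seeking.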

Related Material

@InProceedings{Lu_2025_ICCV,
    author    = {Lu, Yanzuo and Ren, Yuxi and Xia, Xin and Lin, Shanchuan and Wang, Xing and Xiao, Xuefeng and Ma, Andy J. and Xie, Xiaohua and Lai, Jian-Huang},
    title     = {Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {16818-16829}
}