[bibtex]
@InProceedings{Miao_2024_CVPR,
    author    = {Miao, Zichen and Wang, Jiang and Wang, Ze and Yang, Zhengyuan and Wang, Lijuan and Qiu, Qiang and Liu, Zicheng},
    title     = {Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {10844-10853}
}
Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning
Abstract
Diffusion models have demonstrated unprecedented capabilities in image generation. Yet they incorporate and amplify the data bias (e.g., gender, age) of the original training set, limiting the diversity of generated images. In this paper, we propose a diversity-oriented fine-tuning method using reinforcement learning (RL) for diffusion models, under the guidance of an image-set-based reward function. Specifically, the proposed reward function, denoted as Diversity Reward, utilizes a set of generated images to evaluate the coverage of the current generative distribution w.r.t. the reference distribution, which is represented by a set of unbiased images. Built on top of a probabilistic method for distribution discrepancy estimation, Diversity Reward can efficiently measure the relative distribution gap from a small set of images. We further formulate the diffusion process as a multi-step decision-making problem (MDP) and apply policy gradient methods to fine-tune diffusion models by maximizing the Diversity Reward. The proposed reward is validated on a post-sampling selection task, where a subset of the most diverse images is selected based on Diversity Reward values. We also show the effectiveness of our RL fine-tuning framework in enhancing the diversity of image generation with different types of diffusion models, including class-conditional models and text-conditional models, e.g., StableDiffusion.
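The abstract describes two components: a set-based reward that scores how well generated samples cover a reference distribution, and a policy-gradient loop that treats the denoising trajectory as an MDP and fine-tunes the model to maximize that reward. The sketch below is a toy illustration of this recipe, not the authors' implementation: it uses a small Gaussian step policy in place of a real diffusion model, and substitutes a negative RBF-kernel MMD for the paper's probabilistic Diversity Reward. All names (mmd_reward, StepPolicy) and hyperparameters are illustrative assumptions.

```python
# Toy sketch (not the authors' code): REINFORCE-style fine-tuning of a
# Gaussian denoising policy with a set-level reward. Negative kernel MMD
# stands in for the paper's Diversity Reward.
import torch

def mmd_reward(gen, ref, bandwidth=1.0):
    """Set-level reward: -MMD^2 between a generated set and a reference set."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    return -(k(gen, gen).mean() - 2 * k(gen, ref).mean() + k(ref, ref).mean())

class StepPolicy(torch.nn.Module):
    """One denoising step x_t -> x_{t-1}, modeled as a learned Gaussian."""
    def __init__(self, dim):
        super().__init__()
        self.mean_net = torch.nn.Sequential(
            torch.nn.Linear(dim + 1, 64), torch.nn.Tanh(), torch.nn.Linear(64, dim))
        self.log_std = torch.nn.Parameter(torch.zeros(dim))

    def dist(self, x, t):
        t_feat = torch.full((x.shape[0], 1), float(t))
        mu = self.mean_net(torch.cat([x, t_feat], dim=1))
        return torch.distributions.Normal(mu, self.log_std.exp())

dim, T, batch = 2, 5, 64
policy = StepPolicy(dim)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
ref = torch.randn(256, dim) * torch.tensor([1.0, 3.0])  # "unbiased" reference set

for step in range(200):
    x = torch.randn(batch, dim)        # x_T ~ N(0, I)
    log_prob = torch.zeros(batch)
    for t in range(T, 0, -1):          # roll out the T-step denoising MDP
        d = policy.dist(x, t)
        x = d.sample()                 # non-differentiable action sample
        log_prob = log_prob + d.log_prob(x).sum(dim=1)
    reward = mmd_reward(x, ref)        # one scalar reward for the whole set
    # REINFORCE: every trajectory in the batch shares the set-level reward.
    loss = -(reward.detach() * log_prob).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

The design point this mirrors is that the reward is computed on the generated set as a whole, so every sampled trajectory receives the same scalar signal in the policy-gradient update rather than a per-image score.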