Preference-Based Image Generation

Hadi Kazemi, Fariborz Taherkhani, Nasser Nasrabadi; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 3404-3413


Deep generative models are a set of promising methods, that are able to model complex data and generate new samples. In principle, they learn to map a random latent code sampled from a prior distribution into high dimensional data space, such as image space. However, these models have limited utilities as the user has minimal control over what the network produces. Despite the success of some recent work in learning an interpretable latent code, the field still lacks a coherent framework to learn a fully interpretable latent code, without any random part for sample diversity. Consequently, it is generally hard, if not impossible, for a non-expert user to produce the desired image by tuning the random and interpretable parts of the latent code. In this paper, we introduce the Preference-Based Image Generation (PbIG), a new method to retrieve the corresponding latent code of the user's mental image. We propose to adopt preference-based reinforcement learning, which learns from a user's judgment of the generated images by a pre-trained generative model. Since the proposed method is completely decoupled from the training stage of the underlying generative models, it can easily be adopted by any method, such as GANs and VAEs. We evaluate the effectiveness of PbIG framework using a set of experiments on baseline datasets using a pretraind StackGAN++.

Related Material

[pdf] [video]
author = {Kazemi, Hadi and Taherkhani, Fariborz and Nasrabadi, Nasser},
title = {Preference-Based Image Generation},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020}