DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for Text-to-Image Generation

Zhiwei Wang, Jing Yang, Jiajun Cui, Jiawei Liu, Jiahao Wang; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 160-176

Abstract


Synthesizing an image from a given text encounters two major challenges: the integrity of the image and the consistency of text-image pairs. Although decent performance has been achieved, two crucial problems are still not adequately considered. (i) The object frame is prone to deviate or collapse, making subsequent refinement unavailable. (ii) The non-target regions of the image are affected by the text, whose meaning is largely conveyed through phrases rather than individual words; current methods barely go beyond word-level cues, leaving the coherent implications of phrases broken. To tackle these issues, we propose the Dual Auxiliary Consistency Generative Adversarial Network (DAC-GAN). Specifically, we simplify generation with a single-stage structure equipped with dual auxiliary modules: (1) a Class-Aware skeleton Consistency (CAC) module retains the integrity of the image by exploiting additional supervision from prior knowledge, and (2) a Multi-label-Aware Consistency (MAC) module strengthens the alignment of text-image pairs at the phrase level. Comprehensive experiments on two widely used datasets show that DAC-GAN maintains the integrity of the target and enhances the consistency of text-image pairs.
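To make the abstract's structure concrete, the sketch below shows one plausible way a single-stage generator objective could combine an adversarial term with the two auxiliary consistency terms (CAC and MAC). The specific loss forms, weights, and function names here are illustrative assumptions for exposition, not the paper's actual formulation.

```python
import math

def adversarial_loss(d_fake_score: float) -> float:
    # Non-saturating generator loss: -log D(G(z, text)). Assumed form.
    return -math.log(max(d_fake_score, 1e-8))

def cac_loss(pred_skeleton, prior_skeleton) -> float:
    # Class-Aware skeleton Consistency (CAC), sketched as an L2 penalty
    # between the generated object skeleton and a class-level prior.
    return sum((p - q) ** 2 for p, q in zip(pred_skeleton, prior_skeleton))

def mac_loss(image_labels, text_labels) -> float:
    # Multi-label-Aware Consistency (MAC), sketched as multi-label binary
    # cross-entropy between labels predicted from the image and labels
    # derived from the text's phrases.
    eps = 1e-8
    return -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
                for p, t in zip(image_labels, text_labels)) / len(text_labels)

def generator_loss(d_fake_score, pred_skel, prior_skel,
                   img_labels, txt_labels, lam_cac=1.0, lam_mac=1.0) -> float:
    # Total single-stage objective: adversarial loss plus the two weighted
    # auxiliary consistency terms (weights lam_cac, lam_mac are assumed).
    return (adversarial_loss(d_fake_score)
            + lam_cac * cac_loss(pred_skel, prior_skel)
            + lam_mac * mac_loss(img_labels, txt_labels))
```

The point of the dual-auxiliary design, as the abstract describes it, is that both consistency signals supervise a single-stage generator directly, rather than relying on stacked refinement stages.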

Related Material


[pdf]
[bibtex]
@InProceedings{Wang_2022_ACCV,
  author    = {Wang, Zhiwei and Yang, Jing and Cui, Jiajun and Liu, Jiawei and Wang, Jiahao},
  title     = {DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for Text-to-Image Generation},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2022},
  pages     = {160-176}
}