Condition-Aware Neural Network for Controlled Image Generation

Han Cai, Muyang Li, Qinsheng Zhang, Ming-Yu Liu, Song Han; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7194-7203

Abstract


We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weight of the neural network. This is achieved by introducing a condition-aware weight generation module that generates conditional weight for convolution/linear layers based on the input condition. We test CAN on class-conditional image generation on ImageNet and text-to-image generation on COCO. CAN consistently delivers significant improvements for diffusion transformer models, including DiT and UViT. In particular, CAN combined with EfficientViT (CaT) achieves 2.78 FID on ImageNet 512x512, surpassing DiT-XL/2 while requiring 52x fewer MACs per sampling step.
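The idea of generating a layer's weight from the input condition can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the class name `ConditionAwareLinear`, the helper `matvec`, the single linear weight generator, and the choice to add the generated weight to a static base weight are all assumptions made purely for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class ConditionAwareLinear:
    """Hypothetical linear layer whose weight depends on a condition embedding."""

    def __init__(self, in_features, out_features, cond_dim, seed=0):
        rng = random.Random(seed)
        self.in_features = in_features
        self.out_features = out_features
        # Static base weight, shape (out_features, in_features).
        self.base_weight = [
            [rng.gauss(0.0, 0.1) for _ in range(in_features)]
            for _ in range(out_features)
        ]
        # Assumed weight generator: a linear map from the condition embedding
        # to a flattened weight offset, shape (out_features * in_features, cond_dim).
        self.generator = [
            [rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
            for _ in range(out_features * in_features)
        ]

    def conditional_weight(self, cond):
        """Return the weight matrix used for this particular condition."""
        delta = matvec(self.generator, cond)
        return [
            [
                self.base_weight[o][i] + delta[o * self.in_features + i]
                for i in range(self.in_features)
            ]
            for o in range(self.out_features)
        ]

    def forward(self, x, cond):
        """Apply the layer to x using the weight generated from cond."""
        return matvec(self.conditional_weight(cond), x)


layer = ConditionAwareLinear(in_features=4, out_features=3, cond_dim=2)
x = [1.0, 2.0, 3.0, 4.0]
y_a = layer.forward(x, [1.0, 0.0])  # e.g. embedding of condition A
y_b = layer.forward(x, [0.0, 1.0])  # e.g. embedding of condition B
```

In this sketch the same input `x` yields different outputs under the two condition embeddings because the weight itself changes, rather than the condition being injected into the features.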

Related Material


@InProceedings{Cai_2024_CVPR,
    author    = {Cai, Han and Li, Muyang and Zhang, Qinsheng and Liu, Ming-Yu and Han, Song},
    title     = {Condition-Aware Neural Network for Controlled Image Generation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {7194-7203}
}