Efficient Scale-Invariant Generator With Column-Row Entangled Pixel Synthesis

Thuan Hoang Nguyen, Thanh Van Le, Anh Tran; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 22408-22417

Abstract


Any-scale image synthesis offers an efficient and scalable solution to synthesize photo-realistic images at any scale, even going beyond 2K resolution. However, existing GAN-based solutions depend excessively on convolutions and a hierarchical architecture, which introduce inconsistency and the "texture sticking" issue when scaling the output resolution. From another perspective, INR-based generators are scale-equivariant by design, but their huge memory footprint and slow inference hinder these networks from being adopted in large-scale or real-time systems. In this work, we propose Column-Row Entangled Pixel Synthesisthes (CREPS), a new generative model that is both efficient and scale-equivariant without using any spatial convolutions or coarse-to-fine design. To save memory footprint and make the system scalable, we employ a novel bi-line representation that decomposes layer-wise feature maps into separate "thick" column and row encodings. Experiments on standard datasets, including FFHQ, LSUN-Church, and MetFaces, confirm CREPS' ability to synthesize scale-consistent and alias-free images up to 4K resolution with proper training and inference speed.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Nguyen_2023_CVPR, author = {Nguyen, Thuan Hoang and Van Le, Thanh and Tran, Anh}, title = {Efficient Scale-Invariant Generator With Column-Row Entangled Pixel Synthesis}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2023}, pages = {22408-22417} }