SLI-pSp: Injecting Multi-Scale Spatial Layout in pSp

Aradhya Neeraj Mathur, Anish Madan, Ojaswa Sharma; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 4095-4104

Abstract


We propose SLI-pSp, a general-purpose Image-to-Image (I2I) translation model that encodes spatial layout information as well as style in the generator, using pSp as the base architecture. Previous methods such as pSp have shown promising results by leveraging StyleGAN as a generator across various I2I tasks, but they tend to miss finer or under-represented details in facial images, such as earrings and caps, and break down on complex datasets because of their purely global approach. To address these shortcomings, we propose a technique termed Spatial Layout Injection (SLI) that encodes the spatial layout of the input image into the StyleGAN generator along with style. We do so without modifying the style-vector injection performed through pSp's map2style network; instead, we combine SLI with the noise layers of the StyleGAN generator at multiple spatial scales. This approach preserves the global aspects of image generation while enhancing spatial layout details in the output. Experiments on several challenging datasets and across several I2I tasks highlight the effectiveness of our approach over previous methods, both in the finer details of the generated images and in overall visual quality.
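
The core idea described above, combining layout features with the generator's per-resolution noise inputs while leaving pSp's map2style style path untouched, can be illustrated with a minimal PyTorch-style sketch. The names below (SpatialLayoutEncoder, inject_layout), the chosen resolutions, and the simple additive combination are assumptions made for illustration; they are not taken from the paper's actual implementation.

```python
import torch
import torch.nn as nn


class SpatialLayoutEncoder(nn.Module):
    """Hypothetical encoder producing single-channel layout maps at several spatial scales."""

    def __init__(self, in_channels=3, scales=(8, 16, 32, 64)):
        super().__init__()
        self.scales = scales
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),                               # resize to the target generator resolution
                nn.Conv2d(in_channels, 1, kernel_size=3, padding=1),   # one channel, matching a StyleGAN noise map
            )
            for s in scales
        ])

    def forward(self, x):
        # One layout map per generator resolution.
        return [head(x) for head in self.heads]


def inject_layout(noise_maps, layout_maps, alpha=1.0):
    """Combine per-resolution noise maps with layout maps of matching size (additive sketch)."""
    return [n + alpha * l for n, l in zip(noise_maps, layout_maps)]


if __name__ == "__main__":
    img = torch.randn(2, 3, 256, 256)                                  # input image batch
    encoder = SpatialLayoutEncoder()
    layout = encoder(img)                                              # multi-scale layout maps
    noise = [torch.randn(2, 1, s, s) for s in encoder.scales]          # stand-ins for StyleGAN noise inputs
    combined = inject_layout(noise, layout)
    print([c.shape for c in combined])                                 # one combined map per scale
```

In this reading, spatial detail enters the generator at each resolution through the points where StyleGAN would normally add per-pixel noise, so the style vectors produced by map2style remain unchanged.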

Related Material


BibTeX
@InProceedings{Mathur_2023_WACV,
  author    = {Mathur, Aradhya Neeraj and Madan, Anish and Sharma, Ojaswa},
  title     = {SLI-pSp: Injecting Multi-Scale Spatial Layout in pSp},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2023},
  pages     = {4095-4104}
}