Cross-modal Latent Space Alignment for Image to Avatar Translation

Manuel Ladron de Guevara, Jose Echevarria, Yijun Li, Yannick Hold-Geoffroy, Cameron Smith, Daichi Ito; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 520-529

Abstract


We present a novel method for automatic vectorized avatar generation from a single portrait image. Most existing avatar-creation approaches rely on image-to-image translation, which has limitations when the output must support 3D rendering, animation, or video. Instead, we leverage modality-specific autoencoders trained on large-scale unpaired sets of portraits and parametric avatars, and then learn a mapping between the two modalities via an alignment module trained on a significantly smaller amount of data. The resulting cross-modal latent space preserves facial identity, producing more visually appealing and higher-fidelity avatars than previous methods, as supported by our quantitative and qualitative evaluations. Moreover, because our method is resolution-independent, it is highly versatile and applicable in a wide range of settings.
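The pipeline sketched in the abstract (two modality-specific autoencoders trained separately, plus a lightweight alignment module fit on far less data) can be illustrated with a toy linear example. Everything below is an illustrative assumption, not the paper's implementation: the dimensions are arbitrary, the "autoencoders" are random linear projections with pseudo-inverse decoders, and the alignment module is a simple least-squares map between the two latent spaces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): flattened portrait features,
# parametric avatar parameters, and a shared latent size per modality.
d_img, d_av, d_lat = 64, 16, 8

# Stand-ins for the two modality-specific autoencoders: linear encoders with
# pseudo-inverse decoders. In the paper these are learned on large unpaired sets.
W_enc_img = rng.standard_normal((d_lat, d_img)) / np.sqrt(d_img)
W_enc_av = rng.standard_normal((d_lat, d_av)) / np.sqrt(d_av)
W_dec_av = np.linalg.pinv(W_enc_av)  # avatar decoder: latent -> parameters

def encode_image(x):
    """Portrait features -> image latent."""
    return x @ W_enc_img.T

def encode_avatar(a):
    """Avatar parameters -> avatar latent."""
    return a @ W_enc_av.T

def decode_avatar(z):
    """Avatar latent -> avatar parameters."""
    return z @ W_dec_av.T

# Small paired set used only to fit the alignment module, mirroring the idea
# that alignment needs far less data than the autoencoders themselves.
n_pairs = 100
A = rng.standard_normal((n_pairs, d_av))           # toy avatar parameters
X = A @ rng.standard_normal((d_av, d_img))         # correlated toy portraits

Z_img = encode_image(X)
Z_av = encode_avatar(A)

# Alignment module: least-squares linear map, image latent -> avatar latent.
M, *_ = np.linalg.lstsq(Z_img, Z_av, rcond=None)

def image_to_avatar(x):
    """Full translation: encode portrait, align latents, decode avatar."""
    return decode_avatar(encode_image(x) @ M)

avatar = image_to_avatar(X[:1])
print(avatar.shape)  # one avatar parameter vector per input portrait
```

Because the avatar side decodes to a parameter vector rather than pixels, the output stays resolution-independent, which is the property the abstract highlights; the real alignment module and autoencoders are learned networks rather than linear maps.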

Related Material


@InProceedings{de_Guevara_2023_ICCV,
    author    = {de Guevara, Manuel Ladron and Echevarria, Jose and Li, Yijun and Hold-Geoffroy, Yannick and Smith, Cameron and Ito, Daichi},
    title     = {Cross-modal Latent Space Alignment for Image to Avatar Translation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {520-529}
}