Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance

Zhixin Shu, Mihir Sahasrabudhe, Riza Alp Guler, Dimitris Samaras, Nikos Paragios, Iasonas Kokkinos; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 650-665


In this work we introduce the Deforming Autoencoder, a generative model for images that disentangles shape from appearance in a latent representation space learned in a fully unsupervised manner. As in the deformable template paradigm, shape is represented as a diffeomorphism between a canonical coordinate system (the "template") and an observed image, while appearance is modeled in template coordinates, thus discarding variability due to deformations. We introduce novel techniques that allow this approach to be deployed in the setting of autoencoders and show that this method can be used for unsupervised group-wise image alignment. We show experiments with expression morphing in humans, hands, and digits; face manipulation, such as shape and appearance interpolation; and unsupervised landmark localization. A more powerful form of unsupervised disentangling becomes possible in template coordinates, allowing us to successfully decompose face images into shading and albedo, and further manipulate face images.
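
As a rough illustration of the deformable template idea described above, the following sketch synthesizes an image by sampling an appearance map, defined in template coordinates, through a deformation grid. It is not the authors' implementation: the function names `warp_texture` and `identity_grid`, the use of bilinear sampling, the border clamping, and the toy 4x4 texture are all assumptions chosen for illustration.

```python
import numpy as np


def identity_grid(h, w):
    """Hypothetical helper: grid[i, j] = (i, j), i.e. no deformation."""
    rows, cols = np.meshgrid(
        np.arange(h, dtype=float), np.arange(w, dtype=float), indexing="ij"
    )
    return np.stack([rows, cols], axis=-1)


def warp_texture(texture, grid):
    """Bilinearly sample a 2D `texture` at the template coordinates in `grid`.

    grid has shape (H_out, W_out, 2); grid[i, j] holds the (row, col)
    template location that output pixel (i, j) reads its appearance from.
    Coordinates outside the template are clamped to the border (an assumption).
    """
    h, w = texture.shape
    rows = np.clip(grid[..., 0], 0, h - 1)
    cols = np.clip(grid[..., 1], 0, w - 1)

    # Integer corners surrounding each sampling location.
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)

    # Fractional offsets used as interpolation weights.
    fr = rows - r0
    fc = cols - c0

    top = (1 - fc) * texture[r0, c0] + fc * texture[r0, c1]
    bottom = (1 - fc) * texture[r1, c0] + fc * texture[r1, c1]
    return (1 - fr) * top + fr * bottom


# Toy appearance map in template coordinates.
texture = np.arange(16, dtype=float).reshape(4, 4)

# With the identity grid, the observed image equals the template appearance.
grid = identity_grid(4, 4)
same = warp_texture(texture, grid)

# A simple deformation: every output pixel reads one column to the right.
shifted_grid = grid.copy()
shifted_grid[..., 1] += 1.0
shifted = warp_texture(texture, shifted_grid)
```

Here the texture and the grid are fixed arrays so that the warping step can be inspected in isolation. In an autoencoder setting they would presumably be produced by decoders from separate appearance and shape codes, which is what lets the two factors be varied independently.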

Related Material

author = {Shu, Zhixin and Sahasrabudhe, Mihir and Guler, Riza Alp and Samaras, Dimitris and Paragios, Nikos and Kokkinos, Iasonas},
title = {Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}