Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings

James Thewlis, Hakan Bilen, Andrea Vedaldi; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 5916-5925

Abstract


Automatically learning the structure of object categories remains an important open problem in computer vision. We propose a novel unsupervised approach that can discover and learn to detect landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object articulation, by learning a deep neural network that detects landmarks compatible with such visual effects. We show that, by requiring the same neural network to be applicable to different object instances, our method naturally induces meaningful correspondences between different object instances in a category. We assess the method qualitatively on a variety of object types, natural and man-made. We also show that our unsupervised landmarks are highly predictive of manually-annotated landmarks in face benchmark datasets, and can be used to regress those with a high degree of accuracy.
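
One way to read "landmarks compatible with such visual effects" is as a consistency requirement: a landmark detected in a deformed image should agree with the deformed landmark of the original image. The sketch below illustrates that reading on a toy case only. It is not the authors' network or training loss: the detector (brightest pixel), the deformation (a cyclic translation), and all function names are assumptions introduced for illustration.

```python
import numpy as np


def detect_landmark(image):
    """Toy stand-in for a learned detector: (row, col) of the brightest pixel."""
    row, col = np.unravel_index(np.argmax(image), image.shape)
    return np.array([row, col])


def warp_image(image, shift):
    """Hypothetical deformation: cyclic translation by (d_row, d_col)."""
    return np.roll(image, shift=tuple(shift), axis=(0, 1))


def warp_point(point, shift, shape):
    """Apply the same translation to a landmark coordinate, with wrap-around."""
    return (point + np.asarray(shift)) % np.asarray(shape)


def equivariance_error(image, shift):
    """Distance between the landmark detected in the warped image
    and the warped landmark of the original image."""
    detected_after = detect_landmark(warp_image(image, shift))
    warped_before = warp_point(detect_landmark(image), shift, image.shape)
    return float(np.linalg.norm(detected_after - warped_before))


# Synthetic image with one unambiguous bright spot (assumed data).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
image[5, 7] = 2.0

error = equivariance_error(image, shift=(3, -4))
print(error)
```

For this toy detector the error is zero by construction. In a learned setting, a quantity of this kind could presumably serve as a training signal rather than a property that holds automatically, but the abstract does not specify the loss, so that remains an assumption here.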

Related Material


[bibtex]
@InProceedings{Thewlis_2017_ICCV,
author = {Thewlis, James and Bilen, Hakan and Vedaldi, Andrea},
title = {Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}