Leveraging Datasets With Varying Annotations for Face Alignment via Deep Regression Network

Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 3801-3809

Abstract


Facial landmark detection, a vital topic in computer vision, has been studied for decades, and many datasets have been collected for evaluation. These datasets usually carry different annotations, e.g., a 68-landmark markup for the LFPW dataset versus a 74-landmark markup for the GTAV dataset. Intuitively, it is meaningful to fuse all the datasets and predict the union of all types of landmarks from the multiple datasets (i.e., transfer the annotations of each dataset to all other datasets), but this problem is nontrivial due to the distribution discrepancy between datasets and the absence of complete annotations of all types on any single dataset. In this work, we propose a deep regression network coupled with sparse shape regression (DRN-SSR) to predict the union of all types of landmarks by leveraging datasets with varying annotations, each dataset carrying one type of annotation. Specifically, the deep regression network predicts the union of all landmarks, while the sparse shape regression approximates the undefined landmarks on each dataset so as to guide the learning of the deep regression network for face alignment. Extensive experiments on two challenging datasets, IBUG and GLF, demonstrate that our method can effectively leverage multiple datasets with different annotations to predict the union of all types of landmarks.
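
To make the role of the sparse shape regression step more concrete, the sketch below fills in the landmarks a training face lacks by sparsely combining exemplar shapes that cover the union markup, yielding pseudo ground truth that could supervise a regression network. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name fill_missing_landmarks, the choice of scikit-learn's Lasso as the sparse solver, and the assumption that a dictionary of full union shapes is available are all hypothetical.

# Minimal sketch (illustrative, not the paper's code): complete a partially
# annotated shape via a sparse combination of exemplar union shapes.
import numpy as np
from sklearn.linear_model import Lasso

def fill_missing_landmarks(partial_shape, known_idx, shape_dictionary, alpha=0.01):
    # partial_shape    : (k, 2) landmarks this dataset annotates
    # known_idx        : indices of those landmarks within the union markup
    # shape_dictionary : (n_exemplars, n_union, 2) exemplar shapes covering the
    #                    full union markup (an assumption made for this sketch)
    n_exemplars, n_union, _ = shape_dictionary.shape
    # Restrict exemplars to the annotated landmarks and flatten -> design matrix.
    D_known = shape_dictionary[:, known_idx, :].reshape(n_exemplars, -1).T
    x_known = partial_shape.reshape(-1)
    # Sparse coefficients over exemplar shapes (the "sparse" in SSR).
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D_known, x_known)
    w = coder.coef_
    # Reconstruct the full union shape from the same sparse combination.
    D_full = shape_dictionary.reshape(n_exemplars, -1).T
    full_shape = (D_full @ w).reshape(n_union, 2)
    # Keep the landmarks that were actually annotated.
    full_shape[known_idx] = partial_shape
    return full_shape

# Toy usage: a 68-point annotation embedded in a hypothetical 80-point union.
rng = np.random.default_rng(0)
dictionary = rng.normal(size=(50, 80, 2))      # 50 exemplar union shapes
known = np.arange(68)                           # this dataset's markup
observed = dictionary[0, known] + 0.01 * rng.normal(size=(68, 2))
completed = fill_missing_landmarks(observed, known, dictionary)
print(completed.shape)                          # (80, 2) pseudo ground truth

The completed shapes would then serve as regression targets for the undefined landmarks, so that every training sample, regardless of its native markup, supervises the full union output of the deep regression network.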

Related Material


[pdf]
[bibtex]
@InProceedings{Zhang_2015_ICCV,
author = {Zhang, Jie and Kan, Meina and Shan, Shiguang and Chen, Xilin},
title = {Leveraging Datasets With Varying Annotations for Face Alignment via Deep Regression Network},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015},
pages = {3801-3809}
}