Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose

Seyoung Park, Song-Chun Zhu; The IEEE International Conference on Computer Vision (ICCV), 2015, pp. 2372-2380

Abstract


In this paper, we are interested in developing compositional models that explicitly represent pose, parts, and attributes, and in tackling the tasks of attribute recognition, pose estimation, and part localization jointly. This differs from the recent trend of using CNN-based approaches trained and tested on these tasks separately with large amounts of data. Conventional attribute models typically apply a large number of region-based attribute classifiers to the parts produced by a pre-trained pose estimator, without explicitly detecting the object or its parts, or considering the correlations between attributes. In contrast, our approach jointly represents both the object parts and their semantic attributes within a unified compositional hierarchy. We apply our attributed grammar model to the task of human parsing by simultaneously performing part localization and attribute recognition. We show that our model improves performance on the pose estimation task and also outperforms existing methods on the attribute prediction task.
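To make the idea of a unified compositional hierarchy concrete, the sketch below (not the authors' implementation; all names and the max-pooling aggregation rule are illustrative assumptions) shows one way a part node could carry both a location for part localization and semantic attribute scores, with attribute evidence shared up the hierarchy:

```python
# Illustrative sketch only: a compositional hierarchy whose nodes jointly
# carry geometry (part location) and semantics (attribute scores), so that
# part localization and attribute recognition operate on one structure.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PartNode:
    name: str                                   # e.g. "head", "torso" (hypothetical)
    location: Tuple[int, int] = (0, 0)          # hypothetical image coordinates
    attributes: Dict[str, float] = field(default_factory=dict)  # attribute -> score
    children: List["PartNode"] = field(default_factory=list)

    def collect_attributes(self) -> Dict[str, float]:
        """Aggregate attribute scores over the whole sub-hierarchy, keeping
        the maximum score per attribute -- one simple (assumed) choice for
        pooling evidence from correlated parts."""
        merged = dict(self.attributes)
        for child in self.children:
            for attr, score in child.collect_attributes().items():
                merged[attr] = max(merged.get(attr, float("-inf")), score)
        return merged


# Usage: a two-level "person" hierarchy with attribute scores on its parts.
head = PartNode("head", (40, 10), {"wears-hat": 0.9})
torso = PartNode("torso", (40, 60), {"wears-jacket": 0.7})
person = PartNode("person", (40, 40), {}, [head, torso])
scores = person.collect_attributes()
```

The point of the sketch is the coupling: because attributes live on the same nodes that are localized, recognizing an attribute and localizing its part are a single inference over the hierarchy rather than two separate pipelines.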

Related Material


[pdf]
[bibtex]
@InProceedings{Park_2015_ICCV,
author = {Park, Seyoung and Zhu, Song-Chun},
title = {Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}