Posebits for Monocular Human Pose Estimation

Gerard Pons-Moll, David J. Fleet, Bodo Rosenhahn; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2337-2344


We advocate the inference of qualitative information about 3D human pose, called posebits, from images. Posebits represent boolean geometric relationships between body parts (e.g., left leg in front of right leg, or hands close to each other). The advantages of posebits as a mid-level representation are: 1) for many tasks of interest, such qualitative pose information may be sufficient (e.g., semantic image retrieval); 2) it is relatively easy to annotate large image corpora with posebits, as it simply requires answers to yes/no questions; and 3) they help resolve challenging pose ambiguities and therefore facilitate the difficult task of image-based 3D pose estimation. We introduce posebits, a posebit database, a method for selecting useful posebits for pose estimation, and a structural SVM model for posebit inference. Experiments show the use of posebits for semantic image retrieval and for improving 3D pose estimation.
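To make the idea of a posebit concrete, the following minimal sketch evaluates two boolean relationships of the kind mentioned above from a set of 3D joint positions. It is an illustration only, not the authors' definition or code: the joint names, the camera coordinate convention (z increasing away from the camera), the units (metres), and the `margin` and `threshold` values are all assumptions made for this example.

```python
import math

# Assumed convention (not from the paper): each joint is an (x, y, z) tuple
# in metres, in camera coordinates, with z increasing away from the camera.


def in_front_of(pose, joint_a, joint_b, margin=0.05):
    """True if joint_a is closer to the camera than joint_b by more than margin.

    The margin is a hypothetical tolerance for this example.
    """
    return pose[joint_b][2] - pose[joint_a][2] > margin


def close_together(pose, joint_a, joint_b, threshold=0.25):
    """True if the Euclidean distance between the two joints is below threshold.

    The threshold is a hypothetical value for this example.
    """
    return math.dist(pose[joint_a], pose[joint_b]) < threshold


def posebits(pose):
    """Return a dictionary of yes/no answers describing the pose qualitatively."""
    return {
        "left_foot_in_front_of_right_foot": in_front_of(pose, "left_foot", "right_foot"),
        "hands_close": close_together(pose, "left_hand", "right_hand"),
    }


# Made-up pose used only to exercise the functions above.
example_pose = {
    "left_foot": (-0.10, 0.0, 2.00),
    "right_foot": (0.10, 0.0, 2.40),
    "left_hand": (-0.05, 1.0, 2.10),
    "right_hand": (0.05, 1.0, 2.15),
}

if __name__ == "__main__":
    for name, value in posebits(example_pose).items():
        print(f"{name}: {value}")
```

Each entry in the returned dictionary corresponds to one yes/no question of the sort an annotator could answer by looking at an image, which is what makes this kind of representation inexpensive to label.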

Related Material

author = {Pons-Moll, Gerard and Fleet, David J. and Rosenhahn, Bodo},
title = {Posebits for Monocular Human Pose Estimation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2014}