3D Human Pose Estimation From a Single Image via Distance Matrix Regression

Francesc Moreno-Noguer; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2823-2832

Abstract


This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the N body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2N-to-3N regression of the Cartesian joint coordinates. We show that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using NxN distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression. To learn such a regressor, we leverage simple Neural Network architectures which, by construction, enforce positivity and symmetry of the predicted matrices. The approach also has the advantage of naturally handling missing observations and of allowing the position of non-observed joints to be hypothesized. Quantitative results on the HumanEva and Human3.6M datasets demonstrate consistent performance gains over the state of the art. Qualitative evaluation on the in-the-wild images of the LSP dataset, using the regressor learned on Human3.6M, reveals very promising generalization results.
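The NxN distance-matrix representation described above can be illustrated with a minimal sketch. It assumes Euclidean distances between joints; the function names, the toy 3-joint coordinates, and the symmetrize-then-clamp step are hypothetical illustrations and are not taken from the paper. In particular, `symmetrize_nonneg` is only one plausible way to obtain symmetric, non-negative outputs, not necessarily the mechanism the author's networks use.

```python
import math


def distance_matrix(joints):
    """Return the N x N matrix of pairwise Euclidean distances between joints."""
    n = len(joints)
    return [[math.dist(joints[i], joints[j]) for j in range(n)] for i in range(n)]


def symmetrize_nonneg(raw):
    """Assumed post-processing: average a raw N x N output with its transpose,
    then clamp negative entries to zero, so the result is symmetric and >= 0."""
    n = len(raw)
    return [[max(0.0, 0.5 * (raw[i][j] + raw[j][i])) for j in range(n)]
            for i in range(n)]


# Toy 3-joint pose with made-up coordinates (not from any dataset).
pose_2d = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
pose_3d = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (1.0, 2.0, 5.0)]

edm_2d = distance_matrix(pose_2d)  # what a regressor would take as input
edm_3d = distance_matrix(pose_3d)  # what a regressor would be trained to predict

# A hypothetical raw network output that is neither symmetric nor non-negative.
raw_output = [[0.0, 2.0, -1.0],
              [4.0, 0.0, 1.0],
              [-3.0, 1.0, 0.0]]
cleaned = symmetrize_nonneg(raw_output)
```

Both matrices have the same NxN shape regardless of whether the pose is 2D or 3D, which is what lets the problem be posed as a matrix-to-matrix regression. A distance matrix is also unchanged by rotating or translating the pose, so it encodes only the relative structure of the joints.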

Related Material


[bibtex]
@InProceedings{Moreno-Noguer_2017_CVPR,
author = {Moreno-Noguer, Francesc},
title = {3D Human Pose Estimation From a Single Image via Distance Matrix Regression},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}