Point-to-Point Regression PointNet for 3D Hand Pose Estimation

Liuhao Ge, Zhou Ren, Junsong Yuan; The European Conference on Computer Vision (ECCV), 2018, pp. 475-491


Methods based on Convolutional Neural Networks (CNNs) for 3D hand pose estimation with depth cameras usually take 2D depth images as input and directly regress the holistic 3D hand pose. In contrast, our proposed Point-to-Point Regression PointNet directly takes a 3D point cloud as input and outputs point-wise estimations, i.e., heat-maps and unit vector fields on the point cloud, representing the closeness and direction from every point in the point cloud to the hand joints. These point-wise estimations are then fused with a weighted scheme to infer the 3D joint locations. To better capture 3D spatial information in the point cloud, we apply a stacked network architecture for PointNet with intermediate supervision, trained end-to-end. Experiments show that our method achieves outstanding results compared with state-of-the-art methods on three challenging hand pose datasets.
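The fusion step described above can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code: it assumes the heat-map value encodes closeness as h = 1 - d / r for points within a radius r of the joint, so that each point's distance to the joint can be recovered as d = r * (1 - h), and that each unit vector points from the point toward the joint. The function name, the `radius` value, and the `top_k` selection are illustrative assumptions.

```python
import numpy as np

def fuse_joint_location(points, heatmap, unit_vecs, radius=0.08, top_k=64):
    """Infer one 3D joint location by weighted fusion of point-wise votes.

    Illustrative sketch (not the paper's exact implementation). Assumes the
    heat-map encodes closeness as h = 1 - d / radius, so each point's
    distance to the joint is d = radius * (1 - h), and its unit vector
    points from the point toward the joint.

    points    : (N, 3) 3D point cloud
    heatmap   : (N,)   per-point closeness values for this joint
    unit_vecs : (N, 3) per-point unit vectors toward this joint
    """
    idx = np.argsort(heatmap)[-top_k:]                  # keep most confident points
    d = radius * (1.0 - heatmap[idx])                   # recover per-point distances
    votes = points[idx] + d[:, None] * unit_vecs[idx]   # per-point joint estimates
    w = heatmap[idx]                                    # heat-map values as weights
    return (w[:, None] * votes).sum(axis=0) / w.sum()   # weighted average vote
```

Each selected point casts a vote for the joint position by stepping its recovered distance along its unit vector; weighting the votes by the heat-map value down-weights points whose estimates are less reliable.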

Related Material

author = {Ge, Liuhao and Ren, Zhou and Yuan, Junsong},
title = {Point-to-Point Regression PointNet for 3D Hand Pose Estimation},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}