3D Hand Pose Estimation from RGB Using Privileged Learning with Depth Data

Shanxin Yuan, Bjorn Stenger, Tae-Kyun Kim; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019

Abstract


This paper proposes a method for 3D hand pose estimation given a large dataset of depth images with joint annotations, and a smaller dataset of depth and RGB image pairs with joint annotations. We explore different ways of using the depth data at the training stage to improve the pose estimation accuracy of a network that takes only RGB images as input. By using paired RGB and depth images, we are able to supervise the RGB-based network to learn middle-layer features that mimic those of a network trained on large-scale, accurately annotated depth data. Further, depth data provides accurate foreground masks, which are employed to learn better feature activations in the RGB network. During testing, when only RGB images are available, our method produces accurate 3D hand pose predictions. The method is also shown to perform well on the 2D hand pose estimation task. We validate the approach on three public datasets and compare it to other published methods.
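The abstract describes two training-only signals drawn from the privileged depth data: a feature-mimicking loss that pushes the RGB network's middle-layer features toward those of a network trained on depth, and a foreground-mask term that shapes the RGB network's feature activations. The following is a minimal PyTorch-style sketch of one such training step; the rgb_net and depth_net encoders, their backbone/head split, the loss forms, and the loss weights are illustrative assumptions, not the authors' exact formulation.

# Minimal sketch of a privileged-learning training step (assumed PyTorch setup).
# `rgb_net` and `depth_net` are hypothetical networks with `backbone` and `head`
# components; the real paper's architecture and loss weights may differ.
import torch
import torch.nn.functional as F

def training_step(rgb_net, depth_net, rgb, depth, mask, joints_gt,
                  w_mimic=1.0, w_mask=0.1):
    # rgb, depth : paired images, (B, 3, H, W) and (B, 1, H, W); depth is training-only
    # mask       : hand foreground mask from the depth image, (B, 1, H, W)
    # joints_gt  : ground-truth 3D joint positions, (B, J, 3)

    # Teacher features from the (frozen) depth-trained network.
    with torch.no_grad():
        feat_teacher = depth_net.backbone(depth)            # (B, C, h, w)

    # Student forward pass on RGB only.
    feat_student = rgb_net.backbone(rgb)                     # (B, C, h, w)
    joints_pred = rgb_net.head(feat_student)                 # (B, J, 3)

    # 1) Pose loss on the predicted 3D joints.
    loss_pose = F.mse_loss(joints_pred, joints_gt)

    # 2) Feature-mimicking loss: middle-layer RGB features should match
    #    those of the depth-trained teacher.
    loss_mimic = F.mse_loss(feat_student, feat_teacher)

    # 3) Mask loss: suppress feature activations outside the depth-derived
    #    hand foreground.
    act = feat_student.abs().mean(dim=1, keepdim=True)       # (B, 1, h, w)
    mask_resized = F.interpolate(mask, size=act.shape[-2:], mode="nearest")
    loss_mask = (act * (1.0 - mask_resized)).mean()

    return loss_pose + w_mimic * loss_mimic + w_mask * loss_mask

At test time only the RGB network would be run on an RGB image; the depth branch, mask term, and mimicking loss drop out entirely, consistent with the RGB-only inference described in the abstract.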

Related Material


[bibtex]
@InProceedings{Yuan_2019_ICCV,
author = {Yuan, Shanxin and Stenger, Bjorn and Kim, Tae-Kyun},
title = {3D Hand Pose Estimation from RGB Using Privileged Learning with Depth Data},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}
}