DGGAN: Depth-image Guided Generative Adversarial Networks for Disentangling RGB and Depth Images in 3D Hand Pose Estimation

Liangjian Chen, Shih-Yao Lin, Yusheng Xie, Yen-Yu Lin, Wei Fan, Xiaohui Xie; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 411-419

Abstract


Estimating 3D hand poses from RGB images is essential to a wide range of potential applications, but is challenging owing to substantial ambiguity in the inference of depth information from RGB images. State-of-the-art estimators address this problem by regularizing 3D hand pose estimation models during training to enforce the consistency between the predicted 3D poses and the ground-truth depth maps. However, these estimators rely on the availability of both RGB images and paired depth maps during training. In this study, we propose a conditional generative adversarial network model, called Depth-image Guided GAN (DGGAN), to generate realistic depth maps conditioned on the input RGB image, and use the synthesized depth maps to regularize the 3D hand pose estimation model, thereby eliminating the need for ground-truth depth maps. Experimental results on multiple benchmark datasets show that the synthesized depth maps produced by DGGAN are quite effective in regularizing the pose estimation model, yielding new state-of-the-art results in estimation accuracy, notably reducing the mean 3D end-point errors (EPE) by 4.7%, 16.5%, and 6.8% on the RHD, STB, and MHP datasets, respectively.
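
To make the training scheme described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of its two pieces: a conditional generator that synthesizes a depth map from an RGB image, and a pose-estimation loss that uses the synthesized depth as a regularizer in place of ground-truth depth. All module names, the toy network architectures, the loss weighting, and the render_depth callable (an assumed differentiable joints-to-depth renderer) are hypothetical stand-ins chosen for this sketch, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGenerator(nn.Module):
    # Conditional generator: RGB image -> single-channel depth map.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, rgb):
        return self.net(rgb)

class DepthDiscriminator(nn.Module):
    # Scores (RGB, depth) pairs as real or synthesized.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb, depth):
        return self.net(torch.cat([rgb, depth], dim=1))

class PoseEstimator(nn.Module):
    # Toy regressor: RGB image -> 21 hand joints in 3D.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 21 * 3),
        )

    def forward(self, rgb):
        return self.net(rgb).view(-1, 21, 3)

def gan_training_step(rgb, real_depth, generator, discriminator):
    # Standard conditional-GAN losses: the discriminator separates real from
    # synthesized depth maps (conditioned on RGB); the generator tries to fool it.
    fake_depth = generator(rgb)
    d_real = discriminator(rgb, real_depth)
    d_fake = discriminator(rgb, fake_depth.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    g_score = discriminator(rgb, fake_depth)
    g_loss = F.binary_cross_entropy_with_logits(g_score, torch.ones_like(g_score))
    return d_loss, g_loss

def pose_training_step(rgb, gt_joints, estimator, generator, render_depth):
    # Pose loss plus a depth-consistency regularizer. The synthesized depth
    # map stands in for ground-truth depth, so this step needs only RGB images
    # and joint labels. render_depth is an assumed (hypothetical) differentiable
    # joints-to-depth renderer producing maps shaped like the generator output.
    pred_joints = estimator(rgb)
    pose_loss = F.mse_loss(pred_joints, gt_joints)
    with torch.no_grad():
        synth_depth = generator(rgb)  # DGGAN-style synthesized depth map
    depth_consistency = F.l1_loss(render_depth(pred_joints), synth_depth)
    return pose_loss + 0.1 * depth_consistency  # 0.1 is an arbitrary weight

In this sketch, the conditional GAN would be pre-trained with gan_training_step on whatever data provides paired RGB and depth; afterwards, pose_training_step requires only RGB images and joint annotations, matching the abstract's claim that ground-truth depth maps are not needed to regularize the pose estimator.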

Related Material


[pdf] [video]
[bibtex]
@InProceedings{Chen_2020_WACV,
author = {Chen, Liangjian and Lin, Shih-Yao and Xie, Yusheng and Lin, Yen-Yu and Fan, Wei and Xie, Xiaohui},
title = {DGGAN: Depth-image Guided Generative Adversarial Networks for Disentangling RGB and Depth Images in 3D Hand Pose Estimation},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020}
}