Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild

Dominik Kulon, Riza Alp Guler, Iasonas Kokkinos, Michael M. Bronstein, Stefanos Zafeiriou; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4990-5000

Abstract


We introduce a simple and effective network architecture for monocular 3D hand pose estimation consisting of an image encoder followed by a mesh convolutional decoder that is trained through a direct 3D hand mesh reconstruction loss. We train our network by gathering a large-scale dataset of hand actions in YouTube videos and using it as a source of weak supervision. Our weakly-supervised, mesh-convolution-based system largely outperforms state-of-the-art methods, even halving the errors on the in-the-wild benchmark. The dataset and additional resources are available at https://arielai.com/mesh_hands.
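As a rough illustration of the encoder-decoder idea described in the abstract, the following minimal NumPy sketch decodes a latent vector into mesh vertices and scores it with a per-vertex reconstruction loss. Everything here is an assumption made for illustration: the image encoder is replaced by a random latent vector, the mesh is a toy tetrahedron, the convolution is a simple neighbor-gathering operator (not necessarily the one used in the paper), the loss is a plain L1 distance, and all names (`mesh_conv`, `decode`, `l1_mesh_loss`) are hypothetical.

```python
import numpy as np

def mesh_conv(features, neighbors, weight):
    """Illustrative mesh convolution (assumed operator, not the paper's).

    features:  (V, C_in) per-vertex features
    neighbors: (V, K) indices of a fixed, ordered set of K neighbors per vertex
    weight:    (C_in * (K + 1), C_out) weights shared across all vertices
    """
    gathered = features[neighbors]                                  # (V, K, C_in)
    stacked = np.concatenate([features[:, None, :], gathered], 1)   # (V, K+1, C_in)
    return stacked.reshape(features.shape[0], -1) @ weight          # (V, C_out)

def decode(latent, params, neighbors):
    """Map a latent code to 3D vertex positions with two mesh convolutions."""
    num_vertices = neighbors.shape[0]
    x = (latent @ params["fc"]).reshape(num_vertices, -1)           # (V, C)
    x = np.maximum(mesh_conv(x, neighbors, params["conv1"]), 0.0)   # ReLU
    return mesh_conv(x, neighbors, params["conv2"])                 # (V, 3)

def l1_mesh_loss(pred_vertices, target_vertices):
    """Mean absolute error between predicted and target vertices."""
    return np.abs(pred_vertices - target_vertices).mean()

# Toy setup: a tetrahedron, where each vertex neighbors the other three.
neighbors = np.array([[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]])
V, K, C, LATENT = 4, 3, 16, 8

rng = np.random.default_rng(0)
params = {
    "fc": rng.normal(scale=0.1, size=(LATENT, V * C)),
    "conv1": rng.normal(scale=0.1, size=(C * (K + 1), C)),
    "conv2": rng.normal(scale=0.1, size=(C * (K + 1), 3)),
}

latent = rng.normal(size=LATENT)       # stand-in for the image encoder output
target = rng.normal(size=(V, 3))       # stand-in for a ground-truth hand mesh
pred = decode(latent, params, neighbors)
loss = l1_mesh_loss(pred, target)
```

In a real system the latent vector would come from an image encoder, the neighbor table from the hand mesh topology, and the parameters would be fitted by minimizing the loss with gradient descent; the sketch only shows the forward pass and loss computation.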

Related Material


@InProceedings{Kulon_2020_CVPR,
author = {Kulon, Dominik and Guler, Riza Alp and Kokkinos, Iasonas and Bronstein, Michael M. and Zafeiriou, Stefanos},
title = {Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020},
pages = {4990-5000}
}