End-to-End Detection and Pose Estimation of Two Interacting Hands

Dong Uk Kim, Kwang In Kim, Seungryul Baek; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 11189-11198

Abstract


Three-dimensional hand pose estimation has reached a level of maturity that enables real-world applications for single-hand cases. However, accurately estimating the pose of two closely interacting hands remains a challenge because, in this case, one hand often occludes the other. We present a new algorithm that accurately estimates hand poses in such a challenging scenario. The crux of our algorithm lies in a framework that jointly trains the estimators of the interacting hands, leveraging their inter-dependence. Further, we employ a GAN-type discriminator of interacting hand pose that helps avoid physically implausible configurations, e.g., intersecting fingers, and we exploit the visibility of joints to improve intermediate 2D pose estimation. We incorporate these components into a single model that learns to detect hands and estimate their pose based on a unified criterion of pose estimation accuracy. To our knowledge, this is the first attempt to build an end-to-end network that detects and estimates the pose of two closely interacting hands (as well as single hands). In experiments with three datasets representing challenging real-world scenarios, our algorithm demonstrated significant and consistent performance improvements over state-of-the-art methods.
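The abstract names three training signals: jointly trained estimators for the two hands, joint visibility for the intermediate 2D pose, and a GAN-type discriminator. The sketch below illustrates one way such terms could be combined into a single objective. It is not the paper's formulation: the function names, the squared-error form of the 2D term, the non-saturating adversarial term, and the weights `w_2d` and `w_adv` are all assumptions made for illustration.

```python
import math


def visibility_weighted_2d_loss(pred, target, visible):
    """Mean squared 2D error over joints, counting only joints flagged visible.

    pred, target: lists of (x, y) joint positions; visible: list of booleans.
    (Hypothetical form; the paper may use visibility differently.)
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), v in zip(pred, target, visible):
        if v:
            total += (px - tx) ** 2 + (py - ty) ** 2
            count += 1
    # With no visible joints there is nothing to supervise, so return zero.
    return total / count if count else 0.0


def generator_adversarial_loss(disc_score, eps=1e-12):
    """Non-saturating GAN generator term, -log D(pose), for D(pose) in (0, 1]."""
    return -math.log(max(disc_score, eps))


def joint_objective(loss_3d_left, loss_3d_right, loss_2d, disc_score,
                    w_2d=1.0, w_adv=0.1):
    """Sum the per-hand 3D losses with the 2D and adversarial terms.

    The weights w_2d and w_adv are illustrative defaults, not published values.
    """
    adv = generator_adversarial_loss(disc_score)
    return loss_3d_left + loss_3d_right + w_2d * loss_2d + w_adv * adv
```

Summing both hands' losses in one objective means a single backward pass updates shared parameters for both estimators, which is one simple reading of "jointly trains"; the discriminator score would come from a separately trained network that is not shown here.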

Related Material


@InProceedings{Kim_2021_ICCV,
    author    = {Kim, Dong Uk and Kim, Kwang In and Baek, Seungryul},
    title     = {End-to-End Detection and Pose Estimation of Two Interacting Hands},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11189-11198}
}