MobileHumanPose: Toward Real-Time 3D Human Pose Estimation in Mobile Devices

Sangbum Choi, Seokeon Choi, Changick Kim; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 2328-2338

Abstract


Current 3D pose estimation methods are poorly suited to many low-compute devices because of limitations in efficiency and accuracy. In this paper, we revisit pose estimation architecture from the viewpoint of both efficiency and accuracy. We propose a mobile-friendly model, MobileHumanPose, for real-time 3D human pose estimation from a single RGB image. The model consists of a modified MobileNetV2 backbone, a parametric activation function, and a skip concatenation inspired by U-Net. In particular, the skip concatenation structure improves accuracy by propagating richer features at negligible computational cost. Our model not only achieves performance comparable to state-of-the-art models but is also seven times smaller than the ResNet-50 based model. In addition, our extra small model reduces inference time by 12.2 ms on a Galaxy S20 CPU, which is suitable for real-time 3D human pose estimation in mobile applications. The source code is available at: https://github.com/SangbumChoi/MobileHumanPose.

Related Material


[bibtex]
@InProceedings{Choi_2021_CVPR,
    author    = {Choi, Sangbum and Choi, Seokeon and Kim, Changick},
    title     = {MobileHumanPose: Toward Real-Time 3D Human Pose Estimation in Mobile Devices},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {2328-2338}
}