MobileHumanPose: Toward Real-Time 3D Human Pose Estimation in Mobile Devices
Current 3D pose estimation methods are not compatible with many devices of low computational power because of limitations in efficiency and accuracy. In this paper, we revisit the pose estimation architecture from the viewpoint of both efficiency and accuracy. We propose a mobile-friendly model, MobileHumanPose, for real-time 3D human pose estimation from a single RGB image. The model consists of a modified MobileNetV2 backbone, a parametric activation function, and a skip concatenation inspired by U-Net. In particular, the skip concatenation structure improves accuracy by propagating richer features at negligible computational cost. Our model not only achieves performance comparable to state-of-the-art models but also has a model size seven times smaller than that of the ResNet-50-based model. In addition, our extra-small model reduces inference time by 12.2 ms on a Galaxy S20 CPU, which makes it suitable for real-time 3D human pose estimation in mobile applications. The source code is available at: https://github.com/SangbumChoi/MobileHumanPose.
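To make two of the named components concrete, the following is a minimal NumPy sketch of a parametric activation and a U-Net-style skip concatenation. It is an illustration under stated assumptions, not the released implementation: the activation is assumed to be PReLU-like with one learned slope per channel, upsampling is assumed to be nearest-neighbor by a factor of two, and the function names and channel counts are hypothetical.

```python
import numpy as np


def prelu(x, alpha):
    """PReLU-style activation on a (C, H, W) feature map.

    Positive values pass through unchanged; negative values are scaled
    by a per-channel slope `alpha` of shape (C,). In a trained network
    the slopes would be learned; here they are supplied by the caller.
    """
    return np.where(x > 0, x, alpha[:, None, None] * x)


def upsample_nearest_2x(x):
    """Double the spatial size of a (C, H, W) map by repeating pixels."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def skip_concat(decoder_feat, encoder_feat):
    """Fuse a coarse decoder map with a finer encoder map.

    The decoder map is upsampled to the encoder resolution and the two
    are stacked along the channel axis. The concatenation itself
    involves no arithmetic; it only joins the two sets of channels.
    """
    up = upsample_nearest_2x(decoder_feat)
    if up.shape[1:] != encoder_feat.shape[1:]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([up, encoder_feat], axis=0)


if __name__ == "__main__":
    # Hypothetical shapes: 6 decoder channels at 4x4, 4 encoder channels at 8x8.
    decoder_feat = np.zeros((6, 4, 4))
    encoder_feat = np.ones((4, 8, 8))
    fused = skip_concat(decoder_feat, encoder_feat)
    print(fused.shape)  # (10, 8, 8)
```

Because the fused map keeps the encoder channels alongside the upsampled decoder channels, later layers can draw on fine-grained spatial detail that would otherwise be lost in downsampling.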