Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation

Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, Erjin Zhou; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 13264-13273

Abstract


Heatmap regression has become the most prevalent choice in current human pose estimation methods. The ground-truth heatmaps are usually constructed by covering all skeletal keypoints with 2D Gaussian kernels whose standard deviations are fixed. However, for bottom-up methods, which need to handle a large variance of human scales and labeling ambiguities, this practice seems unreasonable. To better cope with these problems, we propose the scale-adaptive heatmap regression (SAHR) method, which adaptively adjusts the standard deviation for each keypoint. In this way, SAHR is more tolerant of various human scales and labeling ambiguities. However, SAHR may aggravate the imbalance between foreground and background samples, which potentially limits the improvement it brings. Thus, we further introduce the weight-adaptive heatmap regression (WAHR) to help balance the foreground and background samples. Extensive experiments show that SAHR together with WAHR largely improves the accuracy of bottom-up human pose estimation. As a result, we outperform the state-of-the-art model by +1.5 AP and achieve 72.0 AP on COCO test-dev2017, which is comparable with the performance of most top-down methods. Source code is available at https://github.com/greatlog/SWAHR-HumanPose.
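To make the contrast in the abstract concrete, the following is a minimal sketch of how a ground-truth heatmap might be rendered with a fixed standard deviation versus a per-keypoint one. It is an illustration only, not the authors' implementation: the function name `gaussian_heatmap`, the `scale_factors` values, and the use of a max to merge overlapping kernels are all assumptions, and the abstract does not say how SAHR obtains the per-keypoint standard deviation.

```python
import math


def gaussian_heatmap(height, width, keypoints, sigmas):
    """Render one heatmap channel; each keypoint (x, y) gets its own sigma."""
    heatmap = [[0.0] * width for _ in range(height)]
    for (cx, cy), sigma in zip(keypoints, sigmas):
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Assumption: overlapping kernels are merged with a max.
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


keypoints = [(8, 8), (24, 20)]
base_sigma = 2.0

# Fixed-sigma baseline: every keypoint shares the same standard deviation.
fixed = gaussian_heatmap(32, 32, keypoints, [base_sigma, base_sigma])

# Hypothetical per-keypoint scale factors (made-up numbers for illustration;
# how SAHR derives them is not described in the abstract).
scale_factors = [0.5, 2.0]
adaptive = gaussian_heatmap(
    32, 32, keypoints, [base_sigma * s for s in scale_factors]
)
```

In this sketch the second keypoint gets a wider kernel and the first a narrower one, so more pixels near the second keypoint count as foreground. Growing or shrinking the foreground region in this way is the kind of change that can shift the foreground/background balance that WAHR is introduced to address.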

Related Material


[bibtex]
@InProceedings{Luo_2021_CVPR,
    author    = {Luo, Zhengxiong and Wang, Zhicheng and Huang, Yan and Wang, Liang and Tan, Tieniu and Zhou, Erjin},
    title     = {Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {13264-13273}
}