@InProceedings{Xu_2025_CVPR,
    author    = {Xu, Yalong and Zhao, Lin and Gong, Chen and Li, Guangyu and Wang, Di and Wang, Nannan},
    title     = {DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {1160-1169}
}
DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework
Abstract
Top-down approaches for human pose estimation (HPE) have reached a high level of sophistication, exemplified by models such as HRNet and ViTPose. Nonetheless, the low efficiency of top-down methods is a recognized issue that has not been sufficiently explored in current research. Our analysis suggests that the primary cause of inefficiency is the substantial diversity of pose samples. On the one hand, simple poses can be accurately estimated without the computational resources of larger models. On the other hand, a more prominent issue is the abundance of bounding boxes, which remain excessive even after non-maximum suppression (NMS). In this paper, we present a straightforward yet effective dynamic framework called DynPose, designed to match diverse pose samples with the most appropriate models, thereby ensuring optimal performance and high efficiency. Specifically, the framework contains a lightweight router and two pre-trained HPE models: one small and one large. The router is optimized to classify samples and dynamically determine the appropriate inference path for each. Extensive experiments demonstrate the effectiveness of the framework. For example, using ResNet-50 and HRNet-W32 as the pre-trained models, DynPose achieves an almost 50% increase in speed over HRNet-W32 while maintaining the same level of accuracy. More importantly, the framework can be generalized to other pre-trained models and datasets without re-training or fine-tuning. Code is available at https://github.com/Aritoria/DynPose.
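The routing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `route_and_estimate`, the `threshold` parameter, the convention that a higher router score means a harder sample, and all the toy stand-ins below are assumptions made only for illustration. The released code at the URL above is the authoritative reference.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]          # stand-in for a cropped person image
Model = Callable[[Sample], str]   # stand-in for a pre-trained pose estimator


def route_and_estimate(
    samples: List[Sample],
    router: Callable[[Sample], float],
    small_model: Model,
    large_model: Model,
    threshold: float = 0.5,       # hypothetical decision threshold
) -> List[Tuple[str, str]]:
    """Send each sample to exactly one model, chosen by the router score."""
    outputs = []
    for sample in samples:
        hard_score = router(sample)  # assumed convention: higher means harder
        if hard_score >= threshold:
            outputs.append(("large", large_model(sample)))
        else:
            outputs.append(("small", small_model(sample)))
    return outputs


# Toy stand-ins so the sketch runs; none of these come from the paper.
def toy_router(sample: Sample) -> float:
    # Value spread used as a fake difficulty score.
    return max(sample) - min(sample)


def toy_small(sample: Sample) -> str:
    return "pose_from_small"


def toy_large(sample: Sample) -> str:
    return "pose_from_large"


crops = [[0.1, 0.2, 0.3], [0.0, 0.9, 0.4]]
results = route_and_estimate(crops, toy_router, toy_small, toy_large)
paths = [path for path, _ in results]
```

In this toy run the first crop has a small spread and takes the small-model path, while the second exceeds the threshold and takes the large-model path. Each sample is processed by only one model, so under this scheme any saving comes from how many samples the router can send to the cheaper path.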