@InProceedings{Lee_2024_ACCV,
  author    = {Lee, Sanghyeon and Hwang, Yoonho and Lee, Jong Taek},
  title     = {Learning 2D Human Poses for Better 3D Lifting via Multi-Model 3D-Guidance},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {3344-3361}
}
Learning 2D Human Poses for Better 3D Lifting via Multi-Model 3D-Guidance
Abstract
Recent advancements in 2D pose detectors have significantly improved 3D human pose estimation via the 2D-to-3D lifting approach. Despite these advancements, a substantial accuracy gap remains between using ground-truth 2D poses and detected 2D poses for 3D lifting. Most methods, however, focus solely on enhancing the 3D lifting network, relying on 2D pose detectors optimized purely for 2D accuracy without refining them to better serve the 3D lifting process. To address this limitation, we propose a novel 3D-guided training method that leverages 3D loss to improve 2D pose estimation. Additionally, we introduce a multi-model training method to ensure robust generalization across various 3D lifting networks. Extensive experiments with three 2D pose detectors and four 3D lifting networks demonstrate our method's effectiveness. Our method achieves an average improvement of 4.6% in MPJPE on Human3.6M and 16.8% on Panoptic, enhancing 2D poses for accurate 3D lifting. The code is available at https://github.com/knu-vis/L2D-Pose.
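The core idea of the abstract can be illustrated with a toy sketch: a 2D pose is adjusted by gradient descent on a 3D (MPJPE-style) loss averaged over several frozen lifting models, rather than on 2D accuracy alone. Everything here is an illustrative assumption, not the paper's actual architecture: the "lifters" are random linear maps standing in for the four lifting networks, and the gradient is computed numerically.

```python
# Hypothetical sketch of multi-model 3D-guidance: refine a 2D pose by
# minimizing the average 3D error across several frozen lifting models.
# The linear "lifters" and all hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
J = 17  # number of joints (Human3.6M convention)

# Frozen stand-ins for K different 2D-to-3D lifting networks: linear maps.
K = 3
lifters = [rng.normal(size=(3 * J, 2 * J)) * 0.1 for _ in range(K)]

def lift(L, p2d):
    # Map a (J, 2) 2D pose to a (J, 3) 3D pose with one frozen "lifter".
    return (L @ p2d.reshape(-1)).reshape(J, 3)

def mpjpe(pred, gt):
    # Mean per-joint position error: average Euclidean distance over joints.
    return np.linalg.norm(pred - gt, axis=1).mean()

# Ground-truth 3D pose and a noisy "detected" 2D pose (synthetic data).
gt3d = rng.normal(size=(J, 3))
p2d = rng.normal(size=(J, 2))

def multi_model_loss(p):
    # Average the 3D loss over all lifters so the 2D pose is not
    # overfit to any single lifting network (the "multi-model" part).
    return float(np.mean([mpjpe(lift(L, p), gt3d) for L in lifters]))

def num_grad(p, eps=1e-5):
    # Central-difference gradient of the 3D loss w.r.t. the 2D pose.
    g = np.zeros_like(p)
    for i in range(J):
        for d in range(2):
            e = np.zeros_like(p)
            e[i, d] = eps
            g[i, d] = (multi_model_loss(p + e) - multi_model_loss(p - e)) / (2 * eps)
    return g

before = multi_model_loss(p2d)
for _ in range(300):
    p2d -= 0.2 * num_grad(p2d)  # 3D loss guides the 2D pose update
after = multi_model_loss(p2d)
print(f"3D loss before: {before:.3f}  after: {after:.3f}")
```

In the paper's actual setting the update would go into the 2D detector's weights by backpropagation through the frozen lifting networks; the sketch applies it directly to a single pose only to keep the example self-contained.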