@InProceedings{Hoang_2024_CVPR,
    author    = {Hoang, Trung-Hieu and Zehni, Mona and Phan, Huy and Vo, Duc Minh and Do, Minh N.},
    title     = {Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {113-123}
}
Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input
Abstract
Despite the promising performance of current 3D human pose estimation techniques, understanding and enhancing their robustness on challenging in-the-wild videos remain an open problem. In this work, we focus on building robust 2D-to-3D pose lifters. To this end, we develop two benchmark datasets, namely Human3.6M-C and HumanEva-I-C, to examine the resilience of video-based 3D pose lifters to a wide range of common video corruptions, including temporary occlusion, motion blur, and pixel-level noise. We demonstrate the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue. First, we introduce Temporal Additive Gaussian Noise (TAGN) as a simple yet effective 2D input pose data augmentation. Additionally, to incorporate the confidence scores output by the 2D pose detectors, we design a confidence-aware convolution (CA-Conv) block. Extensively tested on corrupted videos, the proposed strategies consistently boost the robustness of 3D pose lifters and serve as new baselines for future research.
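To make the idea of additive Gaussian noise on 2D input poses concrete, here is a minimal sketch. The abstract does not give the exact TAGN formulation, so everything here is an assumption: the function name `add_temporal_gaussian_noise`, the default `sigma`, and the choice to draw independent zero-mean noise for every joint in every frame are illustrative and may differ from the paper's method.

```python
import random


def add_temporal_gaussian_noise(poses, sigma=0.02, seed=None):
    """Return a noisy copy of a 2D pose sequence (illustrative sketch only).

    poses: list of frames; each frame is a list of (x, y) joint coordinates.
    sigma: standard deviation of the zero-mean Gaussian noise
           (hypothetical default, not taken from the paper).
    seed:  optional seed so the augmentation is reproducible.
    """
    rng = random.Random(seed)
    noisy = []
    for frame in poses:
        # Assumption: noise is sampled independently per frame, joint, and axis.
        noisy.append(
            [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in frame]
        )
    return noisy


# Usage: a two-frame sequence with two joints per frame.
sequence = [[(0.0, 0.0), (1.0, 1.0)], [(0.5, 0.5), (1.5, 1.5)]]
augmented = add_temporal_gaussian_noise(sequence, sigma=0.05, seed=0)
```

In a training pipeline, an augmentation of this kind would typically be applied to the detected 2D keypoints before they are passed to the 2D-to-3D lifter, leaving the 3D targets unchanged.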