PoseIRM: Enhance 3D Human Pose Estimation on Unseen Camera Settings via Invariant Risk Minimization

Yanlu Cai, Weizhong Zhang, Yuan Wu, Cheng Jin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 2124-2133

Abstract


Camera-parameter-free multi-view pose estimation is an emerging technique for 3D human pose estimation (HPE). Such methods infer the camera settings implicitly or explicitly to mitigate the impact of depth uncertainty, showing significant potential in real applications. However, due to the limited diversity of camera settings in the available datasets, the inferred camera parameters are simply hardcoded into the model during training and do not adapt to the input at inference, so the learned models generalize poorly to unseen camera settings. A natural solution is to artificially synthesize samples, i.e., 2D-3D pose pairs, under a massive number of new camera settings. Unfortunately, to prevent over-fitting to the existing camera setting, the number of synthesized samples for each new camera setting should be comparable to that for the existing one, which multiplies the scale of training and can even make it computationally prohibitive. In this paper, we propose a novel HPE approach under the invariant risk minimization (IRM) paradigm. Specifically, we first synthesize 2D poses from myriad camera settings. We then train our model under the IRM paradigm, which aims to learn a common optimal model across all camera settings and thus forces the model to automatically learn the camera parameters from the input data. This allows the model to accurately infer 3D poses on unseen data by training on only a handful of samples from each synthesized setting, and thus avoids an unbearable increase in training cost. Another appealing feature of our method is that, benefiting from the capability of IRM to identify invariant features, its performance on the seen camera settings is enhanced as well. Comprehensive experiments verify the superiority of our approach.
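The two steps named in the abstract, synthesizing 2D poses under new camera settings and training with an IRM objective, can be illustrated with a small sketch. Everything below is an assumption made for illustration and not the authors' implementation: the camera model (a pinhole camera rotated about the vertical axis), the helper names `random_camera`, `project`, `irmv1_penalty` and `irm_objective`, the focal length, and the choice of the IRMv1 gradient penalty with a mean-squared-error risk. The abstract does not state which IRM formulation or loss the paper uses.

```python
import numpy as np


def random_camera(rng, distance=5.0):
    """Sample a hypothetical camera setting: a rotation about the
    vertical (y) axis plus a fixed translation along the optical axis."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    t = np.array([0.0, 0.0, distance])
    return R, t


def project(pose3d, R, t, focal=1000.0):
    """Pinhole projection of (J, 3) world joints to (J, 2) image points."""
    cam = pose3d @ R.T + t                   # world frame -> camera frame
    return focal * cam[:, :2] / cam[:, 2:3]  # perspective divide by depth


def irmv1_penalty(pred, target):
    """IRMv1-style penalty for an MSE risk: the squared gradient of
    mean((w * pred - target)^2) with respect to a dummy scale w, at w = 1."""
    grad = np.mean(2.0 * (pred - target) * pred)
    return grad ** 2


def irm_objective(preds, targets, lam=1.0):
    """Sum of per-environment risks plus lam times the summed penalties.
    Each list entry corresponds to one (synthesized) camera setting."""
    risks = [np.mean((p - y) ** 2) for p, y in zip(preds, targets)]
    penalties = [irmv1_penalty(p, y) for p, y in zip(preds, targets)]
    return float(np.sum(risks) + lam * np.sum(penalties))


# Usage: one 17-joint pose (random stand-in data) seen from 3 synthesized cameras.
rng = np.random.default_rng(0)
pose3d = rng.uniform(-1.0, 1.0, size=(17, 3))
views = [project(pose3d, *random_camera(rng)) for _ in range(3)]

# Stand-in "predictions": the ground truth plus small noise, one per setting.
preds = [pose3d + 0.01 * rng.standard_normal(pose3d.shape) for _ in views]
objective = irm_objective(preds, [pose3d] * len(views), lam=10.0)
```

In this sketch each synthesized camera setting is treated as one IRM environment. The penalty term is zero when the predictor is already optimal for every environment under the dummy rescaling, which is the sense in which a single model is pushed to be simultaneously optimal across camera settings.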

Related Material


@InProceedings{Cai_2024_CVPR,
    author    = {Cai, Yanlu and Zhang, Weizhong and Wu, Yuan and Jin, Cheng},
    title     = {PoseIRM: Enhance 3D Human Pose Estimation on Unseen Camera Settings via Invariant Risk Minimization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {2124-2133}
}