LiCamPose: Combining Multi-View LiDAR and RGB Cameras for Robust Single-Timestamp 3D Human Pose Estimation
Abstract
Several methods have been proposed to estimate 3D human pose from multi-view images, achieving satisfactory performance on public datasets collected under relatively simple conditions. However, few approaches study extracting 3D human skeletons from multimodal inputs such as RGB and point cloud data. To address this gap, we introduce LiCamPose, a pipeline that integrates multi-view RGB and sparse point cloud information to estimate robust 3D human poses from a single timestamp. We demonstrate the effectiveness of a volumetric architecture in combining these modalities. Furthermore, to circumvent the need for manually labeled 3D human pose annotations, we develop a synthetic dataset generator for pretraining and design an unsupervised domain adaptation strategy to train a 3D human pose estimator without manual annotations. To validate the generalization capability of our method, LiCamPose is evaluated on four datasets: two public datasets, one synthetic dataset, and one challenging self-collected dataset named BasketBall, covering diverse scenarios. The results demonstrate that LiCamPose exhibits strong generalization performance and significant application potential. The code, dataset generator, and datasets are available at https://github.com/Yu-Yy/LiCamPose.
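The abstract mentions a volumetric architecture for fusing the two modalities but gives no detail on this page. The following is a minimal sketch, assuming one plausible design in which multi-view RGB features are unprojected into a shared voxel grid and the sparse LiDAR points are voxelized into an occupancy channel; all grid sizes, camera matrices, and function names below are hypothetical and are not taken from the paper or its repository.

# Hedged sketch (not the authors' implementation): fuse multi-view RGB
# features and a sparse point cloud in a shared voxel volume.
import numpy as np

# Hypothetical workspace: 4m x 4m x 2m, discretized into voxels.
GRID = (64, 64, 32)                       # voxels along x, y, z
BOUNDS = np.array([[-2.0, 2.0],           # x range (meters)
                   [-2.0, 2.0],           # y range
                   [ 0.0, 2.0]])          # z range
VOX = (BOUNDS[:, 1] - BOUNDS[:, 0]) / np.array(GRID)  # voxel size per axis

def voxelize_points(points):
    """Turn an (N, 3) LiDAR point cloud into a binary occupancy volume."""
    occ = np.zeros(GRID, dtype=np.float32)
    idx = np.floor((points - BOUNDS[:, 0]) / VOX).astype(int)
    valid = np.all((idx >= 0) & (idx < np.array(GRID)), axis=1)
    idx = idx[valid]
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return occ

def unproject_features(feat_maps, proj_mats, feat_dim):
    """Average per-view 2D feature maps into the voxel grid.

    feat_maps: list of (H, W, C) arrays; proj_mats: list of 3x4 matrices
    mapping homogeneous world coordinates to pixel coordinates.
    """
    # World coordinates of every voxel center, shape (V, 3).
    grids = [np.linspace(lo + v / 2, hi - v / 2, n)
             for (lo, hi), v, n in zip(BOUNDS, VOX, GRID)]
    centers = np.stack(np.meshgrid(*grids, indexing="ij"), axis=-1).reshape(-1, 3)
    homog = np.concatenate([centers, np.ones((centers.shape[0], 1))], axis=1)

    volume = np.zeros((centers.shape[0], feat_dim), dtype=np.float32)
    hits = np.zeros(centers.shape[0], dtype=np.float32)
    for feat, P in zip(feat_maps, proj_mats):
        uvw = homog @ P.T                              # project voxel centers
        uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
        u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
        inside = (u >= 0) & (u < feat.shape[1]) & (v >= 0) & (v < feat.shape[0])
        volume[inside] += feat[v[inside], u[inside]]   # accumulate view features
        hits[inside] += 1.0
    volume /= np.clip(hits[:, None], 1.0, None)        # average over visible views
    return volume.reshape(*GRID, feat_dim)

# Toy usage: random stand-ins for LiDAR points, CNN features, and calibration.
points = np.random.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(500, 3))
feat_maps = [np.random.rand(120, 160, 8).astype(np.float32) for _ in range(3)]
proj_mats = [np.random.rand(3, 4) for _ in range(3)]

occ = voxelize_points(points)                              # (64, 64, 32)
rgb_vol = unproject_features(feat_maps, proj_mats, feat_dim=8)
fused = np.concatenate([occ[..., None], rgb_vol], axis=-1)  # (64, 64, 32, 9)
print(fused.shape)

In a full pipeline the fused volume would feed a 3D pose-regression head; here random arrays stand in for CNN features and camera calibration.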
Related Material
[pdf]  [supp]  [bibtex]
@InProceedings{Pan_2025_WACV,
    author    = {Pan, Zhiyu and Zhong, Zhicheng and Guo, Wenxuan and Chen, Yifan and Feng, Jianjiang and Zhou, Jie},
    title     = {LiCamPose: Combining Multi-View LiDAR and RGB Cameras for Robust Single-Timestamp 3D Human Pose Estimation},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {2484-2494}
}