@InProceedings{Cavicchini_2025_WACV,
    author    = {Cavicchini, Davide and Pivotto, Alessia and Lorengo, Sofia and Rosani, Andrea and Garau, Nicola},
    title     = {CLaP - Contrast Label Predict: a quest for cheaper labeling in 3D human pose estimation},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {February},
    year      = {2025},
    pages     = {1276-1284}
}
CLaP - Contrast Label Predict: a quest for cheaper labeling in 3D human pose estimation
Abstract
Human pose estimation (HPE) is a pivotal task in computer vision, with applications spanning a wide range of domains such as sports analytics, rehabilitation, performance capture, and many more. However, obtaining labeled datasets for 3D pose estimation remains costly and resource-intensive. To address this challenge, we propose a novel pipeline that uses contrastive learning to reduce labeling requirements while maintaining adequate performance. Our method employs unsupervised fine-tuning of pre-trained ResNet backbones on unannotated multiview data acquired in a skiing scenario. The learned representations are then used to strategically select a minimal yet diverse subset of data for labeling, which is subsequently used for supervised training. We demonstrate the effectiveness of this approach using three contrastive paradigms, namely SimCLR, MoCo, and SimSiam, evaluating their impact on data efficiency and model performance on the SkiPose dataset. Our results indicate that contrastive learning can significantly reduce labeling costs while retaining good pose estimation results, making it a promising solution for resource-constrained applications.
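The abstract does not say how the "minimal yet diverse" subset is chosen from the learned representations. As an illustration only, the sketch below uses greedy farthest-point (k-center) selection, one common heuristic for picking diverse samples from an embedding space. The function name `select_diverse_subset`, the Euclidean distance, the choice of the first sample as seed, and the toy 2-D vectors are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Sequence


def select_diverse_subset(
    embeddings: Sequence[Sequence[float]], budget: int
) -> List[int]:
    """Greedy farthest-point (k-center) selection over embedding vectors.

    Hypothetical helper: returns the indices of the samples to send for
    labeling. This is NOT the selection rule from the paper, only one
    plausible way to obtain a diverse subset.
    """
    n = len(embeddings)
    if n == 0 or budget <= 0:
        return []
    budget = min(budget, n)

    # Assumption: seed the selection with the first sample.
    selected = [0]
    # Distance from every sample to its nearest already-selected sample.
    min_dist = [math.dist(e, embeddings[0]) for e in embeddings]

    while len(selected) < budget:
        # Pick the sample farthest from everything selected so far.
        nxt = max(range(n), key=lambda i: min_dist[i])
        selected.append(nxt)
        # Update nearest-selected distances with the new pick.
        for i, e in enumerate(embeddings):
            d = math.dist(e, embeddings[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy 2-D "embeddings": two tight clusters plus one isolated point.
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
chosen = select_diverse_subset(toy, budget=3)
```

With a budget of three, the toy run picks one point from each cluster plus the isolated point, rather than two near-duplicates from the same cluster. In a real pipeline the inputs would be backbone features of the unlabeled frames instead of hand-written 2-D points.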