DeepCap: Monocular Human Performance Capture Using Weak Supervision

Marc Habermann, Weipeng Xu, Michael Zollhofer, Gerard Pons-Moll, Christian Theobalt; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 5052-5063

Abstract


Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human performance capture. Our method is trained in a weakly supervised manner based on multi-view supervision, completely removing the need for training data with 3D ground truth annotations. The network architecture is based on two separate networks that disentangle the task into a pose estimation step and a non-rigid surface deformation step. Extensive qualitative and quantitative evaluations show that our approach outperforms the state of the art in terms of quality and robustness.

Related Material


@InProceedings{Habermann_2020_CVPR,
author = {Habermann, Marc and Xu, Weipeng and Zollhofer, Michael and Pons-Moll, Gerard and Theobalt, Christian},
title = {DeepCap: Monocular Human Performance Capture Using Weak Supervision},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}