Consistent 3D Human Shape From Repeatable Action

Keisuke Shibata, Sangeun Lee, Shohei Nobuhara, Ko Nishino; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 1770-1779


We introduce a novel method for reconstructing the 3D human body from a video of a person in action. Our method recovers a single clothed body model that can explain all frames in the input. Our method builds on two key ideas: exploit the repeatability of human action and use the human body for camera calibration and anchoring. The input is a set of image sequences captured with a single camera at different viewpoints but of different instances of a repeatable action (e.g., batting). Detected 2D joints are used to calibrate the videos in space and time. The sparse viewpoints of the input videos are significantly increased by bone-anchored transformations into rest-pose. These virtually expanded calibrated camera views let us reconstruct surface points and free-form deform a mesh model to extract the frame-consistent personalized clothed body surface. In other words, we show how a casually taken video sequence can be converted into a calibrated dense multiview image set from which the 3D clothed body surface can be geometrically measured. We introduce two new datasets to validate the effectiveness of our method quantitatively and qualitatively and demonstrate free-viewpoint video playback.

Related Material

[pdf] [supp]
@InProceedings{Shibata_2021_CVPR, author = {Shibata, Keisuke and Lee, Sangeun and Nobuhara, Shohei and Nishino, Ko}, title = {Consistent 3D Human Shape From Repeatable Action}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}, month = {June}, year = {2021}, pages = {1770-1779} }