LASR: Learning Articulated Shape Reconstruction From a Monocular Video

Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 15980-15989

Abstract


Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. However, it is still challenging to reconstruct nonrigid structures from RGB inputs, due to the under-constrained nature of this problem. While template-based approaches, such as parametric shape models, have achieved great success in terms of modeling the "closed world" of known object categories, their ability to handle the "open-world" of novel object categories and outlier shapes is still limited. In this work, we introduce a template-free approach for 3D shape learning from a single video. It adopts an analysis-by-synthesis strategy that forward-renders object silhouette, optical flow, and pixels intensities to compare against video observations, which generates gradients signals to adjust the camera, shape and motion parameters. Without relying on a category-specific shape template, our method faithfully reconstructs nonrigid 3D structures from videos of human, animals, and objects of unknown classes in the wild.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Yang_2021_CVPR, author = {Yang, Gengshan and Sun, Deqing and Jampani, Varun and Vlasic, Daniel and Cole, Forrester and Chang, Huiwen and Ramanan, Deva and Freeman, William T. and Liu, Ce}, title = {LASR: Learning Articulated Shape Reconstruction From a Monocular Video}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2021}, pages = {15980-15989} }