Understanding 3D Object Articulation in Internet Videos

Shengyi Qian, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 1599-1609

Abstract


We investigate detecting and characterizing the 3D planar articulation of objects from ordinary RGB videos. While seemingly easy for humans, this problem poses many challenges for computers. Our approach is based on a top-down detection system that finds planes that can be articulated. Given a sequence of detected articulations, we then optimize for a 3D plane that explains them. We show that this system can be trained on a combination of videos and 3D scan datasets. When tested on a dataset of challenging Internet videos and on the Charades dataset, our approach obtains strong performance.
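To make the idea of explaining a sequence of detections with a single articulation concrete, here is a minimal sketch. It is not the optimization used in the paper. It assumes a hinge-like rotation in which every per-frame plane normal is perpendicular to a fixed axis, and it swaps in a simple geometric estimate based on cross products of consecutive normals. The function names `estimate_axis` and `rotation_angles` are hypothetical.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Return v scaled to unit length; raise if v has zero length."""
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(a / n for a in v)

def estimate_axis(normals):
    """Estimate a rotation-axis direction from per-frame plane normals.

    Assumption (not from the paper): the plane rotates about an axis that
    is perpendicular to every normal, as for a door on a hinge. The cross
    product of two consecutive normals is then parallel to that axis.
    """
    unit = [normalize(n) for n in normals]
    total = (0.0, 0.0, 0.0)
    for a, b in zip(unit, unit[1:]):
        c = cross(a, b)
        # Flip contributions that oppose the running sum so that
        # back-and-forth motion does not cancel out.
        if dot(c, total) < 0:
            c = tuple(-x for x in c)
        total = tuple(t + x for t, x in zip(total, c))
    # Raises ValueError if the normals never change (no articulation).
    return normalize(total)

def rotation_angles(normals, axis):
    """Signed angle (radians) of each normal relative to the first one,
    measured about the given axis."""
    unit = [normalize(n) for n in normals]
    ref = unit[0]
    return [math.atan2(dot(cross(ref, n), axis), dot(ref, n)) for n in unit]

# Synthetic example: a plane swinging about the z-axis.
thetas = [0.0, 0.3, 0.6, 0.9]
normals = [(math.cos(t), math.sin(t), 0.0) for t in thetas]
axis = estimate_axis(normals)
angles = rotation_angles(normals, axis)
```

In this synthetic case the recovered axis is the z-axis and the recovered angles match the angles used to generate the normals. Real detections are noisy, so a robust fit over the whole sequence would be needed in practice.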

Related Material


@InProceedings{Qian_2022_CVPR,
    author    = {Qian, Shengyi and Jin, Linyi and Rockwell, Chris and Chen, Siyi and Fouhey, David F.},
    title     = {Understanding 3D Object Articulation in Internet Videos},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {1599-1609}
}