Physical Property Understanding from Language-Embedded Feature Fields

Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 28296-28305

Abstract


Can computers perceive the physical properties of objects solely through vision? Research in cognitive science and vision science has shown that humans excel at identifying materials and estimating their physical properties based purely on visual appearance. In this paper, we present a novel approach for dense prediction of the physical properties of objects using a collection of images. Inspired by how humans reason about physics through vision, we leverage large language models to propose candidate materials for each object. We then construct a language-embedded point cloud and estimate the physical properties of each 3D point using a zero-shot kernel regression approach. Our method is accurate, annotation-free, and applicable to any object in the open world. Experiments demonstrate the effectiveness of the proposed approach in various physical property reasoning tasks, such as estimating the mass of common objects as well as other properties like friction and hardness.
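The following is a minimal sketch of the zero-shot kernel regression idea described above, not the authors' implementation: per-point language features are compared against text embeddings of LLM-proposed candidate materials, and a softmax-weighted average of the materials' property values (e.g., density) yields a per-point estimate, which can then be integrated over per-point volumes to predict mass. All function names, the cosine-similarity softmax kernel, and the example values are illustrative assumptions.

# Hypothetical sketch of per-point property regression from language features.
import numpy as np

def estimate_point_properties(point_feats, material_feats, material_values, temperature=0.1):
    """Kernel-regress a physical property (e.g., density in kg/m^3) onto 3D points.

    point_feats:     (N, D) language-embedded point features (L2-normalized)
    material_feats:  (M, D) text embeddings of candidate materials (L2-normalized)
    material_values: (M,)   property value assigned to each candidate material
    """
    # Cosine similarity between every point and every candidate material.
    sims = point_feats @ material_feats.T            # (N, M)
    # Softmax kernel weights over materials for each point.
    w = np.exp(sims / temperature)
    w /= w.sum(axis=1, keepdims=True)
    # Weighted average of material property values -> per-point estimate.
    return w @ material_values                        # (N,)

def estimate_mass(point_density, point_volume):
    """Integrate per-point density over per-point volume elements to get total mass."""
    return float(np.sum(point_density * point_volume))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, M, D = 1000, 3, 512
    pts = rng.normal(size=(N, D)); pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    mats = rng.normal(size=(M, D)); mats /= np.linalg.norm(mats, axis=1, keepdims=True)
    densities = np.array([700.0, 2700.0, 1200.0])     # e.g., wood, aluminum, plastic (kg/m^3)
    per_point_density = estimate_point_properties(pts, mats, densities)
    mass = estimate_mass(per_point_density, np.full(N, 1e-6))  # assume 1 cm^3 per point
    print(f"Estimated mass: {mass:.3f} kg")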

Related Material


[bibtex]
@InProceedings{Zhai_2024_CVPR,
    author    = {Zhai, Albert J. and Shen, Yuan and Chen, Emily Y. and Wang, Gloria X. and Wang, Xinlei and Wang, Sheng and Guan, Kaiyu and Wang, Shenlong},
    title     = {Physical Property Understanding from Language-Embedded Feature Fields},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {28296-28305}
}