PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI

Yandan Yang, Baoxiong Jia, Peiyuan Zhi, Siyuan Huang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 16262-16272

Abstract


With recent developments in Embodied Artificial Intelligence (EAI) research, there has been a growing demand for high-quality, large-scale interactive scene generation. While prior methods in scene synthesis have prioritized the naturalness and realism of the generated scenes, the physical plausibility and interactivity of scenes have been largely left unexplored. To address this disparity, we introduce PhyScene, a novel method dedicated to generating interactive 3D scenes characterized by realistic layouts, articulated objects, and rich physical interactivity tailored for embodied agents. Based on a conditional diffusion model for capturing scene layouts, we devise novel physics- and interactivity-based guidance mechanisms that integrate constraints from object collision, room layout, and object reachability. Through extensive experiments, we demonstrate that PhyScene effectively leverages these guidance functions for physically interactable scene synthesis, outperforming existing state-of-the-art scene synthesis methods by a large margin. Our findings suggest that the scenes generated by PhyScene hold considerable potential for facilitating diverse skill acquisition among agents within interactive environments, thereby catalyzing further advancements in embodied AI research.
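The guidance mechanism described in the abstract, steering a diffusion sampler with gradients of constraint functions such as a collision cost, can be sketched in a generic form. The following is a minimal illustration only, not PhyScene's actual implementation: the learned denoiser is replaced by an identity placeholder, the collision cost is a simple pairwise box-overlap penalty, and all function names (`collision_cost`, `guided_reverse_steps`) are hypothetical.

```python
import numpy as np

def collision_cost(boxes):
    """Hypothetical guidance cost: total pairwise overlap area between
    axis-aligned 2D boxes given as (cx, cy, w, h). Lower is better."""
    total = 0.0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = boxes[i], boxes[j]
            ox = max(0.0, min(cx1 + w1 / 2, cx2 + w2 / 2) - max(cx1 - w1 / 2, cx2 - w2 / 2))
            oy = max(0.0, min(cy1 + h1 / 2, cy2 + h2 / 2) - max(cy1 - h1 / 2, cy2 - h2 / 2))
            total += ox * oy
    return total

def numeric_grad(f, x, eps=1e-4):
    """Central-difference gradient, standing in for autodiff in this sketch."""
    grad = np.zeros_like(x)
    flat, gflat = x.reshape(-1), grad.reshape(-1)
    for k in range(flat.size):
        orig = flat[k]
        flat[k] = orig + eps
        hi = f(x)
        flat[k] = orig - eps
        lo = f(x)
        flat[k] = orig
        gflat[k] = (hi - lo) / (2 * eps)
    return grad

def guided_reverse_steps(x, steps=50, guidance_scale=0.5):
    """Toy guided sampling loop: each reverse step would normally apply the
    learned denoiser (omitted here), then nudge the layout down the gradient
    of the guidance cost so constraint violations shrink during sampling."""
    for _ in range(steps):
        grad = numeric_grad(collision_cost, x)
        x = x - guidance_scale * grad  # guidance term only in this sketch
    return x
```

In the paper's setting, several such costs (collision, room-layout, reachability) would be combined into one guidance signal, and the gradient would come from autodiff through differentiable constraint functions rather than finite differences.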

Related Material


[bibtex]
@InProceedings{Yang_2024_CVPR,
    author    = {Yang, Yandan and Jia, Baoxiong and Zhi, Peiyuan and Huang, Siyuan},
    title     = {PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {16262-16272}
}