PSMNet: Position-Aware Stereo Merging Network for Room Layout Estimation

Haiyan Wang, Will Hutchcroft, Yuguang Li, Zhiqiang Wan, Ivaylo Boyadzhiev, Yingli Tian, Sing Bing Kang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 8616-8625

Abstract


In this paper, we propose a new deep learning-based method for estimating room layout given a pair of 360° panoramas. Our system, called Position-aware Stereo Merging Network or PSMNet, is an end-to-end joint layout-pose estimator. PSMNet consists of a Stereo Pano Pose (SP^2) transformer and a novel Cross-Perspective Projection (CP^2) layer. The stereo-view SP^2 transformer is used to implicitly infer correspondences between views, and can handle noisy poses. The pose-aware CP^2 layer is designed to render features from the adjacent view to the anchor (reference) view, in order to perform view fusion and estimate the visible layout. Our experiments and analysis validate our method, which significantly outperforms the state-of-the-art layout estimators, especially for large and complex room spaces.

Related Material


@InProceedings{Wang_2022_CVPR,
    author    = {Wang, Haiyan and Hutchcroft, Will and Li, Yuguang and Wan, Zhiqiang and Boyadzhiev, Ivaylo and Tian, Yingli and Kang, Sing Bing},
    title     = {PSMNet: Position-Aware Stereo Merging Network for Room Layout Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {8616-8625}
}