@InProceedings{Lu_2025_CVPR,
  author    = {Lu, Guoyu},
  title     = {Shading Meets Motion: Self-supervised Indoor 3D Reconstruction Via Simultaneous Shape-from-Shading and Structure-from-Motion},
  booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
  month     = {June},
  year      = {2025},
  pages     = {16508-16519}
}
Shading Meets Motion: Self-supervised Indoor 3D Reconstruction Via Simultaneous Shape-from-Shading and Structure-from-Motion
Abstract
Scene reconstruction has a wide range of applications in computer vision and robotics. To build practical constraints and feature correspondences, classic and learning-based SfM particularly requires rich textures and distinct gradient variations. When reconstructing low-texture regions with repeated patterns, especially mostly-white indoor rooms, performance drops significantly. In this work, we propose Shading-SfM-Net, a novel framework that simultaneously learns a shape-from-shading network based on an inverse-rendering constraint and a structure-from-motion framework based on warped-keypoint and geometric consistency, to improve structure-from-motion and surface reconstruction for low-texture indoor scenes. Shading-SfM-Net tightly couples surface-shape consistency and a 3D geometric registration loss in order to exploit their mutual information and further overcome instability on flat regions. We evaluate the proposed framework on texture-less indoor scenes (NYUv2 and ScanNet) and show that, by simultaneously learning shading, motion, and shape, our pipeline achieves state-of-the-art performance with superior generalization to unseen texture-less datasets.
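As a rough illustration of the inverse-rendering constraint mentioned above, a common Lambertian formulation (an assumption here; the abstract does not specify the paper's exact rendering model, symbols, or loss) re-renders the image from predicted shape, albedo, and lighting and penalizes the photometric discrepancy:

```latex
% Assumed Lambertian image formation (not taken from the paper):
%   I(p)    observed intensity at pixel p
%   rho(p)  predicted albedo
%   n(p)    surface normal derived from the predicted depth
%   l       estimated lighting direction
\[
  \hat{I}(p) = \rho(p)\,\max\!\big(0,\; \mathbf{n}(p)\cdot\mathbf{l}\big),
  \qquad
  \mathcal{L}_{\mathrm{shading}} = \sum_{p}\big\lVert I(p) - \hat{I}(p) \big\rVert_{1}
\]
```

Under this kind of constraint, shading gradients supply shape cues even where texture gradients vanish, which is why it complements SfM on flat, mostly-white surfaces.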