Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

Yuqi Zhang, Guanying Chen, Jiaxing Chen, Shuguang Cui; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 21092-21103

Abstract


We present a neural radiance field method for urban-scale semantic and building-level instance segmentation from aerial images by lifting noisy 2D labels to 3D. This is a challenging problem due to two primary reasons. Firstly objects in urban aerial images exhibit substantial variations in size including buildings cars and roads which pose a significant challenge for accurate 2D segmentation. Secondly the 2D labels generated by existing segmentation methods suffer from the multi-view inconsistency problem especially in the case of aerial images where each image captures only a small portion of the entire scene. To overcome these limitations we first introduce a scale-adaptive semantic label fusion strategy that enhances the segmentation of objects of varying sizes by combining labels predicted from different altitudes harnessing the novel-view synthesis capabilities of NeRF. We then introduce a novel cross-view instance label grouping strategy based on the 3D scene representation to mitigate the multi-view inconsistency problem in the 2D instance labels. Furthermore we exploit multi-view reconstructed depth priors to improve the geometric quality of the reconstructed radiance field resulting in enhanced segmentation results. Experiments on multiple real-world urban-scale datasets demonstrate that our approach outperforms existing methods highlighting its effectiveness. The source code is available at https://github.com/zyqz97/Aerial_lifting.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Zhang_2024_CVPR, author = {Zhang, Yuqi and Chen, Guanying and Chen, Jiaxing and Cui, Shuguang}, title = {Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {21092-21103} }