@InProceedings{Korny_2026_CVPR,
    author    = {Korny, Youssef and Yoo, Sunghwan and Panangian, Daniel and Bittner, Ksenia and Wichmann, Andreas and Sohn, Gunho},
    title     = {GeoPriorPC: Nadir-view to 3D Point Cloud Reconstruction for buildings via Two-Stage Diffusion Priors},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2026},
    pages     = {39-48}
}
GeoPriorPC: Nadir-view to 3D Point Cloud Reconstruction for buildings via Two-Stage Diffusion Priors
Abstract
Reconstructing 3D building point clouds from single nadir aerial images is a challenging task due to the inherent out-of-plane ambiguity. Existing diffusion-based methods rely on explicit 2D spatial features, which often struggle to accurately capture intricate architectural details, precise facade angles, and correct building proportions from top-down views. We propose GeoPriorPC, a two-stage diffusion framework that leverages a learned latent geometric prior to guide 3D point cloud generation. In the first stage, we fine-tune a pre-trained latent diffusion model to predict oblique surface normals from nadir inputs. Operating directly in the compressed latent space of the normal map provides robust geometric features while avoiding the computational overhead of pixel-space decoding. In the second stage, a Diffusion Point Transformer (DiPT) reconstructs the 3D geometry using a dual-conditioning mechanism. This backbone integrates the latent normal prior globally via adaptive Layer Normalization (adaLN-Zero) to ensure stable training convergence, and locally through cross-attention for fine-grained spatial alignment. Compared to state-of-the-art baselines, GeoPriorPC achieves a 44% relative improvement in F1-score and reduces the Earth Mover's Distance by nearly 49%, demonstrating a clear capacity to generate accurate and uniform 3D building structures. Code is available at https://github.com/youssef-shaban/GeoPriorPC
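To make the dual-conditioning idea concrete, the following is a minimal NumPy sketch of one block that combines a global adaLN-Zero modulation with a local cross-attention step. It is not taken from the GeoPriorPC code: the class name `DualConditionedBlock`, the mean-pooling of the prior tokens into a global vector, the single attention head, and the way the gate is applied to the cross-attention branch are all assumptions made for illustration. The one property it is meant to show is the adaLN-Zero initialization, where the modulation weights start at zero so the block begins as an identity mapping.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Parameter-free layer normalization over the feature axis."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class DualConditionedBlock:
    """Hypothetical block: adaLN-Zero (global) plus cross-attention (local)."""

    def __init__(self, dim, cond_dim, rng):
        # adaLN-Zero: the projection producing (shift, scale, gate) starts at
        # zero, so the gated residual branch contributes nothing at init.
        self.w_mod = np.zeros((cond_dim, 3 * dim))
        self.b_mod = np.zeros(3 * dim)
        # Single-head cross-attention projections (assumed layout).
        self.w_q = rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, dim))
        self.w_k = rng.normal(0.0, 1.0 / np.sqrt(cond_dim), (cond_dim, dim))
        self.w_v = rng.normal(0.0, 1.0 / np.sqrt(cond_dim), (cond_dim, dim))

    def __call__(self, point_tokens, prior_tokens):
        # point_tokens: (N, dim) noisy point features
        # prior_tokens: (M, cond_dim) latent normal-prior tokens
        # Global conditioning: pool the prior into one vector (assumption).
        global_cond = prior_tokens.mean(axis=0)
        shift, scale, gate = np.split(global_cond @ self.w_mod + self.b_mod, 3)
        h = layer_norm(point_tokens) * (1.0 + scale) + shift

        # Local conditioning: each point attends over the prior tokens.
        q = h @ self.w_q
        k = prior_tokens @ self.w_k
        v = prior_tokens @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)

        # Gated residual: an exact identity while gate == 0.
        return point_tokens + gate * (attn @ v)


rng = np.random.default_rng(0)
block = DualConditionedBlock(dim=8, cond_dim=4, rng=rng)
points = rng.normal(size=(16, 8))
prior = rng.normal(size=(5, 4))
out = block(points, prior)
```

With the zero-initialized modulation, `out` equals `points` at the start of training, which is the usual argument for why adaLN-Zero helps convergence; the conditioning only takes effect once `w_mod` and `b_mod` are updated.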

