NeRF as Pretraining at Scale: Generalizable 3D-Aware Semantic Representation Learning from View Prediction

Wenyan Cong, Hanxue Liang, Zhiwen Fan, Peihao Wang, Yifan Jiang, Dejia Xu, A. Cengiz Oztireli, Zhangyang Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2872-2882

Abstract


Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes from several source views, are gaining prominence in the NeRF field. Discovering the potential signal of emerging capabilities in existing methods, we draw a parallel between BERT's "drop-and-predict" Masked Language Model (MLM) pretraining and novel view synthesis (NVS) in generalizable NeRF. In this work, we pioneer the scaling up of NVS as an effective pretraining strategy in a multi-view context. To bolster generalizability in pretraining, we incorporate a large-scale, minimally annotated dataset and proportionally increase the model size, revealing a neural scaling law akin to that observed in BERT. We also introduce innovative hardness-aware training techniques to enhance robust feature learning. Our model, named "NPS", demonstrates remarkable generalizability in both zero-shot and few-shot novel view synthesis. It further shows emergent capabilities in downstream tasks like few-shot multi-view semantic segmentation and depth estimation. Significantly, NPS reduces the necessity of training separate models for each task, underlining its versatility and efficiency. This approach sets a new precedent in the NeRF field and highlights the vast possibilities opened up by scaling up generalizable novel view synthesis.

Related Material


@InProceedings{Cong_2024_CVPR,
    author    = {Cong, Wenyan and Liang, Hanxue and Fan, Zhiwen and Wang, Peihao and Jiang, Yifan and Xu, Dejia and Oztireli, A. Cengiz and Wang, Zhangyang},
    title     = {NeRF as Pretraining at Scale: Generalizable 3D-Aware Semantic Representation Learning from View Prediction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2872-2882}
}