MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos

Fengrui Tian, Shaoyi Du, Yueqi Duan; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 17903-17913

Abstract


In this paper, we target the problem of learning a generalizable dynamic radiance field from monocular videos. Unlike most existing NeRF methods, which rely on multiple views, monocular videos contain only one view at each timestamp and therefore suffer from ambiguity along the view direction when estimating point features and scene flows. Previous studies such as DynNeRF disambiguate point features by positional encoding, which is not transferable and severely limits generalization. As a result, these methods must train one independent model per scene and incur heavy computational costs when applied to the growing number of monocular videos in real-world applications. To address this, we propose MonoNeRF, which simultaneously learns point features and scene flows under point trajectory and feature correspondence constraints across frames. More specifically, we learn an implicit velocity field to estimate point trajectories from temporal features with a Neural ODE, followed by a flow-based feature aggregation module that obtains spatial features along the point trajectory. We jointly optimize temporal and spatial features in an end-to-end manner. Experiments show that MonoNeRF is able to learn from multiple scenes and supports new applications such as scene editing, unseen frame synthesis, and fast novel scene adaptation. Code is available at https://github.com/tianfr/MonoNeRF.
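To make the two components described above concrete, below is a minimal, illustrative sketch of the idea: an implicit velocity field integrated over time to trace a point trajectory (here with simple Euler steps standing in for a Neural ODE solver), followed by aggregation of per-frame features sampled along that trajectory. All module names, layer sizes, and the sample_frame_feat callable are hypothetical placeholders, not the authors' implementation.

import torch
import torch.nn as nn


class VelocityField(nn.Module):
    """Predicts a 3D velocity for each point given its position, time, and a temporal feature."""

    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 1 + feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x, t, temporal_feat):
        # x: (N, 3) point positions, t: scalar time, temporal_feat: (N, feat_dim)
        t_col = torch.full_like(x[:, :1], t)
        return self.mlp(torch.cat([x, t_col, temporal_feat], dim=-1))


def trace_trajectory(velocity_field, x0, temporal_feat, t0=0.0, t1=1.0, steps=8):
    """Integrate the velocity field from t0 to t1 with Euler steps
    (a simple stand-in for a Neural ODE solver) and return trajectory points."""
    dt = (t1 - t0) / steps
    x, points = x0, [x0]
    for k in range(steps):
        t = t0 + k * dt
        x = x + dt * velocity_field(x, t, temporal_feat)
        points.append(x)
    return torch.stack(points, dim=0)  # (steps + 1, N, 3)


def aggregate_features(trajectory, sample_frame_feat):
    """Flow-based feature aggregation: sample a (hypothetical) per-frame
    feature function at every trajectory point and average over time."""
    feats = torch.stack([sample_frame_feat(p) for p in trajectory], dim=0)
    return feats.mean(dim=0)  # (N, feat_dim)

A full model would replace the Euler loop with an adaptive ODE solver and use a learned aggregation over the sampled features rather than a plain temporal average; this sketch only illustrates how a trajectory traced by a velocity field can link features across frames.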

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Tian_2023_ICCV,
    author    = {Tian, Fengrui and Du, Shaoyi and Duan, Yueqi},
    title     = {MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {17903-17913}
}