Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation?

Ignatov, Dmitry; Ignatov, Andrey; Timofte, Radu

Dmitry Ignatov, Andrey Ignatov, Radu Timofte; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6177-6186

Abstract

We present ANYU, a new virtually augmented version of the NYU depth v2 dataset, designed for monocular depth estimation. In contrast to the well-known approach where full 3D scenes of a virtual world are used to generate artificial datasets, ANYU was created by incorporating RGB-D representations of virtual reality objects into the original NYU depth v2 images. We deliberately did not match each generated virtual object with an appropriate texture and a suitable location within the real-world image. Instead, the assignment of texture, location, lighting, and other rendering parameters was randomized to maximize the diversity of the training data and to show that it is randomness that can improve the generalizing ability of a dataset. By conducting extensive experiments with our virtually modified dataset and validating on the original NYU depth v2 and iBims-1 benchmarks, we show that ANYU improves the monocular depth estimation performance and generalization of deep neural networks with considerably different architectures, especially for the current state-of-the-art VPD model. To the best of our knowledge, this is the first work that augments a real-world dataset with randomly generated virtual 3D objects for monocular depth estimation. We make our ANYU dataset publicly available in two training configurations, with 10% and 100% additional synthetically enriched RGB-D pairs of training images, respectively, for efficient training and empirical exploration of virtual augmentation.
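The core idea of inserting an RGB-D object into a real RGB-D image at a random location can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function name `insert_virtual_object`, its arguments, and the per-pixel depth test used to handle occlusion are all assumptions made for illustration, and the rendering of the object itself (texture, lighting) is taken as given.

```python
import numpy as np

def insert_virtual_object(rgb, depth, obj_rgb, obj_depth, obj_mask, rng):
    """Paste an RGB-D object patch into an RGB-D image at a random location.

    Hypothetical helper, not taken from the paper.
    rgb: (H, W, 3), depth: (H, W), obj_rgb: (h, w, 3),
    obj_depth: (h, w), obj_mask: (h, w) boolean, rng: np.random.Generator.
    """
    H, W = depth.shape
    h, w = obj_mask.shape
    # Random top-left corner chosen so the patch fits inside the image.
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))

    out_rgb = rgb.copy()
    out_depth = depth.copy()
    # Basic slices are views, so writes below modify out_rgb / out_depth.
    region_rgb = out_rgb[top:top + h, left:left + w]
    region_depth = out_depth[top:top + h, left:left + w]

    # Assumption: an object pixel is visible only where it lies closer to
    # the camera than the real scene (a simple z-buffer test).
    visible = obj_mask & (obj_depth < region_depth)
    region_rgb[visible] = obj_rgb[visible]
    region_depth[visible] = obj_depth[visible]
    return out_rgb, out_depth, (top, left)


rng = np.random.default_rng(0)
scene_rgb = np.zeros((480, 640, 3), dtype=np.uint8)
scene_depth = np.full((480, 640), 5.0, dtype=np.float32)   # metres, synthetic
patch_rgb = np.full((10, 10, 3), 255, dtype=np.uint8)
patch_depth = np.full((10, 10), 1.0, dtype=np.float32)
patch_mask = np.ones((10, 10), dtype=bool)

aug_rgb, aug_depth, corner = insert_virtual_object(
    scene_rgb, scene_depth, patch_rgb, patch_depth, patch_mask, rng
)
```

The RGB image and the depth map are updated with the same visibility mask, so the augmented pair stays consistent; an object placed behind the existing scene geometry leaves both unchanged.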

Related Material


[bibtex]

@InProceedings{Ignatov_2024_CVPR,
    author    = {Ignatov, Dmitry and Ignatov, Andrey and Timofte, Radu},
    title     = {Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation?},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {6177-6186}
}