DepthNet: A Recurrent Neural Network Architecture for Monocular Depth Prediction

Arun CS Kumar, Suchendra M. Bhandarkar, Mukta Prasad; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 283-291

Abstract


Predicting the depth map of a scene is often a vital component of monocular SLAM pipelines. Depth prediction is fundamentally ill-posed due to the inherent ambiguity in the scene formation process. Recently, convolutional neural networks (CNNs) that exploit geometric constraints of the scene have been explored extensively for supervised single-view depth prediction and semi-supervised 2-view depth prediction. In this paper we explore whether recurrent neural networks (RNNs) can learn spatio-temporally accurate monocular depth prediction from video sequences, even without an explicit definition of inter-frame geometric consistency or pose supervision. To this end, we propose a novel convolutional LSTM (ConvLSTM)-based network architecture for depth prediction from a monocular video sequence. In the proposed ConvLSTM network architecture, we harness the ability of long short-term memory (LSTM)-based RNNs to reason sequentially and predict the depth map for an image frame as a function of the appearances of scene objects in that frame as well as in the frames in its temporal neighborhood. In addition, the proposed ConvLSTM network is shown to make depth predictions for future or unseen image frame(s). We demonstrate the depth prediction performance of the proposed ConvLSTM network on the KITTI dataset and show that its results are more accurate than those obtained via depth-supervised and self-supervised methods and comparable to those generated by state-of-the-art pose-supervised methods.
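
The recurrence described in the abstract can be illustrated with a minimal NumPy sketch of a generic ConvLSTM cell (without peephole connections) unrolled over a short frame sequence. This is not the network proposed in the paper: the function names `conv2d_same`, `convlstm_step` and `predict_depth_sequence`, the hidden size, the kernel size, the random untrained weights, and the 1x1 exponential output head are all assumptions made for illustration. It only shows how a depth map for frame t can depend on the current frame and on state carried over from earlier frames.

```python
import numpy as np

def conv2d_same(x, w):
    """Cross-correlate x (C_in, H, W) with w (C_out, C_in, k, k), k odd,
    using zero padding so the output keeps the H x W resolution."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum('oc,chw->ohw', w[:, :, dy, dx], patch)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, w, b):
    """One ConvLSTM update.
    x: current frame features (C_in, H, W); h, c: hidden and cell state (hidden, H, W).
    w: (4 * hidden, C_in + hidden, k, k); b: (4 * hidden,)."""
    z = conv2d_same(np.concatenate([x, h], axis=0), w) + b[:, None, None]
    i, f, o, g = np.split(z, 4, axis=0)      # input, forget, output gates, candidate
    c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_next = sigmoid(o) * np.tanh(c_next)
    return h_next, c_next

def predict_depth_sequence(frames, hidden=8, k=3, seed=0):
    """Unroll the cell over frames (T, C_in, H, W) and emit one map per frame.
    Weights are random (untrained); the exp head is a hypothetical choice
    that simply keeps the predicted values positive."""
    rng = np.random.default_rng(seed)
    _, c_in, height, width = frames.shape
    w = rng.normal(0.0, 0.1, (4 * hidden, c_in + hidden, k, k))
    b = np.zeros(4 * hidden)
    w_out = rng.normal(0.0, 0.1, (1, hidden, 1, 1))
    h = np.zeros((hidden, height, width))
    c = np.zeros_like(h)
    depths = []
    for x in frames:                          # state h, c carries temporal context
        h, c = convlstm_step(x, h, c, w, b)
        depths.append(np.exp(conv2d_same(h, w_out)))
    return np.stack(depths)                   # (T, 1, H, W)

frames = np.random.default_rng(1).random((4, 3, 6, 8))  # 4 RGB frames of size 6 x 8
depth_maps = predict_depth_sequence(frames)
print(depth_maps.shape)
```

Because the hidden and cell states persist across the loop, the output at each time step is a function of all frames seen so far. A trained model of this kind could in principle also be stepped forward past the last observed frame, which is the setting the abstract refers to as prediction for future frames.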

Related Material


[bibtex]
@InProceedings{Kumar_2018_CVPR_Workshops,
author = {CS Kumar, Arun and Bhandarkar, Suchendra M. and Prasad, Mukta},
title = {DepthNet: A Recurrent Neural Network Architecture for Monocular Depth Prediction},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018},
pages = {283-291}
}