- [pdf] [arXiv]
Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network
Monocular depth estimation is known as an ill-posed task that objects in a 2D image usually do not contain sufficient information to predict their depth. Thus, it acts differently from other tasks (e.g., classification and segmentation) in many ways. In this paper, we find that self-supervised monocular depth estimation shows a direction sensitivity and environmental dependency in the feature representation. But the current CNN backbones borrowed from other tasks cannot handle different types of environmental information efficiently, limiting the overall depth accuracy. To bridge this gap, we propose a new Direction-aware Cumulative Convolution Network (DaCCN), which improves the depth feature representation in two aspects. First, we propose a direction-aware module, which can learn to adjust the feature extraction in each direction, facilitating the encoding of different types of information. Secondly, we design a new cumulative convolution to improve the efficiency for aggregating important environmental information. Experiments show that our method achieves significant improvements on three widely used benchmarks and sets a new state-of-the-art performance on the popular benchmarks with all three types of self-supervision.