Semantic Stereo Matching With Pyramid Cost Volumes

Zhenyao Wu, Xinyi Wu, Xiaoping Zhang, Song Wang, Lili Ju; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7484-7493


The accuracy of stereo matching has been greatly improved by deep learning with convolutional neural networks. To further capture the details of disparity maps, in this paper we propose a novel semantic stereo network named SSPCV-Net, which includes newly designed pyramid cost volumes for describing semantic and spatial information on multiple levels. The semantic features are inferred by a semantic segmentation subnetwork, while the spatial features are derived by hierarchical spatial pooling. Finally, we design a 3D multi-cost aggregation module to integrate the extracted multilevel features and perform regression for accurate disparity maps. We conduct comprehensive experiments and comparisons with recent stereo matching networks on the Scene Flow, KITTI 2015 and 2012, and Cityscapes benchmark datasets, and the results show that the proposed SSPCV-Net significantly advances state-of-the-art stereo matching performance.
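
As a rough illustration of the ideas named in the abstract, the following NumPy sketch builds cost volumes at several pooling levels and regresses a disparity map from a single cost volume. It is not the authors' implementation: the function names, the pooling sizes, the concatenation-based volume layout, and the soft-argmin regression are all assumptions chosen for illustration, and the semantic segmentation subnetwork and the 3D multi-cost aggregation module are omitted entirely.

```python
import numpy as np

def avg_pool2d(feat, k):
    """Average-pool a (C, H, W) feature map with a k x k window and stride k."""
    c, h, w = feat.shape
    h2, w2 = h // k, w // k
    # Crop so H and W are divisible by k, then average each k x k tile.
    tiles = feat[:, :h2 * k, :w2 * k].reshape(c, h2, k, w2, k)
    return tiles.mean(axis=(2, 4))

def cost_volume(left, right, max_disp):
    """Concatenation-based cost volume of shape (2C, D, H, W) (assumed layout)."""
    c, h, w = left.shape
    vol = np.zeros((2 * c, max_disp, h, w), dtype=left.dtype)
    for d in range(min(max_disp, w)):
        # Left pixel x is paired with right pixel x - d; unmatched columns stay zero.
        vol[:c, d, :, d:] = left[:, :, d:]
        vol[c:, d, :, d:] = right[:, :, :w - d]
    return vol

def pyramid_cost_volumes(left, right, max_disp, pool_sizes=(1, 2, 4)):
    """One cost volume per pooling level (pool_sizes is a hypothetical choice)."""
    volumes = []
    for k in pool_sizes:
        l, r = avg_pool2d(left, k), avg_pool2d(right, k)
        # The disparity range shrinks with the spatial resolution.
        volumes.append(cost_volume(l, r, max(1, max_disp // k)))
    return volumes

def soft_argmin(cost):
    """Regress a (H, W) disparity map from a (D, H, W) matching cost."""
    logits = -cost
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    disps = np.arange(cost.shape[0], dtype=np.float64).reshape(-1, 1, 1)
    # Expected disparity under the softmax distribution over candidates.
    return (prob * disps).sum(axis=0)

# Usage with random stand-in features (3 channels, 8 x 16 pixels).
rng = np.random.default_rng(0)
left_feat = rng.standard_normal((3, 8, 16))
right_feat = rng.standard_normal((3, 8, 16))
volumes = pyramid_cost_volumes(left_feat, right_feat, max_disp=8)
print([v.shape for v in volumes])
```

In a full network, each volume would be processed by learned 3D convolutions before regression; here the volumes are only constructed, and soft_argmin is shown separately on a plain (D, H, W) cost tensor.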

Related Material

author = {Wu, Zhenyao and Wu, Xinyi and Zhang, Xiaoping and Wang, Song and Ju, Lili},
title = {Semantic Stereo Matching With Pyramid Cost Volumes},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}