FoveaNet: Perspective-Aware Urban Scene Parsing

Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 784-792

Abstract


Parsing urban scene images is critical for self-driving. Most current solutions employ generic image parsing models that treat all scales and locations in an image equally and do not consider the geometric properties of car-captured urban scene images. As a result, they suffer from heterogeneous object scales caused by the camera's perspective projection of the actual scene, and they inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors. In this work, we propose a new model, FoveaNet, that fully exploits the perspective geometry of scene images and addresses the common failures of generic parsing models. FoveaNet estimates the perspective geometry of a scene image through a convolutional network that integrates supportive evidence from contextual objects within the image. Based on this perspective geometry information, FoveaNet "undoes" the camera perspective projection, analyzing regions in the space of the actual scene, and thus provides much more reliable parsing results. Furthermore, to effectively address recognition errors, FoveaNet introduces a new dense CRF model that takes the perspective geometry as a prior potential. We evaluate FoveaNet on two urban scene parsing datasets, Cityscapes and CamVid, and the results demonstrate that FoveaNet outperforms all the well-established baselines and achieves new state-of-the-art performance.

Related Material


[bibtex]
@InProceedings{Li_2017_ICCV,
author = {Li, Xin and Jie, Zequn and Wang, Wei and Liu, Changsong and Yang, Jimei and Shen, Xiaohui and Lin, Zhe and Chen, Qiang and Yan, Shuicheng and Feng, Jiashi},
title = {FoveaNet: Perspective-Aware Urban Scene Parsing},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}