Neural Vision-Based Semantic 3D World Modeling

Sotirios Papadopoulos, Ioannis Mademlis, Ioannis Pitas; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2021, pp. 181-190


Scene geometry estimation and semantic segmentation from image/video data are two active research topics in machine learning and computer vision. Given monocular or stereoscopic 3D images, the geometry of the depicted scene and objects can be successfully estimated in the form of depth maps, while modern Deep Neural Network (DNN) architectures can accurately predict semantic masks on an image. In several scenarios, both tasks are required at once, leading to a need for combined semantic 3D world mapping methods. With the rise of modern autonomous systems, DNNs that handle both tasks simultaneously have emerged; they save considerably on computational resources and improve performance, as the two tasks can benefit from each other. A prominent application area is 3D road scene modeling and semantic segmentation, e.g., enabling an autonomous car to identify and localize in 3D space the visible pavement regions (marked as "road") that are essential for autonomous driving. Due to the significance of this field, this paper surveys state-of-the-art DNN-based methods for scene geometry estimation, image semantic segmentation, and joint inference of both.

Related Material

@InProceedings{Papadopoulos_2021_WACV,
    author    = {Papadopoulos, Sotirios and Mademlis, Ioannis and Pitas, Ioannis},
    title     = {Neural Vision-Based Semantic 3D World Modeling},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2021},
    pages     = {181-190}
}