Robust 2D/3D Vehicle Parsing in Arbitrary Camera Views for CVIS

Hui Miao, Feixiang Lu, Zongdai Liu, Liangjun Zhang, Dinesh Manocha, Bin Zhou; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15631-15640

Abstract


We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS). Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters. First, to deal with multi-view data scarcity, we propose a part-assisted novel view synthesis algorithm for data augmentation. We train a part-based texture inpainting network in a self-supervised manner. Then we render the textured model into the background image with the target 6-DoF pose. Second, to handle various camera parameters, we present a new method that produces dense mappings between image pixels and 3D points to perform robust 2D/3D vehicle parsing. Third, we build the first CVIS dataset for benchmarking, which annotates more than 1540 images (14017 instances) from real-world traffic scenarios. We combine these novel algorithms and datasets to develop a robust approach for 2D/3D vehicle parsing for CVIS. In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation by 3.8%, 4.3%, and 2.9%, respectively.

Related Material


[pdf]
[bibtex]
@InProceedings{Miao_2021_ICCV, author = {Miao, Hui and Lu, Feixiang and Liu, Zongdai and Zhang, Liangjun and Manocha, Dinesh and Zhou, Bin}, title = {Robust 2D/3D Vehicle Parsing in Arbitrary Camera Views for CVIS}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2021}, pages = {15631-15640} }