Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection

Su Pang, Daniel Morris, Hayder Radha; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 187-196

Abstract


When compared to single modality approaches, fusion-based object detection methods often require more complex models to integrate heterogeneous sensor data, and use more GPU memory and computational resources. This is particularly true for camera-LiDAR based multimodal fusion, which may require three separate deep-learning networks and/or processing pipelines that are designated for the visual data, LiDAR data, and for some form of a fusion framework. In this paper, we propose Fast Camera-LiDAR Object Candidates (Fast-CLOCs) fusion network that can run high-accuracy fusion-based 3D object detection in near real-time. Fast-CLOCs operates on the output candidates before Non-Maximum Suppression (NMS) of any 3D detector, and adds a lightweight 3D detector-cued 2D image detector (3D-Q-2D) to extract visual features from the image domain to improve 3D detections significantly. The 3D detection candidates are shared with the proposed 3D-Q-2D image detector as proposals to reduce the network complexity drastically. The superior experimental results of our Fast-CLOCs on the challenging KITTI and nuScenes datasets illustrate that our Fast-CLOCs outperforms state-of-the-art fusion-based 3D object detection approaches.

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Pang_2022_WACV, author = {Pang, Su and Morris, Daniel and Radha, Hayder}, title = {Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2022}, pages = {187-196} }