MVFuseNet: Improving End-to-End Object Detection and Motion Forecasting Through Multi-View Fusion of LiDAR Data

Ankit Laddha, Shivam Gautam, Stefan Palombo, Shreyash Pandey, Carlos Vallespi-Gonzalez; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 2865-2874

Abstract


In this work, we propose MVFuseNet, a novel end-to-end method for joint object detection and motion forecasting from a temporal sequence of LiDAR data. Most existing methods operate in a single view, projecting data into either the range view (RV) or the bird's eye view (BEV). In contrast, we propose a method that leverages the complementary strengths of both views. We accomplish this with a novel scheme that utilizes both RV and BEV for spatio-temporal feature learning in a temporal fusion network, as well as for multi-scale feature learning in the backbone network. Further, we propose a sequential fusion approach that effectively exploits multiple views in the temporal fusion network. We demonstrate the benefits of our multi-view approach on the tasks of detection and motion forecasting on two large-scale self-driving datasets, achieving state-of-the-art results. Furthermore, we show the scalability of MVFuseNet to increased operating range by demonstrating real-time performance.
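The two views the abstract refers to are standard projections of a LiDAR point cloud: BEV discretizes points onto a top-down (x, y) grid, while RV maps each point to a 2D image indexed by azimuth and elevation. The sketch below illustrates both projections under assumed grid sizes, ranges, and field-of-view values; these parameters, and the function names, are illustrative and not taken from the paper.

```python
import numpy as np

def to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), cell=0.5):
    """Bird's eye view: discretize (x, y) onto a top-down grid, keeping max height.
    Grid extent and cell size are assumed values, not the paper's configuration."""
    h = int((x_range[1] - x_range[0]) / cell)
    w = int((y_range[1] - y_range[0]) / cell)
    grid = np.zeros((h, w), dtype=np.float32)
    rows = ((points[:, 0] - x_range[0]) / cell).astype(int)
    cols = ((points[:, 1] - y_range[0]) / cell).astype(int)
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    for r, c, z in zip(rows[valid], cols[valid], points[valid, 2]):
        grid[r, c] = max(grid[r, c], z)
    return grid

def to_rv(points, h=32, w=512, v_fov=(-25.0, 3.0)):
    """Range view: spherical projection (azimuth, elevation) -> 2D range image.
    The 32x512 resolution and vertical FOV are assumptions for illustration."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.linalg.norm(points, axis=1)
    az = np.arctan2(y, x)                                    # azimuth in [-pi, pi]
    el = np.degrees(np.arcsin(z / np.maximum(rng, 1e-6)))    # elevation in degrees
    rows = ((v_fov[1] - el) / (v_fov[1] - v_fov[0]) * (h - 1)).astype(int)
    cols = ((az + np.pi) / (2 * np.pi) * (w - 1)).astype(int)
    img = np.zeros((h, w), dtype=np.float32)
    valid = (rows >= 0) & (rows < h)
    img[rows[valid], cols[valid]] = rng[valid]
    return img

# Tiny example cloud: three points in front of the sensor.
pts = np.array([[10.0, 0.0, 1.0], [20.0, 5.0, -1.0], [30.0, -10.0, 0.5]])
bev = to_bev(pts)   # dense top-down grid, preserves metric layout
rv = to_rv(pts)     # compact range image, preserves the sensor's native sampling
```

RV keeps the point cloud dense at range (matching the sensor's sweep pattern), while BEV preserves object scale and metric distances; fusing features learned in both, as the abstract describes, aims to capture both properties.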

Related Material


[bibtex]
@InProceedings{Laddha_2021_CVPR,
  author    = {Laddha, Ankit and Gautam, Shivam and Palombo, Stefan and Pandey, Shreyash and Vallespi-Gonzalez, Carlos},
  title     = {MVFuseNet: Improving End-to-End Object Detection and Motion Forecasting Through Multi-View Fusion of LiDAR Data},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2021},
  pages     = {2865-2874}
}