RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union
Real-time and high-performance 3D object detection is an attractive research direction in autonomous driving. Recent studies prefer point based or voxel based convolution for achieving high performance. However, these methods suffer from the unsatisfied efficiency or complex customized convolution, making them unsuitable for applications with real-time requirements. In this paper, we present an efficient and effective 3D object detection framework, named RangeIoUDet that uses the range image as input. Benefiting from the dense representation of the range image, RangeIoUDet is entirely constructed based on 2D convolution, making it possible to have a fast inference speed. This model learns pointwise features from the range image, which is then passed to a region proposal network for predicting 3D bounding boxes. We optimize the pointwise feature and the 3D box via the point-based IoU and box-based IoU supervision, respectively. The point-based IoU supervision is proposed to make the network better learn the implicit 3D information encoded in the range image. The 3D Hybrid GIoU loss is introduced to generate high-quality boxes while providing an accurate quality evaluation. Through the point-based IoU and the box-based IoU, RangeIoUDet outperforms all single-stage models on the KITTI dataset, while running at 45 FPS for inference. Experiments on the self-built dataset further prove its effectiveness on different LIDAR sensors and object categories.