CenterPoint Transformer for BEV Object Detection with Automotive Radar

Loveneet Saini, Yu Su, Hasan Tercan, Tobias Meisen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 4451-4460

Abstract


Object detection in Bird's Eye View (BEV) has emerged as a prevalent approach in automotive radar perception systems. Recent methods use Feature Pyramid Networks (FPNs) with large yet limited receptive fields to encode object properties. In contrast, Detection Transformers (DETRs), known for their application in image-based object detection, use a global receptive field and object queries with set losses. However, applying DETRs to sparse radar inputs is challenging due to limited object definition, resulting in inferior set matching. This paper addresses these limitations by introducing a novel approach that uses transformers to extract global context information and encode it into the object's center point. This approach aims to provide each object with individualized global context awareness in order to extract richer feature representations. Our experiments on the public NuScenes dataset show a significant increase in mAP for the car category of 23.6% over the best radar-only submission, alongside notable improvements for object detectors on the Aptiv dataset. Our modular architecture allows for easy integration of additional tasks, providing benefits as evidenced by a reduction in the mean L2 error of velocity prediction across different classes.
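To make the core idea concrete, the following is a minimal, hypothetical sketch of encoding global context into an object's center point: the feature at a detected center cell acts as an attention query over the entire flattened BEV grid, and the attended context is fused back into the center feature. This is an illustration only, not the paper's implementation; it uses single-head dot-product attention with identity projections, whereas a real model would use learned projections and multiple heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def center_global_context(bev, center_yx):
    """Attend from one object's center cell over the whole BEV grid.

    bev:       (H, W, C) BEV feature map (hypothetical toy input)
    center_yx: (row, col) of the object's center point
    Returns a (C,) feature fusing the local center feature with
    globally attended context via a residual connection.
    """
    H, W, C = bev.shape
    tokens = bev.reshape(H * W, C)       # flatten grid cells to tokens
    q = bev[center_yx]                   # query: the center cell, shape (C,)
    scores = tokens @ q / np.sqrt(C)     # scaled dot-product similarity
    weights = softmax(scores)            # attention over all H*W cells
    context = weights @ tokens           # global context vector, shape (C,)
    return q + context                   # fuse context into the center point

rng = np.random.default_rng(0)
bev = rng.normal(size=(8, 8, 16)).astype(np.float32)
feat = center_global_context(bev, (3, 5))
print(feat.shape)
```

Because every grid cell contributes to the attention weights, each center point receives an individualized summary of the full scene, in contrast to the bounded receptive field of an FPN.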

Related Material


[bibtex]
@InProceedings{Saini_2024_CVPR,
  author    = {Saini, Loveneet and Su, Yu and Tercan, Hasan and Meisen, Tobias},
  title     = {CenterPoint Transformer for BEV Object Detection with Automotive Radar},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {4451-4460}
}