A Robust Diffusion Modeling Framework for Radar Camera 3D Object Detection

Zizhang Wu, Yunzhe Wu, Xiaoquan Wang, Yuanzhu Gan, Jian Pu; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 3282-3292

Abstract


Radar-camera 3D object detection fuses radar signals with camera images to identify objects of interest and localize their 3D bounding boxes. To overcome the severe sparsity and ambiguity of radar signals, we propose a robust framework based on probabilistic denoising diffusion modeling. The framework is easily implementable on different multi-view 3D detectors and requires no LiDAR point clouds during either training or inference. Specifically, we first design a denoised radar-camera encoder by developing a lightweight denoising diffusion model with semantic embedding. Second, we extend query denoising training to 3D space by introducing reconstruction training on depth measurements for the transformer detection decoder. Our framework achieves new state-of-the-art performance on the nuScenes 3D detection benchmark with little increase in computational cost over the baseline detectors.
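
The sketch below is a minimal, generic illustration of what "denoising diffusion modeling" refers to in the abstract: a network is trained to predict the noise added to a feature map at a random diffusion timestep, optionally conditioned on image semantics. The network, feature shapes, and the semantic-embedding conditioning are hypothetical placeholders for explanation only, not the authors' architecture.

    # Generic DDPM-style training step on a radar feature map (illustrative only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    T = 1000                                   # number of diffusion steps
    betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)

    class TinyDenoiser(nn.Module):
        """Hypothetical lightweight denoiser conditioned on image semantics."""
        def __init__(self, channels=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels * 2 + 1, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        def forward(self, noisy_radar_feat, semantic_embed, t):
            # Broadcast the normalized timestep as an extra conditioning channel.
            t_map = (t.float() / T).view(-1, 1, 1, 1).expand(
                -1, 1, *noisy_radar_feat.shape[-2:])
            x = torch.cat([noisy_radar_feat, semantic_embed, t_map], dim=1)
            return self.net(x)                 # predicts the added noise

    def diffusion_training_step(model, radar_feat, semantic_embed):
        """Add noise at a random timestep, then regress the noise back."""
        b = radar_feat.shape[0]
        t = torch.randint(0, T, (b,))
        noise = torch.randn_like(radar_feat)
        a_bar = alphas_bar[t].view(-1, 1, 1, 1)
        noisy = a_bar.sqrt() * radar_feat + (1 - a_bar).sqrt() * noise
        pred_noise = model(noisy, semantic_embed, t)
        return F.mse_loss(pred_noise, noise)

    # Usage with dummy tensors standing in for radar features and image semantics.
    model = TinyDenoiser()
    loss = diffusion_training_step(model,
                                   torch.randn(2, 64, 32, 88),
                                   torch.randn(2, 64, 32, 88))
    loss.backward()

At inference, the same noise predictor would be applied iteratively to refine noisy radar features; the paper's specific conditioning and decoder-side query denoising are described in the full text.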

Related Material


[pdf]
[bibtex]
@InProceedings{Wu_2024_WACV,
    author    = {Wu, Zizhang and Wu, Yunzhe and Wang, Xiaoquan and Gan, Yuanzhu and Pu, Jian},
    title     = {A Robust Diffusion Modeling Framework for Radar Camera 3D Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {3282-3292}
}