Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection

Ma, Xinzhu; Wang, Yongtao; Zhang, Yinmin; Xia, Zhiyi; Meng, Yuan; Wang, Zhihui; Li, Haojie; Ouyang, Wanli

Xinzhu Ma, Yongtao Wang, Yinmin Zhang, Zhiyi Xia, Yuan Meng, Zhihui Wang, Haojie Li, Wanli Ouyang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 6425-6435

Abstract

In this work, we build a modular codebase, formulate strong training recipes, design an error diagnosis toolbox, and discuss current methods for image-based 3D object detection. Unlike highly mature tasks such as 2D object detection, image-based 3D object detection is still evolving, and methods often adopt different training recipes and tricks, resulting in unfair evaluations and comparisons. Worse, the gains from these tricks may outweigh those of the proposed designs, even leading to wrong conclusions. To address this issue, we build a modular codebase and formulate unified training standards for the community. We also design an error diagnosis toolbox to characterize detection models in detail. Using these tools, we analyze current methods in depth under varying settings and discuss some open questions, e.g., the discrepancies between conclusions drawn on the KITTI-3D and nuScenes datasets, which have led to different dominant methods on each dataset. We hope that this work will facilitate future research in vision-based 3D detection. Our code will be released at https://github.com/OpenGVLab/3dodi.

Related Material

[bibtex]

@InProceedings{Ma_2023_ICCV,
    author    = {Ma, Xinzhu and Wang, Yongtao and Zhang, Yinmin and Xia, Zhiyi and Meng, Yuan and Wang, Zhihui and Li, Haojie and Ouyang, Wanli},
    title     = {Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {6425-6435}
}