3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features

Chenfeng Xu, Huan Ling, Sanja Fidler, Or Litany; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 10617-10627

Abstract


3DiffTection introduces a novel method for 3D object detection from single images, utilizing a 3D-aware diffusion model for feature extraction. Addressing the resource-intensive nature of annotating large-scale 3D image data, our approach leverages pretrained diffusion models, traditionally used for 2D tasks, and adapts them for 3D detection through geometric and semantic tuning. Geometrically, we enhance the model to perform view synthesis from single images, incorporating an epipolar warp operator. This process utilizes easily accessible posed image data, eliminating the need for manual annotation. Semantically, the model is further refined on target detection data. Both stages utilize ControlNet, ensuring the preservation of original feature capabilities. Through our methodology, we obtain 3D-aware features that excel in identifying cross-view point correspondences. In 3D detection, 3DiffTection substantially surpasses previous benchmarks, e.g., Cube-RCNN, by 9.43% in AP3D on the Omni3D-ARkitscene dataset. Furthermore, 3DiffTection demonstrates robust label efficiency and generalizes well to cross-domain data, nearly matching fully-supervised models in zero-shot scenarios.

Related Material


@InProceedings{Xu_2024_CVPR,
    author    = {Xu, Chenfeng and Ling, Huan and Fidler, Sanja and Litany, Or},
    title     = {3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {10617-10627}
}