3D Object Detection from Consecutive Monocular Images

Chia-Chun Cheng, Shang-Hong Lai; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


Detecting objects in 3D space plays an important role in scene understanding, such as urban autonomous driving and mobile robot navigation. Many image-based methods are recently proposed due to the high cost of LiDAR. However, monocular images are lack of depth information and difficult to detect objects with occlusion. In this paper, we propose to integrate 2D/3D object detection and 3D motion estimation for consecutive monocular images to overcome these problems. Additionally, we estimate the relative motion of the object between frames to reconstruct the scene in the previous timestamp. Then, we can recover depth cues from multi-view geometric constraints. To learn motion estimation from unlabeled data, we propose an unsupervised motion loss which learns 3D motion estimation from consecutive images. Our experiments on KITTI dataset show that the proposed method outperforms the state-of-the-art methods for 3D Pedestrian and Cyclist detection and achieves competitive results for 3D Car detection.

Related Material

@InProceedings{Cheng_2020_ACCV, author = {Cheng, Chia-Chun and Lai, Shang-Hong}, title = {3D Object Detection from Consecutive Monocular Images}, booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)}, month = {November}, year = {2020} }