Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jana Kosecka; The IEEE International Conference on Computer Vision (ICCV), 2019, pp. 8967-8976


We consider the problem of 3D object pose estimation. While much recent work has focused on the RGB domain, the reliance on accurately annotated images limits generalizability and scalability. On the other hand, easily available object CAD models are rich sources of data, providing a large number of synthetically rendered images. In this paper, we remove the need of existing methods for expensive 3D pose annotations by proposing a new method that matches RGB images to CAD models for object pose estimation. Our key innovations over existing work are that we require neither real-world textures for CAD models nor explicit 3D pose annotations for RGB images. We achieve this through a series of objectives that learn how to select keypoints and enforce viewpoint and modality invariance across RGB images and CAD model renderings. Our experiments demonstrate that the proposed method can reliably estimate object pose in RGB images and generalize to object instances not seen during training.
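To make the correspondence idea concrete, here is a minimal sketch of matching keypoint descriptors from an RGB image against descriptors from CAD renderings by nearest-neighbor cosine similarity. This is an illustrative stand-in, not the paper's actual learned matching: the descriptor shapes and the `match_keypoints` helper are assumptions, and modality invariance is simulated by perturbing the CAD descriptors with noise.

```python
import numpy as np

def match_keypoints(rgb_desc, cad_desc):
    """Match each RGB keypoint descriptor to its nearest CAD keypoint
    descriptor by cosine similarity (hypothetical stand-in for the
    paper's learned RGB-to-CAD correspondence matching)."""
    # L2-normalize descriptors so dot products equal cosine similarity.
    rgb = rgb_desc / np.linalg.norm(rgb_desc, axis=1, keepdims=True)
    cad = cad_desc / np.linalg.norm(cad_desc, axis=1, keepdims=True)
    sim = rgb @ cad.T                 # (N_rgb, N_cad) similarity matrix
    matches = sim.argmax(axis=1)      # best CAD keypoint per RGB keypoint
    scores = sim.max(axis=1)          # confidence of each match
    return matches, scores

# Toy example: RGB descriptors are noisy copies of CAD descriptors,
# mimicking a modality-invariant embedding that maps both domains close.
rng = np.random.default_rng(0)
cad_desc = rng.normal(size=(5, 8))
rgb_desc = cad_desc + 0.01 * rng.normal(size=(5, 8))
matches, scores = match_keypoints(rgb_desc, cad_desc)
print(matches)  # each RGB keypoint recovers its CAD counterpart: [0 1 2 3 4]
```

Given such 2D-3D correspondences, object pose would then typically be recovered with a PnP-style solver over the matched CAD keypoints.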

BibTeX

@InProceedings{Georgakis_2019_ICCV,
author = {Georgakis, Georgios and Karanam, Srikrishna and Wu, Ziyan and Kosecka, Jana},
title = {Learning Local RGB-to-CAD Correspondences for Object Pose Estimation},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}