Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching

Yujing Sun, Caiyi Sun, Yuan Liu, Yuexin Ma, Siu Ming Yiu; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 545-556

Abstract


In this paper, we present a novel generalizable object pose estimation method to determine the object pose using only one RGB image. Unlike traditional approaches that rely on instance-level object pose estimation and necessitate extensive training data, our method offers generalization to unseen objects without extensive training, operates with a single reference image of the object, and eliminates the need for 3D object models or multiple views of the object. These characteristics are achieved by utilizing a diffusion model to generate novel-view images and conducting a two-sided matching on these generated images. Quantitative experiments demonstrate the superiority of our method over existing pose estimation techniques across both synthetic and real-world datasets. Remarkably, our approach maintains strong performance even in scenarios with significant viewpoint changes, highlighting its robustness and versatility in challenging conditions. The code will be released at https://github.com/scy639/Gen2SM
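
The sketch below is a minimal, illustrative rendering of the generate-then-match idea summarized in the abstract: sample candidate viewpoints, generate novel views from both the reference and the query image, and score matches in both directions before picking the best candidate pose. All helpers (sample_viewpoints, generate_novel_view, match_score) are hypothetical placeholders, not the authors' Gen2SM implementation; the real method uses a diffusion model for view generation and a learned matcher.

```python
# Conceptual sketch only: hypothetical placeholders, not the released Gen2SM code.
import numpy as np

def sample_viewpoints(n: int) -> np.ndarray:
    """Hypothetical: sample n candidate (azimuth, elevation) pairs in radians."""
    az = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    el = np.zeros(n)  # keep elevation fixed for simplicity
    return np.stack([az, el], axis=1)

def generate_novel_view(image: np.ndarray, pose: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a diffusion-based novel-view generator."""
    # A real system would condition a diffusion model on `image` and `pose`.
    return image  # placeholder: returns the input unchanged

def match_score(a: np.ndarray, b: np.ndarray) -> float:
    """Hypothetical matching score (here: cosine similarity of raw pixels)."""
    av, bv = a.ravel().astype(float), b.ravel().astype(float)
    return float(av @ bv / (np.linalg.norm(av) * np.linalg.norm(bv) + 1e-8))

def estimate_pose(query: np.ndarray, reference: np.ndarray, n_views: int = 36) -> np.ndarray:
    """Two-sided scheme: generate views from both images and score both matching directions."""
    poses = sample_viewpoints(n_views)
    scores = []
    for p in poses:
        ref_to_p = generate_novel_view(reference, p)   # reference -> candidate pose
        qry_to_p = generate_novel_view(query, -p)      # query -> inverse candidate pose
        # Combine both directions; a real method would fuse scores more carefully.
        scores.append(match_score(ref_to_p, query) + match_score(qry_to_p, reference))
    return poses[int(np.argmax(scores))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((64, 64, 3))
    qry = rng.random((64, 64, 3))
    print("Best candidate (azimuth, elevation):", estimate_pose(qry, ref))
```

The two-sided aspect shows up in scoring both generation directions (reference toward the candidate pose and query toward its inverse), which is one plausible way to read the abstract's description; the actual fusion and matching details are in the paper and released code.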

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Sun_2025_WACV,
    author    = {Sun, Yujing and Sun, Caiyi and Liu, Yuan and Ma, Yuexin and Yiu, Siu Ming},
    title     = {Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {545-556}
}