DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses

Chen Zhao, Tong Zhang, Zheng Dang, Mathieu Salzmann; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 20485-20495

Abstract


Determining the relative pose of an object between two images is pivotal to the success of generalizable object pose estimation. Existing approaches typically approximate the continuous pose representation with a large number of discrete pose hypotheses, which incurs a computationally expensive process of scoring each hypothesis at test time. By contrast, we present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass. To this end, we map the two input RGB images, reference and query, to their respective voxelized 3D representations. We then pass the resulting voxels through a pose estimation module, where the voxels are aligned and the pose is computed in an end-to-end fashion by solving a least-squares problem. To enhance robustness, we introduce a weighted closest voxel algorithm capable of mitigating the impact of noisy voxels. We conduct extensive experiments on the CO3D, LINEMOD, and Objaverse datasets, demonstrating that our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods. Our code is released at: https://github.com/sailor-z/DVMNet.
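
To make the least-squares step concrete, the following is a minimal sketch of a weighted rigid alignment between two sets of matched 3D points, using the standard weighted Kabsch (Procrustes) solution via SVD. It is not taken from the DVMNet code: the function name `weighted_rigid_align`, the assumption that voxel correspondences are already given as matched point pairs, and the hand-set weights are all illustrative. In the paper's setting the voxel features and weights come from the network; here a weight of zero simply shows how a noisy match can be suppressed.

```python
import numpy as np


def weighted_rigid_align(src, dst, weights):
    """Return (R, t) minimizing sum_i w_i * ||R @ src[i] + t - dst[i]||^2.

    src, dst: (N, 3) arrays of matched 3D points (hypothetical stand-ins
    for matched voxel centers). weights: (N,) non-negative confidences.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one

    # Weighted centroids of both point sets.
    src_mean = (w[:, None] * src).sum(axis=0)
    dst_mean = (w[:, None] * dst).sum(axis=0)

    # Weighted cross-covariance of the centered points (3x3).
    src_c = src - src_mean
    dst_c = dst - dst_mean
    H = (w[:, None] * src_c).T @ dst_c

    # Rotation from the SVD of H, with a sign fix to exclude reflections.
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the source centroid onto the target centroid.
    t = dst_mean - R @ src_mean
    return R, t
```

Because the solution is closed-form and built from differentiable operations, a step of this kind can sit inside an end-to-end trainable pipeline, which is consistent with the single-pass design described in the abstract. Down-weighting unreliable matches keeps them from skewing the centroids and the cross-covariance.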

Related Material


@InProceedings{Zhao_2024_CVPR,
    author    = {Zhao, Chen and Zhang, Tong and Dang, Zheng and Salzmann, Mathieu},
    title     = {DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {20485-20495}
}