3D Implicit Transporter for Temporally Consistent Keypoint Discovery

Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 3869-3880

Abstract


Keypoint-based representation has proven advantageous in various visual and robotic tasks. However, the existing 2D and 3D methods for detecting keypoints mainly rely on geometric consistency to achieve spatial alignment, neglecting temporal consistency. To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information. However, the direct application of the Transporter to 3D point clouds is infeasible due to their structural differences from 2D images. Thus, we propose the first 3D version of the Transporter, which leverages hybrid 3D representation, cross attention, and implicit reconstruction. We apply this new learning system on 3D articulated objects/humans and show that learned keypoints are spatiotemporal consistent. Additionally, we propose a control policy that utilizes the learned keypoints for 3D object manipulation and demonstrate its superior performance. Our codes, data, and models will be made publicly available.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Zhong_2023_ICCV, author = {Zhong, Chengliang and Zheng, Yuhang and Zheng, Yupeng and Zhao, Hao and Yi, Li and Mu, Xiaodong and Wang, Ling and Li, Pengfei and Zhou, Guyue and Yang, Chao and Zhang, Xinliang and Zhao, Jian}, title = {3D Implicit Transporter for Temporally Consistent Keypoint Discovery}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {3869-3880} }