Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer

Hyeongjin Nam, Daniel Sungho Jung, Gyeongsik Moon, Kyoung Mu Lee; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 10218-10227

Abstract


Human-object contact serves as a strong cue to understand how humans physically interact with objects. Nevertheless it is not widely explored to utilize human-object contact information for the joint reconstruction of 3D human and object from a single image. In this work we present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between humans and objects. There are two core designs in our system: 1) 3D-guided contact estimation and 2) contact-based 3D human and object refinement. First for accurate human-object contact estimation CONTHO initially reconstructs 3D humans and objects and utilizes them as explicit 3D guidance for contact estimation. Second to refine the initial reconstructions of 3D human and object we propose a novel contact-based refinement Transformer that effectively aggregates human features and object features based on the estimated human-object contact. The proposed contact-based refinement prevents the learning of erroneous correlation between human and object which enables accurate 3D reconstruction. As a result our CONTHO achieves state-of-the-art performance in both human-object contact estimation and joint reconstruction of 3D human and object. The codes are available in https://github.com/dqj5182/CONTHO_RELEASE.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Nam_2024_CVPR, author = {Nam, Hyeongjin and Jung, Daniel Sungho and Moon, Gyeongsik and Lee, Kyoung Mu}, title = {Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {10218-10227} }