CG-HOI: Contact-Guided 3D Human-Object Interaction Generation

Christian Diller, Angela Dai; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 19888-19901

Abstract


We propose CG-HOI, the first method to address the task of generating dynamic 3D human-object interactions (HOIs) from text. We model the motion of both human and object in an interdependent fashion, as semantically rich human motion rarely happens in isolation without any interactions. Our key insight is that explicitly modeling contact between the human body surface and object geometry can be used as strong proxy guidance, both during training and inference. Using this guidance to bridge human and object motion enables generating more realistic and physically plausible interaction sequences, where the human body and corresponding object move in a coherent manner. Our method first learns to model human motion, object motion, and contact in a joint diffusion process, inter-correlated through cross-attention. We then leverage this learned contact for guidance during inference to synthesize realistic and coherent HOIs. Extensive evaluation shows that our joint contact-based human-object interaction approach generates realistic and physically plausible sequences, and we show two applications highlighting the capabilities of our method. Conditioned on a given object trajectory, we can generate the corresponding human motion without re-training, demonstrating strong human-object interdependency learning. Our approach is also flexible and can be applied to static real-world 3D scene scans.
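To make the idea of streams "inter-correlated through cross-attention" concrete, the following is a minimal NumPy sketch, not the paper's architecture. It assumes three per-frame feature sequences (human, object, contact) of equal width and lets each one attend to the other two with plain scaled dot-product attention. The function names `cross_attention` and `joint_step`, the feature sizes, and the residual update are illustrative assumptions; learned projections, the diffusion timestep, and text conditioning are all omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context):
    """Scaled dot-product attention: query (Tq, d) attends to context (Tc, d)."""
    d = query.shape[-1]
    weights = softmax(query @ context.T / np.sqrt(d), axis=-1)  # (Tq, Tc)
    return weights @ context, weights

def joint_step(human, obj, contact):
    """Hypothetical residual update: each stream attends to the other two."""
    h_new = human + cross_attention(human, np.concatenate([obj, contact]))[0]
    o_new = obj + cross_attention(obj, np.concatenate([human, contact]))[0]
    c_new = contact + cross_attention(contact, np.concatenate([human, obj]))[0]
    return h_new, o_new, c_new

# Toy features: T frames, d channels per stream (sizes are arbitrary).
rng = np.random.default_rng(0)
T, d = 8, 16
human, obj, contact = (rng.standard_normal((T, d)) for _ in range(3))
h_new, o_new, c_new = joint_step(human, obj, contact)
```

In this sketch every stream keeps its own shape while mixing in information from the other two, which is the basic mechanism that allows a jointly trained model to keep human motion, object motion, and contact mutually consistent.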

Related Material


[bibtex]
@InProceedings{Diller_2024_CVPR,
    author    = {Diller, Christian and Dai, Angela},
    title     = {CG-HOI: Contact-Guided 3D Human-Object Interaction Generation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19888-19901}
}