HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models

Song, Wenfeng; Zhang, Xinyu; Li, Shuai; Gao, Yang; Hao, Aimin; Hou, Xia; Chen, Chenglizhao; Li, Ning; Qin, Hong

Wenfeng Song, Xinyu Zhang, Shuai Li, Yang Gao, Aimin Hao, Xia Hou, Chenglizhao Chen, Ning Li, Hong Qin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 811-820

Abstract

To date, the quest to rapidly and effectively produce human-object interaction (HOI) animations directly from textual descriptions stands at the forefront of computer vision research. The underlying challenge demands both a discriminating interpretation of language and a comprehensive, physics-centric model that supports real-world dynamics. To address this challenge, this paper proposes HOIAnimator, a novel and interactive diffusion model with perception ability, crafted to animate complex interactions from linguistic narratives. The effectiveness of our model is anchored in two innovations: (1) Our Perceptive Diffusion Models (PDM) bring together two types of models: one focused on human movements and the other on objects. This combination allows for animations in which humans and objects move in concert with each other, making the overall motion more realistic. Additionally, we propose a Perceptive Message Passing (PMP) mechanism to enhance the communication between the two models, ensuring that the animations are smooth and unified; (2) We devise an Interaction Contact Field (ICF), a model that implicitly captures the essence of HOIs. Beyond merely predicting contact points, the ICF assesses the proximity of the human and the object to their respective environment, informed by a probabilistic distribution of interactions learned throughout the denoising phase. Our comprehensive evaluation showcases HOIAnimator's superior ability to produce dynamic, context-aware animations that surpass existing benchmarks in text-driven animation synthesis.
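The abstract does not say how the ICF is computed. As a rough intuition for what a proximity-based contact score could look like, the sketch below assigns each human joint a soft score that decays with its distance to the nearest object point. This is a toy illustration only, not the paper's model: the function name `contact_field`, the Gaussian falloff, and the bandwidth `sigma` are all assumptions, and the learned, probabilistic aspect described in the abstract is not represented.

```python
import math

def contact_field(joints, object_points, sigma=0.05):
    """Toy proximity score per joint (hypothetical, not the paper's ICF).

    For each joint, find the distance to the nearest object point and map
    it through a Gaussian falloff, so a touching joint scores 1.0 and a
    distant joint scores close to 0.0.
    """
    scores = []
    for joint in joints:
        # Distance from this joint to its nearest object point.
        nearest = min(math.dist(joint, point) for point in object_points)
        # Assumed falloff: Gaussian in the nearest distance.
        scores.append(math.exp(-(nearest ** 2) / (2.0 * sigma ** 2)))
    return scores

# Two joints: one touching the object, one about a metre away.
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
object_points = [(0.0, 0.0, 0.0), (0.0, 0.1, 0.0)]
scores = contact_field(joints, object_points)
```

In this toy setup, a smaller `sigma` makes the score fall off more sharply with distance, so only joints very close to the object register as being in contact.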

Related Material

[bibtex]

@InProceedings{Song_2024_CVPR,
    author    = {Song, Wenfeng and Zhang, Xinyu and Li, Shuai and Gao, Yang and Hao, Aimin and Hou, Xia and Chen, Chenglizhao and Li, Ning and Qin, Hong},
    title     = {HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {811-820}
}