High-Fidelity Character Animation: Generating Coherent and Controllable Motion Videos from Static Images

Yongming Huang, Zhuojun Xia, Can Bu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 6592-6601

Abstract


This paper focuses on video-driven character animation, which aims to generate temporally continuous and appearance-consistent character animation videos from static images using pose signals. Although diffusion models have become the mainstream solution, existing methods still face significant challenges in cross-frame detail consistency, motion controllability, and long-term temporal modeling. To address these limitations, we propose High-Fidelity Character Animation, an innovative diffusion-based video generation framework. First, we design an Identity Preservation Module to enhance the consistency of identity features during video generation. Second, we propose Domain Specific Adaptation Optimization to enhance generalization across diverse scenarios and complex poses. Finally, we develop a Spatio-temporal Optimization with Detail Enhancement to improve the representation of key region details and ensure temporal consistency across long video sequences. Experiments show that our model achieves state-of-the-art performance on two benchmark datasets (TikTok and TED-talks), exceeding the strongest baselines by 22.8% and 30.2% respectively in terms of video quality metrics.
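The cross-frame identity consistency the abstract emphasizes is commonly quantified by comparing per-frame identity embeddings against a reference frame. The sketch below is a generic illustration of such a metric, not the paper's Identity Preservation Module; the embeddings would in practice come from a pretrained identity encoder (an assumption here):

```python
import numpy as np

def identity_consistency(frame_embeddings):
    """Mean cosine similarity between each frame's identity embedding and
    the reference (first) frame's embedding. Higher means the character's
    appearance is more consistent across the generated video.

    frame_embeddings: array-like of shape (num_frames, embedding_dim),
    e.g. outputs of a pretrained identity encoder (hypothetical here).
    """
    emb = np.asarray(frame_embeddings, dtype=float)
    ref = emb[0] / np.linalg.norm(emb[0])          # unit-normalized reference
    norms = np.linalg.norm(emb[1:], axis=1)        # per-frame norms
    sims = (emb[1:] @ ref) / norms                 # cosine similarities
    return float(sims.mean())
```

A perfectly identity-preserving sequence (all embeddings parallel to the reference) scores 1.0; identity drift pushes the score toward 0.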

Related Material


[pdf]
[bibtex]
@InProceedings{Huang_2025_ICCV,
  author    = {Huang, Yongming and Xia, Zhuojun and Bu, Can},
  title     = {High-Fidelity Character Animation: Generating Coherent and Controllable Motion Videos from Static Images},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  month     = {October},
  year      = {2025},
  pages     = {6592-6601}
}