Text-to-Any-Skeleton Motion Generation Without Retargeting

Qingyuan Liu, Ke Lv, Kun Dong, Jian Xue, Zehai Niu, Jinbao Wang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 12926-12936

Abstract


Text-driven motion generation has advanced considerably in recent years. However, existing methods are typically limited to standardized skeletons and rely on a cumbersome retargeting process to adapt to the varying skeletal configurations of diverse characters. In this paper, we present OmniSkel, a novel framework that can directly generate high-quality human motions for any user-defined skeleton without retargeting. Specifically, we introduce a skeleton-aware RVQ-VAE, which utilizes Kinematic Graph Cross Attention (K-GCA) to effectively integrate skeletal information into motion encoding and reconstruction. Moreover, we propose a simple yet effective training-free approach, the Motion Restoration Optimizer (MRO), to ensure zero bone length error while preserving motion smoothness. To facilitate our research, we construct SkeleMotion-3D, a large-scale text-skeleton-motion dataset based on HumanML3D. Extensive experiments demonstrate the excellent robustness and generalization of our method.
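To make the "zero bone length error" goal concrete, the sketch below shows a generic kinematic projection: each child joint is moved along its bone direction so its distance to its parent matches the target length, frame by frame. This is a minimal illustration under assumed conventions (a `(T, J, 3)` joint-position array and a topologically ordered parent index list), not the paper's actual MRO, which additionally optimizes for motion smoothness.

```python
import numpy as np

def restore_bone_lengths(joints, parents, target_lengths):
    """Project each child joint so its distance to its parent equals the
    target bone length, traversing the kinematic tree top-down.

    joints: (T, J, 3) joint positions over T frames.
    parents: parents[j] is the parent index of joint j (-1 for the root);
             assumed topologically ordered so parents[j] < j.
    target_lengths: target_lengths[j] = desired |joint_j - parent_j|.

    Note: a generic projection sketch, not the authors' MRO.
    """
    out = joints.copy()
    num_joints = joints.shape[1]
    for j in range(num_joints):
        p = parents[j]
        if p < 0:
            continue  # root joint has no incoming bone
        bone = out[:, j] - out[:, p]                      # (T, 3) bone vectors
        norm = np.linalg.norm(bone, axis=-1, keepdims=True)
        norm = np.maximum(norm, 1e-8)                     # guard against zero-length bones
        out[:, j] = out[:, p] + bone / norm * target_lengths[j]
    return out
```

Because joints are corrected in parent-to-child order, each projection preserves the already-corrected parent position, so all bone lengths match their targets exactly after one pass.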

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Liu_2025_ICCV, author = {Liu, Qingyuan and Lv, Ke and Dong, Kun and Xue, Jian and Niu, Zehai and Wang, Jinbao}, title = {Text-to-Any-Skeleton Motion Generation Without Retargeting}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2025}, pages = {12926-12936} }