Diffusion-Based Continuous Sign Language Generation with Cluster-Specific Fine-Tuning and Motion-Adapted Transformer

Razieh Rastgoo, Kourosh Kiani, Sergio Escalera; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops, 2025, pp. 4088-4097

Abstract


We propose a method for Continuous Sign Language Generation (CSLG) from skeleton data, enhancing motion realism, personalization, noise robustness, and long-range dependency modeling. To achieve this, we introduce a Temporal Transformer with Causal Self-Attention to model sequential relationships in text embeddings, ensuring linguistic coherence by conditioning representations on past and present contexts. Causal masking further maintains temporal consistency between textual input and motion synthesis. Since signer information is unavailable during testing, we propose an Encoder to map input text to signer-specific style parameters. Additionally, Cluster-Specific Fine-Tuning (CSFT) personalizes sign generation by clustering signers based on articulation speed, gesture amplitude, and stylistic variations, enabling fine-tuned adaptation. Using a diffusion-based motion-adapted Transformer, we generate natural and fluid sign sequences by refining skeletal movements over time, reducing abrupt transitions, and ensuring smooth articulation. To enhance robustness against noise in skeleton-based motion data, we introduce an Adaptive Filtering Mechanism that dynamically integrates Kalman filtering, Fourier-based motion smoothing, and Savitzky-Golay filtering to stabilize skeletal trajectories and reduce pose estimation artifacts. Results on three datasets across nine evaluation metrics demonstrate that our approach produces high-fidelity, signer-adaptive, and context-aware sign language motion.
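
The abstract does not specify how the causal self-attention is implemented, so the following is only a minimal, generic sketch of the idea that each text-embedding position attends to past and present positions and never to future ones. The function name `causal_self_attention` is hypothetical, and the identity query/key/value projections are a simplifying assumption; it is not the authors' implementation.

```python
import math


def causal_self_attention(x):
    """Single-head causal self-attention over a list of token vectors.

    Assumption (illustrative only): queries, keys and values are the
    inputs themselves, i.e. identity projections. A trained model would
    use learned projection matrices instead.
    """
    d = len(x[0])
    out = []
    for t, q in enumerate(x):
        # Causal mask: position t only sees positions 0..t.
        scores = [
            sum(a * b for a, b in zip(q, x[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of the visible value vectors.
        out.append(
            [sum(w * x[s][k] for s, w in enumerate(weights)) for k in range(d)]
        )
    return out


# Toy "text embeddings": three tokens, two dimensions each.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_self_attention(tokens)
```

Because of the mask, changing a later token leaves the outputs at earlier positions unchanged, which is the property the abstract relies on when it describes conditioning representations on past and present contexts.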

Related Material


@InProceedings{Rastgoo_2025_CVPR,
    author    = {Rastgoo, Razieh and Kiani, Kourosh and Escalera, Sergio},
    title     = {Diffusion-Based Continuous Sign Language Generation with Cluster-Specific Fine-Tuning and Motion-Adapted Transformer},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {4088-4097}
}