Iterative Latent Refinement for Robust Non-Autoregressive Sign Language Production

Tuğçe Kızıltepe, Sümeyye Meryem Taşyürek, Hacer Yalim Keles; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 4942-4952

Abstract


Sign language serves as an essential mode of communication for deaf and hard-of-hearing individuals. However, its computational modeling remains underdeveloped due to data scarcity and the complex spatio-temporal nature of sign articulation. Existing sign language production (SLP) methods, particularly those relying on autoregressive decoding, often suffer from error accumulation and oversmoothing, leading to unnatural and semantically diluted outputs. While non-autoregressive models offer faster inference and improved robustness, they struggle to generate detailed and expressive finger movements. In this work, we introduce ILRSLP, a gloss-free, non-autoregressive framework that employs iterative refinement over a structured latent pose space to enhance articulation accuracy and semantic coherence. Unlike prior refinement-based approaches in translation, our method regresses into a continuous, high-dimensional latent space, learned via an articulator-wise disentangled autoencoder. This design enables latent space regularization using articulator-specific priors, promoting stable learning and diverse motion generation. Experiments on the PHOENIX14T and CSL-Daily datasets demonstrate the effectiveness of the proposed framework.

Related Material


@InProceedings{Kiziltepe_2025_ICCV,
    author    = {K{\i}z{\i}ltepe, Tu\u{g}\c{c}e and Ta\c{s}y\"urek, S\"umeyye Meryem and Keles, Hacer Yalim},
    title     = {Iterative Latent Refinement for Robust Non-Autoregressive Sign Language Production},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {4942-4952}
}