@InProceedings{Mineo_2025_ICCV,
  author    = {Mineo, Raffaele and Sorrenti, Amelia and Caligiore, Gaia and Salanitri, Federica Proietto and Bellitto, Giovanni and Polikovsky, Senya and Fontana, Sabina and Ragonese, Egidio and Spampinato, Concetto and Palazzo, Simone},
  title     = {Text-Aligned Radar-Based Sign Language Recognition for Healthcare Communication},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  month     = {October},
  year      = {2025},
  pages     = {4953-4961}
}
Text-Aligned Radar-Based Sign Language Recognition for Healthcare Communication
Abstract
Sign language recognition (SLR) aims to bridge the communication gap between signers and non-signers through automated interpretation of signs. While most existing methods rely on RGB or depth video, we explore radar signals as a privacy-preserving alternative. We introduce TRACE (Text-Radar Alignment via Contrastive Embedding), a novel radar-based framework for SLR. TRACE extracts spatiotemporal embeddings from Range-Doppler Map (RDM) sequences using a 3D convolutional neural network (CNN). During training, it employs prompt-conditioned contrastive learning to align radar features with soft-prompted class embeddings from a frozen text encoder. At inference time, TRACE assigns to each radar sequence the label whose textual embedding is most similar to the visual representation. We evaluate our method on a radar-based Italian Sign Language (LIS) dataset, showing that it outperforms state-of-the-art baselines under standard classification metrics. The results and the ablation study confirm that joint training enhances embedding quality by guiding the CNN toward semantically meaningful features through text alignment, while learnable soft prompts yield more suitable textual representations.
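The abstract describes a contrastive-alignment scheme and a nearest-text-embedding inference rule. Below is a minimal NumPy sketch of these two pieces, not the authors' implementation: `contrastive_loss` is a generic InfoNCE-style objective standing in for the paper's prompt-conditioned contrastive training, and `classify` implements the stated inference rule (assign the label whose text embedding is most similar to the radar embedding). The 3D CNN radar encoder, the frozen text encoder, and the learnable soft prompts are all abstracted away as precomputed embedding matrices; all function and variable names are illustrative.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit sphere so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def contrastive_loss(radar_emb, text_emb, labels, temperature=0.07):
    """InfoNCE-style loss pulling each radar embedding toward its class's text embedding.

    radar_emb: (batch, dim) radar features (e.g. from a 3D CNN over RDM sequences)
    text_emb:  (num_classes, dim) class embeddings (e.g. from a frozen text encoder)
    labels:    (batch,) integer class indices
    """
    logits = l2_normalize(radar_emb) @ l2_normalize(text_emb).T / temperature
    # Cross-entropy over classes, with the ground-truth text embedding as the positive.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def classify(radar_emb, text_emb):
    """Inference: pick the class whose text embedding is most cosine-similar."""
    sims = l2_normalize(radar_emb) @ l2_normalize(text_emb).T
    return sims.argmax(axis=1)
```

In a toy setting where each radar embedding is a noisy copy of its class's text embedding, `classify` recovers the labels, and `contrastive_loss` is lower for the correct label assignment than for a shuffled one, which is the property the joint training objective exploits.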