Cross Transferring Activity Recognition to Word Level Sign Language Detection

Srijith Radhakrishnan, Nikhil C Mohan, Manisimha Varma, Jaithra Varma, Smitha N Pai; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 2446-2453

Abstract


The lack of large-scale labelled datasets for word-level sign language recognition (WSLR) poses a challenge to detecting sign language from videos. Most WSLR approaches operate on datasets that do not model real-world settings well, as they lack variability in signers, background, lighting, and inter-signer variation. We chose the MS-ASL dataset to overcome these limitations, as it models open-world settings well. This paper benchmarks successful action recognition architectures on the MS-ASL dataset using transfer learning. We achieve a new state-of-the-art accuracy of 92.35%, an improvement of 7.03% over the previous state of the art introduced in the MS-ASL paper. We analyze how action recognition architectures fare on the task of WSLR, and we propose SlowFast 8x8 ResNet-101 as a robust and suitable architecture for WSLR.
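
The core recipe described in the abstract is a standard transfer-learning setup: take an action-recognition backbone pretrained on a large video dataset and fine-tune it on MS-ASL. The sketch below illustrates one way to set this up in PyTorch, but it is not the authors' code; the PyTorchVideo hub entry point slowfast_r101, the 100-class vocabulary size, and the optimizer settings are illustrative assumptions.

    # Minimal transfer-learning sketch (assumptions noted above): load a
    # SlowFast ResNet-101 pretrained for action recognition and replace its
    # classification head for word-level sign language recognition (WSLR).
    import torch
    import torch.nn as nn

    NUM_SIGN_CLASSES = 100  # e.g. an MS-ASL100-style subset; adjust to the split used

    # Load an action-recognition-pretrained SlowFast model from the PyTorchVideo model zoo.
    model = torch.hub.load("facebookresearch/pytorchvideo", "slowfast_r101", pretrained=True)

    # Find the final linear projection of the classification head.
    last_name, last_linear = None, None
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            last_name, last_linear = name, module
    assert last_linear is not None, "expected a linear classification head"

    # Swap it for a fresh layer sized to the sign vocabulary.
    parent = model
    *path, attr = last_name.split(".")
    for p in path:
        parent = getattr(parent, p)
    setattr(parent, attr, nn.Linear(last_linear.in_features, NUM_SIGN_CLASSES))

    # Fine-tune end to end (or freeze the backbone first, depending on data size).
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

Note that SlowFast models consume two temporally sampled views of each clip (a slow and a fast pathway), so the data loader must supply both streams during fine-tuning.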

Related Material


[bibtex]
@InProceedings{Radhakrishnan_2022_CVPR,
    author    = {Radhakrishnan, Srijith and Mohan, Nikhil C and Varma, Manisimha and Varma, Jaithra and Pai, Smitha N},
    title     = {Cross Transferring Activity Recognition to Word Level Sign Language Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {2446-2453}
}