Fingerspelling Recognition with Semi-Markov Conditional Random Fields

Taehwan Kim, Greg Shakhnarovich, Karen Livescu; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1521-1528


Recognition of gesture sequences is in general a very difficult problem, but in certain domains the difficulty may be mitigated by exploiting the domain's "grammar". One such grammatically constrained gesture sequence domain is sign language. In this paper we investigate the case of fingerspelling recognition, which can be very challenging due to the quick, small motions of the fingers. Most prior work on this task has assumed a closed vocabulary of fingerspelled words; here we study the more natural open-vocabulary case, where the only domain knowledge is the possible fingerspelled letters and statistics of their sequences. We develop a semi-Markov conditional model approach, where feature functions are defined over segments of video and their corresponding letter labels. We use classifiers of letters and linguistic handshape features, along with expected motion profiles, to define segmental feature functions. This approach improves letter error rate (Levenshtein distance between hypothesized and correct letter sequences) from 16.3% using a hidden Markov model baseline to 11.6% using the proposed semi-Markov model.
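To make the segmental formulation and the error metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes a generic semi-Markov decoder that picks the highest-scoring segmentation of a frame sequence into labeled segments, and it computes letter error rate as Levenshtein distance divided by the reference length. The names `segmental_viterbi`, `segment_score`, `transition_score`, `max_len`, and `letter_error_rate` are hypothetical, as are the toy per-frame scores. The paper's actual feature functions (letter and handshape classifiers, motion profiles) are not reproduced here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start frame, end frame exclusive, letter)


def segmental_viterbi(
    num_frames: int,
    labels: Sequence[str],
    segment_score: Callable[[int, int, str], float],
    transition_score: Callable[[str, str], float],
    max_len: int,
) -> Tuple[float, List[Segment]]:
    """Highest-scoring segmentation of frames [0, num_frames).

    Assumes all scores are finite and num_frames >= 1.
    """
    neg_inf = float("-inf")
    # best[t][y]: best score over segmentations of [0, t) ending in label y.
    best = [{y: neg_inf for y in labels} for _ in range(num_frames + 1)]
    back: List[dict] = [{y: None for y in labels} for _ in range(num_frames + 1)]
    for t in range(1, num_frames + 1):
        for y in labels:
            for d in range(1, min(max_len, t) + 1):  # candidate segment length
                s = t - d
                seg = segment_score(s, t, y)
                prev: Optional[str] = None
                if s == 0:
                    cand = seg  # first segment, no predecessor
                else:
                    prev = max(labels, key=lambda p: best[s][p] + transition_score(p, y))
                    cand = best[s][prev] + transition_score(prev, y) + seg
                if cand > best[t][y]:
                    best[t][y] = cand
                    back[t][y] = (s, prev)
    # Trace back from the best final label.
    y = max(labels, key=lambda l: best[num_frames][l])
    score = best[num_frames][y]
    segments: List[Segment] = []
    t = num_frames
    while t > 0:
        s, prev = back[t][y]
        segments.append((s, t, y))
        t, y = s, prev
    segments.reverse()
    return score, segments


def letter_error_rate(hyp: str, ref: str) -> float:
    """Levenshtein distance between hyp and ref, divided by len(ref) (ref non-empty)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (h != r)))
        prev = cur
    return prev[-1] / len(ref)


# Toy data (hypothetical): per-frame letter scores for a 5-frame clip.
frame_scores = [
    {"a": 1.0, "b": 0.0},
    {"a": 1.0, "b": 0.0},
    {"a": 0.0, "b": 1.0},
    {"a": 0.0, "b": 1.0},
    {"a": 0.0, "b": 1.0},
]


def toy_segment_score(s: int, t: int, y: str) -> float:
    # Toy segmental feature: sum of per-frame scores inside the segment.
    return sum(frame_scores[i][y] for i in range(s, t))


def toy_transition_score(p: str, y: str) -> float:
    # Toy stand-in for letter-sequence statistics: flat penalty per boundary.
    return -0.5


score, segments = segmental_viterbi(5, ["a", "b"], toy_segment_score, toy_transition_score, max_len=3)
hypothesis = "".join(letter for _, _, letter in segments)
```

In this toy setup, the segment score is just a sum of frame-level scores, so the sketch does not show the main benefit of a semi-Markov model, which is that feature functions may depend on a whole segment at once (for example, its duration or motion profile). Replacing `toy_segment_score` with such a function requires no change to the decoder.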

Related Material

@InProceedings{Kim_2013_ICCV,
author = {Kim, Taehwan and Shakhnarovich, Greg and Livescu, Karen},
title = {Fingerspelling Recognition with Semi-Markov Conditional Random Fields},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013},
pages = {1521-1528}
}