Using Language-Aligned Gesture Embeddings for Understanding Gestures Accompanying Math Terms

Tristan Maidment, Purav J Patel, Erin Walker, Adriana Kovashka; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2227-2237

Abstract


In this paper, we introduce an approach for recognizing and classifying gestures that accompany mathematical terms in a new collection we name the "GAMT" dataset. Our method uses language to provide context for classifying gestures. Specifically, we use a CLIP-style framework to construct a shared embedding space for gestures and language, experimenting with various methods for encoding gestures within this space. We evaluate our method on our new dataset, which contains a wide array of gestures associated with mathematical terms. The shared embedding space leads to a substantial improvement in gesture classification. Furthermore, we identify an efficient model that excels at classifying gestures from our dataset, thus contributing to the further development of gesture recognition in diverse interaction scenarios.
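
To make the idea of a CLIP-style shared embedding space concrete, the following is a minimal sketch of a symmetric contrastive objective and of nearest-text classification. It is not the authors' implementation: the function names, the temperature value of 0.07, and the assumption that gesture and text encoders already produce non-zero fixed-length vectors are all illustrative choices, since the abstract does not specify them.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def clip_style_loss(gesture_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of gesture_embs is assumed to be paired with row i of text_embs.
    The temperature is a hypothetical default, not a value from the paper.
    """
    g = [l2_normalize(v) for v in gesture_embs]
    t = [l2_normalize(v) for v in text_embs]
    n = len(g)
    # Cosine similarities scaled by the temperature.
    logits = [[dot(g[i], t[j]) / temperature for j in range(n)] for i in range(n)]
    # Gesture-to-text direction: each row should peak on its diagonal entry.
    loss_g2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-gesture direction: same computation over the columns.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2g = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_g2t + loss_t2g)


def classify(gesture_emb, class_text_embs):
    """Return the index of the class text embedding most similar to the gesture."""
    g = l2_normalize(gesture_emb)
    sims = [dot(g, l2_normalize(t)) for t in class_text_embs]
    return max(range(len(sims)), key=sims.__getitem__)
```

Under this sketch, a gesture is classified by embedding the text of each candidate math term and picking the term whose embedding lies closest to the gesture embedding, which is how language can supply context for classification in a shared space.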

Related Material


[bibtex]
@InProceedings{Maidment_2024_CVPR,
    author    = {Maidment, Tristan and Patel, Purav J and Walker, Erin and Kovashka, Adriana},
    title     = {Using Language-Aligned Gesture Embeddings for Understanding Gestures Accompanying Math Terms},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2227-2237}
}