Similarity Measure between Two Gestures Using Triplets

Ravikiran Krishnan, Sudeep Sarkar; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2013, pp. 506-513


One of the dominant approaches to gesture recognition, especially when we have one or few samples per class, is to compute the time-warped distance between the two sequences and perform nearest-neighbor classification. In this work, we show that we get much better results if we instead consider the similarity of the patterns of frame-wise distances of these two sequences with a third (anchor) sequence from the modelbase. We refer to these distance pattern vectors as the warp vectors. If these warp vectors are similar, then so are the sequences; if not, they are dissimilar. At the algorithmic core we have two dynamic time warping processes, one to compute the warp vectors with the anchor sequences and the other to compare these warp vectors. We select the anchor sequence to be the one that minimizes the overall distance, i.e., the sequence with respect to which these two sequences are the most similar. We present results on a large dataset of 1500 RGBD sequences spanning 150 gesture classes, such as traffic signals, sign language, and everyday actions, extracted from the ChaLearn Gesture Challenge dataset. We experimented with three different feature types: difference of frames, HOG, and relational distributions. We found improvements of 5%, 15%, and 7%, respectively, at a 20% false alarm rate, over the traditional two-sequence-based time-warped distance.
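
The triplet idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation; several details are assumptions made only for illustration: the warp vector is taken to be the sequence of frame-wise distances along the optimal DTW path between a sequence and the anchor, frames are compared with Euclidean distance, warp vectors are compared with absolute difference, and no length normalization is applied. The function names `dtw`, `warp_vector`, and `triplet_distance` are hypothetical.

```python
import numpy as np

def dtw(x, y, dist):
    """Classic dynamic time warping.
    Returns (total cost, local distances along the optimal warp path)."""
    n, m = len(x), len(y)
    local = np.array([[dist(x[i], y[j]) for j in range(m)] for i in range(n)],
                     dtype=float)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # Backtrack from the end to recover the frame-wise costs on the path.
    i, j = n, m
    path_costs = []
    while i > 0 and j > 0:
        path_costs.append(local[i - 1, j - 1])
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return acc[n, m], np.array(path_costs[::-1])

def frame_dist(u, v):
    # Assumed frame-level distance: Euclidean between feature vectors.
    return float(np.linalg.norm(np.asarray(u) - np.asarray(v)))

def warp_vector(seq, anchor):
    # First DTW: frame-wise distance pattern of `seq` against the anchor.
    return dtw(seq, anchor, frame_dist)[1]

def triplet_distance(a, b, anchors):
    """Second DTW compares the two warp vectors; the anchor giving the
    smallest distance is selected. Returns (distance, anchor index)."""
    best = (np.inf, -1)
    for k, c in enumerate(anchors):
        wa, wb = warp_vector(a, c), warp_vector(b, c)
        d, _ = dtw(wa, wb, lambda p, q: abs(p - q))
        if d < best[0]:
            best = (d, k)
    return best

# Toy data: sequences are (frames x feature_dim) arrays of random features.
rng = np.random.default_rng(0)
a = rng.normal(size=(12, 4))
b = rng.normal(size=(15, 4))
anchors = [rng.normal(size=(10, 4)), rng.normal(size=(14, 4))]
distance, anchor_index = triplet_distance(a, b, anchors)
```

In a nearest-neighbor setting, `triplet_distance` would stand in for the direct two-sequence DTW cost `dtw(a, b, frame_dist)[0]`, with the anchors drawn from the modelbase.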

Related Material

author = {Krishnan, Ravikiran and Sarkar, Sudeep},
title = {Similarity Measure between Two Gestures Using Triplets},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2013}