Spatio-temporal Saliency for Action Similarity

G.J. Burghouts, S.P. van den Broek, R.J.M. ten Hove; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2013, pp. 257-262


Human actions are spatio-temporal patterns. A popular representation describes the action by features at interest points. Because interest point detection and feature description are generic processes, they are not tuned to discriminate one particular action from another. In this paper we propose a saliency measure for each individual feature to improve its distinctiveness for a particular action. We propose a spatio-temporal saliency map, for a bag of features, that is specific to the current video and to the action of interest. The novelty is that the saliency map is derived directly from the SVM's support vectors. For the retrieval of 48 human actions from a database of 3,480 videos, we demonstrate a systematic improvement of 35.3% on average across all actions, and significant improvements for 25 actions. The improvements are achieved in particular for complex human actions such as giving, receiving, burying and replacing an item.
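To make the idea of deriving feature saliency from support vectors concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the kernel or the exact weighting, so the sketch assumes a linear kernel over bag-of-features histograms, in which case each codeword k receives the weight w_k = sum_i c_i * s_i[k], where s_i are the support vectors and c_i their signed dual coefficients. The function names (`word_weights`, `feature_saliency`, `saliency_map`), the clipping of negative weights to zero, and the (x, y, t) point layout are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, t) of an interest point (assumed layout)


def word_weights(support_vectors: Sequence[Sequence[float]],
                 dual_coefs: Sequence[float]) -> List[float]:
    """Per-codeword weight w_k = sum_i c_i * s_i[k] (linear-kernel assumption)."""
    n_words = len(support_vectors[0])
    weights = [0.0] * n_words
    for sv, coef in zip(support_vectors, dual_coefs):
        for k in range(n_words):
            weights[k] += coef * sv[k]
    return weights


def feature_saliency(assignments: Sequence[int],
                     weights: Sequence[float]) -> List[float]:
    """Saliency of each feature: the weight of its codeword, clipped at zero.

    Clipping is an illustrative choice: codewords that argue against the
    action of interest are treated as non-salient.
    """
    return [max(weights[k], 0.0) for k in assignments]


def saliency_map(points: Sequence[Point],
                 assignments: Sequence[int],
                 weights: Sequence[float]) -> List[Tuple[Point, float]]:
    """Attach a saliency value to each interest point of one video."""
    return list(zip(points, feature_saliency(assignments, weights)))


# Toy example: two support vectors over a three-word codebook.
support_vectors = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
dual_coefs = [0.5, -1.0]  # signed coefficients (alpha_i * y_i)
weights = word_weights(support_vectors, dual_coefs)

points = [(10.0, 20.0, 1.0), (12.0, 22.0, 2.0), (30.0, 5.0, 2.0), (11.0, 21.0, 3.0)]
assignments = [0, 1, 2, 0]  # codeword index of each feature
video_map = saliency_map(points, assignments, weights)
```

Under these assumptions the map is specific to both the action (through the classifier's support vectors) and the video (through the locations and codeword assignments of its own features). With a non-linear kernel the per-codeword contribution would have to be computed differently.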

Related Material

author = {Burghouts, G.J. and van den Broek, S.P. and ten Hove, R.J.M.},
title = {Spatio-temporal Saliency for Action Similarity},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2013}