ACTIVE: Activity Concept Transitions in Video Event Classification

Chen Sun, Ram Nevatia; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 913-920

Abstract


The goal of high-level event classification from videos is to assign a single, high-level event label to each query video. Traditional approaches represent each video as a set of low-level features and encode them into a fixed-length feature vector (e.g. Bag-of-Words), which leaves a large gap between low-level visual features and high-level events. Our paper addresses this problem by exploiting activity concept transitions in video events (ACTIVE). A video is treated as a sequence of short clips, all of which are observations corresponding to latent activity concept variables in a Hidden Markov Model (HMM). We propose to apply Fisher Kernel techniques so that the concept transitions over time can be encoded into a compact, fixed-length feature vector very efficiently. Our approach can utilize concept annotations from independent datasets and works well even with a very small number of training samples. Experiments on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset show that our approach performs favorably against the state-of-the-art.
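
The encoding step the abstract describes, turning per-clip concept scores plus an HMM into a fixed-length Fisher-kernel feature, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name active_fisher_vector, the softmax parameterization of the transition matrix, and the per-video length normalization are assumptions, and the per-clip concept log-scores would in practice come from classifiers trained on independent concept-annotated datasets.

import numpy as np
from scipy.special import logsumexp

def active_fisher_vector(log_emit, A, pi):
    """Fisher-score feature for one video: the gradient of the HMM
    log-likelihood w.r.t. the transition logits (softmax parameterization,
    an assumption of this sketch).
    log_emit : (T, K) log p(clip_t | concept k) from pretrained concept
               classifiers (hypothetical input; any calibrated scores work).
    A        : (K, K) concept transition matrix, rows summing to 1.
    pi       : (K,) initial concept distribution.
    Returns a length K*K descriptor, fixed regardless of video length T."""
    T, K = log_emit.shape
    log_A, log_pi = np.log(A), np.log(pi)

    # Forward pass: log_alpha[t, k] = log p(clips_{1..t}, state_t = k)
    log_alpha = np.empty((T, K))
    log_alpha[0] = log_pi + log_emit[0]
    for t in range(1, T):
        log_alpha[t] = log_emit[t] + logsumexp(
            log_alpha[t - 1][:, None] + log_A, axis=0)

    # Backward pass: log_beta[t, k] = log p(clips_{t+1..T} | state_t = k)
    log_beta = np.zeros((T, K))
    for t in range(T - 2, -1, -1):
        log_beta[t] = logsumexp(
            log_A + (log_emit[t + 1] + log_beta[t + 1])[None, :], axis=1)

    log_Z = logsumexp(log_alpha[-1])  # log-likelihood of the whole video

    # Posterior state marginals gamma and transition marginals xi
    gamma = np.exp(log_alpha + log_beta - log_Z)               # (T, K)
    log_xi = (log_alpha[:-1, :, None] + log_A[None]
              + (log_emit[1:] + log_beta[1:])[:, None, :] - log_Z)
    xi = np.exp(log_xi)                                        # (T-1, K, K)

    # Fisher score for transition logits: expected minus model counts
    grad = xi.sum(0) - gamma[:-1].sum(0)[:, None] * A          # (K, K)
    return grad.ravel() / max(T - 1, 1)  # normalize by sequence length

# Toy usage: random scores for a 12-clip video with K = 5 concepts.
rng = np.random.default_rng(0)
K, T = 5, 12
A = rng.dirichlet(np.ones(K), size=K)
pi = np.full(K, 1.0 / K)
fv = active_fisher_vector(np.log(rng.dirichlet(np.ones(K), size=T)), A, pi)
print(fv.shape)  # (25,) -- fixed length regardless of T

Because the descriptor is a gradient of the sequence log-likelihood, videos of different lengths map to vectors of the same dimension, which is what allows a standard fixed-length classifier (e.g. a linear SVM) to be trained on top.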

Related Material


[pdf]
[bibtex]
@InProceedings{Sun_2013_ICCV,
author = {Sun, Chen and Nevatia, Ram},
title = {ACTIVE: Activity Concept Transitions in Video Event Classification},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013},
pages = {913-920}
}