DISCOVER: Discovering Important Segments for Classification of Video Events and Recounting

Chen Sun, Ram Nevatia; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2569-2576

Abstract


We propose DISCOVER, a unified framework that simultaneously discovers important segments, classifies high-level events, and generates recounting for large amounts of unconstrained web videos. It is motivated by our observation that many video events are characterized by certain important segments. Our goal is to find these important segments and capture their information for event classification and recounting. We introduce an evidence localization model in which evidence locations are modeled as latent variables. We impose constraints on global video appearance, local evidence appearance, and the temporal structure of the evidence. The model is learned via a max-margin framework and allows efficient inference. Our method does not require annotating sources of evidence and is jointly optimized for event classification and recounting. Experimental results are shown on the challenging TRECVID 2013 MEDTest dataset.
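The abstract describes a latent-variable model scored from three kinds of terms: global video appearance, local evidence appearance, and temporal structure over the evidence. The sketch below is only an illustration of that general idea, assuming a linear score of the form f(x, h; w) = w_g · phi_global(x) + sum_k w_l · phi_seg(x, h_k) + w_t · psi_temporal(h), where h are latent indices of evidence segments. The feature extractors, weights, and the number of evidence slots K are placeholders for illustration, not the authors' implementation.

# Minimal sketch of latent-evidence scoring and inference (illustrative, not the paper's exact model).
import itertools
import numpy as np

def score_hypothesis(w_global, w_local, w_temporal, phi_global, phi_segments, h):
    """Score one latent assignment h (a tuple of segment indices in temporal order)."""
    s = w_global @ phi_global                        # global video appearance term
    s += sum(w_local @ phi_segments[i] for i in h)   # local evidence appearance terms
    gaps = np.diff(h)                                # temporal structure: gaps between evidence
    s += w_temporal @ gaps
    return s

def infer_evidence(w_global, w_local, w_temporal, phi_global, phi_segments, K):
    """Search ordered K-tuples of segments for the highest-scoring latent assignment.
    (A chain temporal model admits dynamic programming; brute force keeps the sketch short.)"""
    n = len(phi_segments)
    best_h, best_s = None, -np.inf
    for h in itertools.combinations(range(n), K):
        s = score_hypothesis(w_global, w_local, w_temporal, phi_global, phi_segments, h)
        if s > best_s:
            best_h, best_s = h, s
    return best_h, best_s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n, K = 16, 12, 3                       # feature dim, number of segments, evidence slots
    phi_global = rng.normal(size=d)           # whole-video descriptor (placeholder features)
    phi_segments = rng.normal(size=(n, d))    # per-segment descriptors (placeholder features)
    w_global, w_local = rng.normal(size=d), rng.normal(size=d)
    w_temporal = rng.normal(size=K - 1)       # weights on gaps between consecutive evidence
    h_star, s_star = infer_evidence(w_global, w_local, w_temporal, phi_global, phi_segments, K)
    print("best evidence segments:", h_star, "score:", float(s_star))

In a max-margin setting, such a score would be maximized over h at inference time and used inside a latent structural learning objective; the inferred h_star is what would support recounting, since it names the segments treated as evidence for the predicted event.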

Related Material


[pdf]
[bibtex]
@InProceedings{Sun_2014_CVPR,
author = {Sun, Chen and Nevatia, Ram},
title = {DISCOVER: Discovering Important Segments for Classification of Video Events and Recounting},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2014}
}