Video Summarization via Multi-View Representative Selection

Jingjing Meng, Suchen Wang, Hongxing Wang, Junsong Yuan, Yap-Peng Tan; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 1189-1198

Abstract


Video content is inherently heterogeneous. To exploit different feature modalities in a diverse video collection for video summarization, we formulate the task as a multi-view representative selection problem. The goal is to select visual elements that are representative of a video consistently across different views (i.e., feature modalities). We present multi-view sparse dictionary selection with centroid co-regularization (MSDS-CC), which optimizes the representative selection in each view and enforces that the view-specific selections be similar by regularizing them towards a consensus. The resulting problem can be solved efficiently by alternating minimization with the fast iterative shrinkage-thresholding algorithm (FISTA). MSDS-CC can also be applied to category-specific summarization by incorporating visual co-occurrence priors. Experiments on benchmark datasets validate its effectiveness in comparison with other video summarization and representative selection methods.
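The abstract does not state the objective function, so the following is only a minimal sketch of the general idea under assumed details, not the authors' formulation. It assumes each view v has a feature matrix X_v (columns are visual elements) and a selection matrix S_v that approximately minimizes 0.5*||X_v - X_v S_v||_F^2 + lam*||S_v||_{2,1} + 0.5*gamma*||S_v - C||_F^2, where C is a consensus matrix taken as the mean of the S_v. The function names, the parameters lam and gamma, and the final top-k ranking by row norm are all hypothetical choices made for illustration.

```python
import numpy as np

def row_soft_threshold(S, tau):
    """Proximal operator of tau * sum_i ||S[i, :]||_2 (row-wise group shrinkage)."""
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * S

def fista_view(X, C, S0, lam, gamma, n_iter=50):
    """FISTA on one view for the ASSUMED objective
    0.5*||X - X S||_F^2 + 0.5*gamma*||S - C||_F^2 + lam*||S||_{2,1}."""
    G = X.T @ X                          # n x n Gram matrix of the elements
    L = np.linalg.norm(G, 2) + gamma     # Lipschitz constant of the smooth part
    S, Y, t = S0.copy(), S0.copy(), 1.0
    for _ in range(n_iter):
        grad = G @ Y - G + gamma * (Y - C)           # gradient of the smooth terms at Y
        S_next = row_soft_threshold(Y - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = S_next + ((t - 1.0) / t_next) * (S_next - S)   # momentum step
        S, t = S_next, t_next
    return S

def select_representatives(views, k, lam=1.0, gamma=1.0, n_outer=10):
    """Alternate between per-view FISTA updates and a consensus (centroid) update.

    views: list of (d_v x n) arrays, one per feature modality, same n in each.
    Returns the indices of the k highest-scoring elements and all n scores.
    """
    n = views[0].shape[1]
    C = np.zeros((n, n))
    S = [np.zeros((n, n)) for _ in views]
    for _ in range(n_outer):
        S = [fista_view(X, C, S_v, lam, gamma) for X, S_v in zip(views, S)]
        C = np.mean(S, axis=0)           # minimizes sum_v ||S_v - C||_F^2 over C
    scores = np.linalg.norm(C, axis=1)   # row norm = how much an element is reused
    return np.argsort(-scores)[:k], scores
```

In this sketch, an element whose row in the consensus matrix has a large norm is used to reconstruct many other elements in every view, which is one plausible reading of "representative consistently across views". Calling `select_representatives([X_color, X_motion], k=5)` with two hypothetical feature matrices over the same n frames would return five frame indices.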

Related Material


[pdf]
[bibtex]
@InProceedings{Meng_2017_ICCV,
author = {Meng, Jingjing and Wang, Suchen and Wang, Hongxing and Yuan, Junsong and Tan, Yap-Peng},
title = {Video Summarization via Multi-View Representative Selection},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}