NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation

Wu, Haoqian; Chen, Keyu; Liu, Haozhe; Zhuge, Mingchen; Li, Bing; Qiao, Ruizhi; Shu, Xiujun; Gan, Bei; Xu, Liangsheng; Ren, Bo; Xu, Mengmeng; Zhang, Wentian; Ramachandra, Raghavendra; Lin, Chia-Wen; Ghanem, Bernard

Haoqian Wu, Keyu Chen, Haozhe Liu, Mingchen Zhuge, Bing Li, Ruizhi Qiao, Xiujun Shu, Bei Gan, Liangsheng Xu, Bo Ren, Mengmeng Xu, Wentian Zhang, Raghavendra Ramachandra, Chia-Wen Lin, Bernard Ghanem; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 10669-10680

Abstract

Temporal video segmentation is a go-to first step in automatic video analysis: it decomposes a long-form video into smaller components for follow-up understanding tasks. Recent works have studied several levels of granularity at which to segment a video, such as shot, event, and scene. These segmentations help compare semantics at the corresponding scales, but they lack a wider view of larger temporal spans, especially when the video is complex and structured. We therefore present two more abstract levels of temporal segmentation and study their hierarchical relationship to the existing fine-grained levels. Accordingly, we collect NewsNet, the largest news video dataset, consisting of 1,000 videos totaling over 900 hours and associated with several tasks for hierarchical temporal video segmentation. Each news video is a collection of stories on different topics, represented as aligned audio, visual, and textual data, along with extensive frame-wise annotations at four granularities. We argue that the study of NewsNet can advance the understanding of complex structured video and benefit further areas such as short-video creation, personalized advertisement, digital instruction, and education. Our dataset and code are publicly available at: https://github.com/NewsNet-Benchmark/NewsNet.
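As an illustration of what frame-wise annotations at several nested granularities can look like, the following is a minimal sketch under stated assumptions. The abstract does not describe the NewsNet annotation format, so the boundary lists, the frame count, and all function names here are hypothetical. The sketch assumes each level is stored as a sorted list of boundary frame indices and that the levels are strictly nested, so every boundary at a coarser level is also a boundary at the next finer level.

```python
from bisect import bisect_right


def boundaries_to_segments(boundaries, num_frames):
    """Turn sorted boundary frame indices into half-open (start, end) segments."""
    cuts = [0] + [b for b in boundaries if 0 < b < num_frames] + [num_frames]
    return list(zip(cuts[:-1], cuts[1:]))


def is_hierarchical(levels):
    """True if each coarser level's boundaries are a subset of the next finer level's.

    `levels` is ordered from finest to coarsest.
    """
    return all(set(coarse) <= set(fine) for fine, coarse in zip(levels, levels[1:]))


def parent_index(frame, coarse_boundaries):
    """Index of the coarse segment that contains `frame`."""
    return bisect_right(coarse_boundaries, frame)


# Hypothetical toy annotation (not taken from NewsNet): boundary frames at
# four granularities, ordered from finest to coarsest.
levels = [
    [30, 60, 90, 120, 150],  # finest level
    [60, 90, 150],
    [90, 150],
    [90],                    # coarsest level
]
num_frames = 180

if __name__ == "__main__":
    print(is_hierarchical(levels))
    for level in levels:
        print(boundaries_to_segments(level, num_frames))
```

Storing boundaries rather than per-frame labels keeps the nesting check to a simple subset test. Whether real annotations satisfy strict nesting is an assumption of this sketch, not a claim about the dataset.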

Related Material

[bibtex]

@InProceedings{Wu_2023_CVPR,
    author    = {Wu, Haoqian and Chen, Keyu and Liu, Haozhe and Zhuge, Mingchen and Li, Bing and Qiao, Ruizhi and Shu, Xiujun and Gan, Bei and Xu, Liangsheng and Ren, Bo and Xu, Mengmeng and Zhang, Wentian and Ramachandra, Raghavendra and Lin, Chia-Wen and Ghanem, Bernard},
    title     = {NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {10669-10680}
}