The 2nd YouTube-8M Large-Scale Video Understanding Challenge

Lee, Joonseok; Natsev, Apostol (Paul); Reade, Walter; Sukthankar, Rahul; Toderici, George

Joonseok Lee, Apostol (Paul) Natsev, Walter Reade, Rahul Sukthankar, George Toderici; Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018

Abstract

We hosted the 2nd YouTube-8M Large-Scale Video Understanding Kaggle Challenge and Workshop at ECCV’18, with the task of classifying videos from frame-level and video-level audio-visual features. In this year’s challenge, we restricted the final model size to 1GB or less, encouraging participants to explore representation learning or better architectures instead of heavy ensembles of multiple models. In this paper, we briefly introduce the YouTube-8M dataset and challenge task, followed by participant statistics and an analysis of the results. We summarize the ideas proposed by participants, including architectures, temporal aggregation methods, ensembling and distillation, data augmentation, and more.

Related Material

[bibtex]

@InProceedings{Lee_2018_ECCV_Workshops,
author = {Lee, Joonseok and Natsev, Apostol (Paul) and Reade, Walter and Sukthankar, Rahul and Toderici, George},
title = {The 2nd YouTube-8M Large-Scale Video Understanding Challenge},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV) Workshops},
month = {September},
year = {2018}
}