VideoSSL: Semi-Supervised Learning for Video Classification

Longlong Jing, Toufiq Parag, Zhe Wu, Yingli Tian, Hongcheng Wang; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 1110-1119


We propose VideoSSL, a semi-supervised learning approach for video classification using convolutional neural networks (CNNs). As in other computer vision tasks, existing supervised video classification methods demand a large amount of labeled data to attain good performance. However, annotating a large dataset is expensive and time-consuming. To reduce the dependence on a large annotated dataset, our proposed semi-supervised method trains from a small number of labeled examples and exploits two regulatory signals from unlabeled data. The first signal is the pseudo-labels of unlabeled examples, computed from the confidences of the CNN being trained. The second is the normalized probabilities predicted by an image classifier CNN, which capture information about the appearance of the objects of interest in the video. We show that, under the supervision of these guiding signals from unlabeled examples, a video classification CNN can achieve impressive performance using a small fraction of annotated examples on three publicly available datasets: UCF101, HMDB51, and Kinetics.
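The way the two unlabeled-data signals could combine with the supervised loss can be sketched as follows. This is a minimal illustration and not the authors' implementation: the confidence threshold, the loss weights `w_pseudo` and `w_appearance`, the use of cross-entropy for every term, and the idea of a separate appearance output on the video network are all assumptions, since the abstract does not specify them. All function names are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into normalized probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))

def one_hot(index, num_classes):
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def pseudo_label(pred_probs, threshold):
    """Return the predicted class if the network is confident, else None.
    The thresholding rule is an assumption, not taken from the abstract."""
    confidence = max(pred_probs)
    return pred_probs.index(confidence) if confidence >= threshold else None

def semi_supervised_loss(labeled, unlabeled, threshold=0.95,
                         w_pseudo=1.0, w_appearance=1.0):
    """labeled:   list of (class_logits, class_index)
    unlabeled: list of (class_logits, appearance_logits, image_cnn_logits)
    appearance_logits come from a hypothetical second output of the video CNN;
    image_cnn_logits come from the image classifier CNN."""
    supervised = 0.0
    for class_logits, y in labeled:
        probs = softmax(class_logits)
        supervised += cross_entropy(one_hot(y, len(probs)), probs)
    supervised /= max(len(labeled), 1)

    pseudo = 0.0
    appearance = 0.0
    for class_logits, appearance_logits, image_cnn_logits in unlabeled:
        # Signal 1: pseudo-label from the confidences of the CNN being trained.
        probs = softmax(class_logits)
        label = pseudo_label(probs, threshold)
        if label is not None:
            pseudo += cross_entropy(one_hot(label, len(probs)), probs)
        # Signal 2: match the normalized probabilities of the image classifier.
        appearance += cross_entropy(softmax(image_cnn_logits),
                                    softmax(appearance_logits))
    pseudo /= max(len(unlabeled), 1)
    appearance /= max(len(unlabeled), 1)

    total = supervised + w_pseudo * pseudo + w_appearance * appearance
    return {"supervised": supervised, "pseudo": pseudo,
            "appearance": appearance, "total": total}
```

In this sketch an unlabeled clip contributes to the pseudo-label term only when the network's top probability reaches the threshold, while the appearance term applies to every unlabeled clip regardless of confidence.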

Related Material

@InProceedings{Jing_2021_WACV,
    author    = {Jing, Longlong and Parag, Toufiq and Wu, Zhe and Tian, Yingli and Wang, Hongcheng},
    title     = {VideoSSL: Semi-Supervised Learning for Video Classification},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2021},
    pages     = {1110-1119}
}