Learning Where To Cut From Edited Videos

Yuzhong Huang, Xue Bai, Oliver Wang, Fabian Caba, Aseem Agarwala; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 3215-3223


In this work, we propose a new approach for accelerating the video editing process by identifying good moments in time to cut unedited videos. We first validate with a user study that there is indeed a consensus among human viewers about good and bad cut moments, and then formulate the problem as a classification task. To train for this task, we propose a self-supervised scheme that requires only pre-existing edited videos, which are readily available in large and diverse quantities. We then propose a contrastive learning framework to train a 3D ResNet model to predict good regions to cut. We validate our method with a second user study, which indicates that clips generated by our model are preferred over a number of baselines.
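
As a rough illustration of how edited videos alone might supply training labels, the sketch below splits a video timeline into fixed-length windows and marks a window as positive when an existing cut falls inside it. This is only one plausible reading of the self-supervised scheme, not the authors' procedure: the function name `label_windows`, the one-second window length, and the rule that treats any window containing a cut as a positive example are all assumptions made for illustration.

```python
from typing import List, Tuple


def label_windows(
    duration: float,
    cut_times: List[float],
    window: float = 1.0,
) -> List[Tuple[float, float, int]]:
    """Label fixed-length windows of an edited video (hypothetical helper).

    A window [start, end) gets label 1 if at least one known cut time
    lies inside it, and 0 otherwise. The labeling rule is an assumption
    for illustration, not the rule used in the paper.
    """
    labels = []
    num_windows = int(duration // window)  # drop any trailing partial window
    for i in range(num_windows):
        start = i * window
        end = start + window
        has_cut = any(start <= t < end for t in cut_times)
        labels.append((start, end, 1 if has_cut else 0))
    return labels


if __name__ == "__main__":
    # A 5-second edited video with cuts at 1.5 s and 4.0 s.
    for start, end, label in label_windows(5.0, [1.5, 4.0]):
        print(f"[{start:.1f}, {end:.1f}) -> {label}")
```

Windows labeled this way could then serve as positive and negative examples for a classifier; how the paper actually samples such examples for its contrastive framework is not specified in the abstract.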

Related Material

@InProceedings{Huang_2021_ICCV,
    author    = {Huang, Yuzhong and Bai, Xue and Wang, Oliver and Caba, Fabian and Agarwala, Aseem},
    title     = {Learning Where To Cut From Edited Videos},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {3215-3223}
}