Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Wang, Ziyu; Xu, Yue; Lu, Cewu; Li, Yong-Lu

Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 6296-6304

Abstract

Recently, dataset distillation has paved the way towards efficient machine learning, especially for image datasets. However, the distillation of videos, characterized by an exclusive temporal dimension, remains an underexplored domain. In this work, we provide the first systematic study of video distillation and introduce a taxonomy to categorize temporal compression. Our investigation reveals that the temporal information is usually not well learned during distillation, and the temporal dimension of synthetic data contributes little. These observations motivate our unified framework of disentangling the dynamic and static information in the videos. It first distills the videos into still images as static memory and then compensates for the dynamic and motion information with a learnable dynamic memory block. Our method achieves state-of-the-art results on video datasets at different scales, with a notably smaller memory storage budget. Our code is available at https://github.com/yuz1wan/video_distillation.

Related Material

[bibtex]

@InProceedings{Wang_2024_CVPR,
    author    = {Wang, Ziyu and Xu, Yue and Lu, Cewu and Li, Yong-Lu},
    title     = {Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {6296-6304}
}