Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models

Jan Warchocki, Teodor Oprescu, Yunhan Wang, Alexandru Dămăcuş, Paul Misterka, Robert-Jan Bruintjes, Attila Lengyel, Ombretta Strafforello, Jan van Gemert; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2023, pp. 3008-3016

Abstract


In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end. Training and testing current state-of-the-art deep learning models requires access to large amounts of data and computational power. However, gathering such data is challenging and computational resources might be limited. This work explores and measures how current deep temporal action localization models perform in settings constrained by the amount of data or computational power. We measure data efficiency by training each model on a subset of the training set. We find that TemporalMaxer outperforms other models in data-limited settings. Furthermore, we recommend TriDet when training time is limited. To test the efficiency of the models during inference, we pass videos of different lengths through each model. We find that TemporalMaxer requires the least computational resources, likely due to its simple architecture.
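The two measurements described in the abstract (training on a subset of the training set, and running inputs of different lengths through a model) can be illustrated with a minimal sketch. This is not the authors' benchmark code: the function names, the use of a fixed random seed, the subset fraction, and the best-of-N wall-clock timing are all assumptions made for illustration, and `model_fn` stands in for any callable model.

```python
import random
import time


def sample_training_subset(video_ids, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the ids.

    Hypothetical helper: the sampling scheme is an assumption, not the
    protocol used in the paper.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ids = list(video_ids)
    rng = random.Random(seed)  # fixed seed so every model sees the same subset
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))


def time_inference(model_fn, lengths, make_input, repeats=3):
    """Return {length: best wall-clock seconds} for `model_fn` per input length.

    `make_input(n)` builds a dummy input of length n (an assumption; a real
    benchmark would feed extracted video features).
    """
    timings = {}
    for n in lengths:
        x = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            model_fn(x)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings
```

For example, `sample_training_subset(range(100), 0.1)` selects 10 of 100 video ids, and `time_inference(sum, [10, 100], lambda n: [0.0] * n)` times a stand-in "model" on two input lengths. Measuring GPU models in practice would additionally require device synchronization and memory tracking, which this sketch omits.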

Related Material


[bibtex]
@InProceedings{Warchocki_2023_ICCV,
    author    = {Warchocki, Jan and Oprescu, Teodor and Wang, Yunhan and D\u{a}m\u{a}cu\c{s}, Alexandru and Misterka, Paul and Bruintjes, Robert-Jan and Lengyel, Attila and Strafforello, Ombretta and van Gemert, Jan},
    title     = {Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2023},
    pages     = {3008-3016}
}