@InProceedings{Dinai_2024_ACCV,
  author    = {Dinai, Yonatan and Raviv, Avraham and Harel, Nimrod and Kim, Donghoon and Goldin, Ishay and Zehngut, Niv},
  title     = {TAPS: Temporal Attention-based Pruning and Scaling for Efficient Video Action Recognition},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {3803-3818}
}
TAPS: Temporal Attention-based Pruning and Scaling for Efficient Video Action Recognition
Abstract
Video neural networks are computationally expensive. For real-time applications they require significant compute resources that are lacking on edge devices. Various methods have been proposed to reduce the computational load of neural networks. Among them, dynamic approaches adapt the network architecture, its weights, or the input resolution to the content of the input. Our proposed approach, showcased on the task of video action recognition, dynamically reduces computation for a wide range of video processing networks by exploiting the redundancy between frames and channels. A lightweight per-layer policy network makes a per-filter decision about each filter's importance: important filters are retained, while others are scaled down or skipped entirely. Our method is the first to give the policy network a broader temporal context by considering features aggregated over time. Temporal aggregation is performed using self-attention between present, past, and future (if available) input tensor descriptors. As demonstrated on a large variety of leading benchmarks such as Something-Something-V2, Mini-Kinetics, Jester and ActivityNet1.3, and over multiple network architectures, our method enhances accuracy or saves up to 70% of the FLOPs with no accuracy degradation, outperforming existing dynamic pruning methods by a large margin and setting a new bar for the accuracy-efficiency trade-off achievable by dynamic methods. We release the code and trained models at: https://github.com/tapsdyn/TAPS.
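The mechanism described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function and weight names (`temporal_policy`, `w_q`, `w_k`, `w_v`, `w_head`), the choice of global-average-pooled channel descriptors, and the fixed logit thresholds for the keep/scale/skip decision are all illustrative assumptions; only the overall structure (self-attention across per-frame descriptors feeding a per-filter decision head) follows the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_policy(descriptors, w_q, w_k, w_v, w_head):
    """Illustrative per-filter policy with temporal self-attention.

    descriptors: (T, C) per-frame channel descriptors (e.g., obtained
    by global average pooling each frame's feature map).
    Returns a (T, C) decision map: 1.0 = keep the filter at full
    capacity, 0.5 = scale it down, 0.0 = skip it for that frame.
    """
    # Self-attention over time lets each frame's decision see
    # past and future frame descriptors.
    q, k, v = descriptors @ w_q, descriptors @ w_k, descriptors @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)  # (T, T)
    context = attn @ v                      # temporally aggregated features
    logits = context @ w_head               # (T, C) per-filter importance
    # Three-way decision per filter; thresholds here are arbitrary
    # placeholders (a trained model would learn this behavior).
    return np.where(logits > 0.5, 1.0,
                    np.where(logits > -0.5, 0.5, 0.0))
```

In a real dynamic-pruning setup the resulting decision map would multiply (or mask) the convolutional filters of the corresponding layer, so skipped filters cost no FLOPs at inference time.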