T-DEED: Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in Sports Videos

Artur Xarles, Sergio Escalera, Thomas B. Moeslund, Albert Clapés; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 3410-3419

Abstract


In this paper, we introduce T-DEED, a Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in sports videos. T-DEED addresses multiple challenges in the task, including the need for discriminability among frame representations, high output temporal resolution to maintain prediction precision, and the necessity to capture information at different temporal scales to handle events with varying dynamics. It tackles these challenges through its specifically designed architecture, featuring an encoder-decoder for leveraging multiple temporal scales and achieving high output temporal resolution, along with temporal modules designed to increase token discriminability. Leveraging these characteristics, T-DEED achieves state-of-the-art (SOTA) performance on the FigureSkating and FineDiving datasets. Code is available at https://github.com/arturxe2/T-DEED.

Related Material


[bibtex]
@InProceedings{Xarles_2024_CVPR,
    author    = {Xarles, Artur and Escalera, Sergio and Moeslund, Thomas B. and Clap\'es, Albert},
    title     = {T-DEED: Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in Sports Videos},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {3410-3419}
}