AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Tetsuya Sakai; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 21076-21085

Abstract


Evaluation measures have a crucial impact on the direction of research. Therefore, it is of utmost importance to develop appropriate and reliable evaluation measures for new applications where conventional measures are not well suited. Video Moment Retrieval (VMR) is one such application, and the current practice is to use R@K,\theta for evaluating VMR systems. However, this measure has two disadvantages. First, it is rank-insensitive: it ignores the rank positions of successfully localised moments in the top-K ranked list by treating the list as a set. Second, it binarises the Intersection over Union (IoU) of each retrieved video moment using the threshold \theta and thereby ignores the fine-grained localisation quality of ranked moments. We propose an alternative measure for evaluating VMR, called Average Max IoU (AxIoU), which is free from the above two problems. We show that AxIoU satisfies two important axioms for VMR evaluation, namely, Invariance against Redundant Moments and Monotonicity with respect to the Best Moment, and also that R@K,\theta satisfies the first axiom only. We also empirically examine how AxIoU agrees with R@K,\theta, as well as its stability with respect to changes in the test data and human-annotated temporal boundaries.
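The two disadvantages named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the definitions below are assumptions: R@K,\theta is taken to be 1 when any of the top-K moments reaches IoU >= \theta (some implementations use a strict inequality), and AxIoU@K is read from its name as the average, over ranks 1 to K, of the best IoU seen so far. The function names, the single ground-truth moment per query, and the (start, end) representation are hypothetical choices for illustration, not the authors' code.

```python
from typing import Sequence, Tuple

Moment = Tuple[float, float]  # (start, end) in seconds; illustrative representation


def temporal_iou(pred: Moment, gt: Moment) -> float:
    """Intersection over Union of two temporal segments."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_k_theta(ranked: Sequence[Moment], gt: Moment, k: int, theta: float) -> float:
    """Assumed R@K,theta: 1.0 if any top-K moment has IoU >= theta, else 0.0."""
    return float(any(temporal_iou(m, gt) >= theta for m in ranked[:k]))


def axiou_at_k(ranked: Sequence[Moment], gt: Moment, k: int) -> float:
    """Assumed AxIoU@K: mean over ranks 1..K of the best IoU among the top-k moments."""
    best = 0.0
    total = 0.0
    for rank in range(k):
        if rank < len(ranked):  # a short list keeps its best IoU for the remaining ranks
            best = max(best, temporal_iou(ranked[rank], gt))
        total += best
    return total / k


gt = (10.0, 20.0)
exact = (10.0, 20.0)  # IoU 1.0 with gt
half = (10.0, 15.0)   # IoU 0.5 with gt
miss = (30.0, 40.0)   # IoU 0.0 with gt

# Rank sensitivity: the same two moments in different orders.
print(recall_at_k_theta([exact, miss], gt, 2, 0.5), recall_at_k_theta([miss, exact], gt, 2, 0.5))
print(axiou_at_k([exact, miss], gt, 2), axiou_at_k([miss, exact], gt, 2))

# Localisation quality: both moments pass theta = 0.5, but only one is exact.
print(recall_at_k_theta([half], gt, 1, 0.5), recall_at_k_theta([exact], gt, 1, 0.5))
print(axiou_at_k([half], gt, 1), axiou_at_k([exact], gt, 1))
```

Under these assumed definitions, the thresholded recall is identical for both orderings and for both localisation qualities, while the averaged best-so-far IoU separates them.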

Related Material


@InProceedings{Togashi_2022_CVPR,
    author    = {Togashi, Riku and Otani, Mayu and Nakashima, Yuta and Rahtu, Esa and Heikkil\"a, Janne and Sakai, Tetsuya},
    title     = {AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {21076-21085}
}