Tracklet-based Explainable Video Anomaly Localization

Ashish Singh, Michael J. Jones, Erik G. Learned-Miller; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 3992-4001

Abstract


We take a scene-understanding approach to video anomaly localization (VAL) that leverages the rapid progress that has been made in training general deep networks for object detection, object recognition, and optical flow. Our method uses each detected object's short-term trajectory, appearance embedding, size, and location as its representation. These high-level attributes provide rich information about the object types and movements that are found in nominal video of a scene. By efficiently comparing the high-level attributes of test objects to those of normal objects, our method detects anomalous objects and anomalous movements. In addition, the human-understandable attributes used by our method can provide intuitive explanations for its decisions. We evaluate our method on many standard VAL datasets (UCSD Ped1/Ped2, CUHK Avenue, ShanghaiTech, and Street Scene) using spatio-temporal evaluation criteria and demonstrate new state-of-the-art accuracy.
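The abstract describes the core mechanism at a high level: each detected object is summarized by human-understandable attributes, and test objects are compared against attributes collected from nominal video of the same scene. The sketch below is not the authors' implementation; it illustrates one plausible form of such a comparison, a k-nearest-neighbor distance over concatenated attribute vectors. The class and function names, feature layout, and distance choice are assumptions made for illustration only.

```python
# Minimal sketch (not from the paper) of attribute-based anomaly scoring:
# each object is represented by its appearance embedding, short-term
# trajectory, size, and location; a test object is scored by its distance
# to the nearest exemplars gathered from nominal (anomaly-free) video.
import numpy as np


def attribute_vector(appearance_emb, trajectory, size, location):
    """Concatenate an object's high-level attributes into a single vector.

    All arguments and their layout are illustrative assumptions.
    """
    return np.concatenate([
        np.asarray(appearance_emb, dtype=np.float32),     # e.g., CNN embedding
        np.asarray(trajectory, dtype=np.float32).ravel(),  # short-term displacements
        np.asarray(size, dtype=np.float32),                # bounding-box (w, h)
        np.asarray(location, dtype=np.float32),            # box center (x, y)
    ])


class NominalExemplarModel:
    """Stores attribute vectors from nominal video and scores test objects
    by mean k-nearest-neighbor distance (higher score = more anomalous)."""

    def __init__(self, k=1):
        self.k = k
        self.exemplars = None  # (N, D) matrix of nominal attribute vectors

    def fit(self, nominal_vectors):
        self.exemplars = np.stack(nominal_vectors)

    def score(self, test_vector):
        # Euclidean distance to every nominal exemplar, then average the k smallest.
        d = np.linalg.norm(self.exemplars - test_vector[None, :], axis=1)
        return float(np.sort(d)[: self.k].mean())
```

Under this sketch, an object whose type, movement, size, or position differs from everything observed in nominal video receives a large score, and inspecting which attribute deviates most offers the kind of intuitive explanation the abstract refers to.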

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Singh_2024_CVPR,
    author    = {Singh, Ashish and Jones, Michael J. and Learned-Miller, Erik G.},
    title     = {Tracklet-based Explainable Video Anomaly Localization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {3992-4001}
}