Multi-View Action Recognition for Distracted Driver Behavior Localization

Yuehuan Xu, Shuai Jiang, Zhe Cui, Fei Su; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 7172-7179

Abstract


With the rapid development of computer vision, the detection and recognition of distracted driving behaviors has emerged as a new vision task, and it is regarded as a challenging temporal action localization (TAL) problem. The primary goal of temporal action localization is to determine the start and end times of actions in untrimmed videos. Currently, most state-of-the-art temporal localization methods adopt complex architectures that are cumbersome and time-consuming. In this paper, we propose a robust and efficient two-stage framework for distracted behavior classification and localization, based on a sliding-window approach, that is suitable for untrimmed naturalistic driving videos. To address the high similarity among different behaviors and the interference from background classes, we propose a multi-view fusion and adaptive thresholding algorithm that effectively reduces missed detections. To address fuzzy behavior boundary localization, we design a post-processing procedure that refines coarse localization into fine localization through post connection and candidate behavior merging criteria. On AICITY2024 Task3 TestA, our method achieves an Average Intersection over Union (AIOU) of 0.6080, ranking eighth in AICITY2024 Task3. Our code will be released in the near future.
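The abstract does not give implementation details, so the following is only a minimal illustrative sketch of how sliding-window scores from several camera views might be fused, thresholded, and merged into segments. Everything in it is an assumption rather than the authors' method: the function names, the weighted-average fusion, the `mean + k * std` threshold rule, the non-overlapping one-second windows, and the gap and length parameters are all hypothetical.

```python
from statistics import mean, pstdev

BACKGROUND = 0  # assumed index of the background (no distraction) class


def fuse_views(view_probs, weights):
    """Weighted average of per-window class probabilities across views.

    view_probs[v][w][c] is the probability of class c in window w, view v.
    """
    total = sum(weights)
    n_windows, n_classes = len(view_probs[0]), len(view_probs[0][0])
    return [
        [sum(wt * view[w][c] for wt, view in zip(weights, view_probs)) / total
         for c in range(n_classes)]
        for w in range(n_windows)
    ]


def label_windows(fused, floor=0.3, k=0.5):
    """Assign one label per window using a per-class, per-video threshold.

    Assumed rule: threshold_c = max(floor, mean_c + k * std_c).
    """
    n_classes = len(fused[0])
    thresholds = []
    for c in range(n_classes):
        scores = [row[c] for row in fused]
        thresholds.append(max(floor, mean(scores) + k * pstdev(scores)))
    labels = []
    for row in fused:
        # Best-scoring non-background class for this window.
        best = max(range(1, n_classes), key=lambda c: row[c])
        labels.append(best if row[best] >= thresholds[best] else BACKGROUND)
    return labels


def merge_segments(labels, window_sec=1.0, max_gap_sec=2.0, min_len_sec=2.0):
    """Convert window labels to (class, start, end) segments.

    Same-class candidates separated by a short gap are joined, and
    segments shorter than min_len_sec are dropped.
    """
    segments = []
    for i, lab in enumerate(labels):
        if lab == BACKGROUND:
            continue
        start, end = i * window_sec, (i + 1) * window_sec
        prev = segments[-1] if segments else None
        if prev and prev[0] == lab and start - prev[2] <= max_gap_sec:
            segments[-1] = (lab, prev[1], end)
        else:
            segments.append((lab, start, end))
    return [s for s in segments if s[2] - s[1] >= min_len_sec]


def temporal_iou(a, b):
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0
```

In this sketch, a call such as `merge_segments(label_windows(fuse_views(view_probs, weights)))` would turn per-view window scores into candidate segments, and `temporal_iou` computes the overlap between a predicted and a ground-truth interval, which is the kind of quantity an average-IoU score aggregates.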

Related Material


[bibtex]
@InProceedings{Xu_2024_CVPR,
    author    = {Xu, Yuehuan and Jiang, Shuai and Cui, Zhe and Su, Fei},
    title     = {Multi-View Action Recognition for Distracted Driver Behavior Localization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {7172-7179}
}