@InProceedings{Abbariki_2025_WACV,
  author    = {Abbariki, Mahdi and Shoman, Maged},
  title     = {Interpreting the Unexpected: A Multimodal Framework for Out-of-Label Hazard Detection and Explanation in Autonomous Driving},
  booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
  month     = {February},
  year      = {2025},
  pages     = {669-676}
}
Interpreting the Unexpected: A Multimodal Framework for Out-of-Label Hazard Detection and Explanation in Autonomous Driving
Abstract
Effective hazard detection and interpretation in autonomous driving extend well beyond standard pre-labeled categories of obstacles and events. While traditional systems excel at recognizing known classes (e.g., cars, pedestrians, traffic signs), they often struggle with unfamiliar scenarios that fall outside predefined labels. These "out-of-label" hazards, including jaywalking pedestrians, crossing animals, unusual debris, and unexpected driver maneuvers, require a flexible and explanatory detection framework. This paper presents a multimodal solution that integrates depth estimation, optical flow, and annotated bounding-box data to identify and characterize hazards in real time. Unlike conventional methods, our approach is not limited to predefined categories; it leverages depth maps and motion patterns to detect anomalies and evaluate changes in driver behavior. Furthermore, we incorporate a vision-language model that describes hazardous regions in natural language, translating complex visual cues into readily understandable narratives. Our method is validated on the COOOL benchmark dataset, which is designed specifically to test models on out-of-label hazards. By combining geometric understanding, dynamic motion cues, and descriptive language outputs, this research contributes to safer autonomous driving environments in which both vehicles and human stakeholders can understand and respond effectively to unpredictable real-world hazards.
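To make the depth-plus-motion idea concrete, the following is a minimal sketch, not the paper's actual implementation: it assumes precomputed per-pixel depth and optical-flow maps and scores each pixel as hazardous when it is both close to the ego vehicle and moving fast. The function name `hazard_score` and the thresholds are hypothetical illustrations.

```python
import numpy as np

def hazard_score(depth, flow, depth_thresh=10.0, flow_thresh=2.0):
    """Combine proximity (small depth) with motion magnitude (large
    optical flow) into a per-pixel hazard score in [0, 1].
    Thresholds are illustrative, not taken from the paper."""
    mag = np.linalg.norm(flow, axis=-1)                 # per-pixel flow magnitude
    proximity = np.clip(1.0 - depth / depth_thresh, 0.0, 1.0)
    motion = np.clip(mag / flow_thresh, 0.0, 1.0)
    return proximity * motion                           # high when close AND moving

# toy example: one fast-moving, nearby pixel in an otherwise static scene
depth = np.full((4, 4), 20.0)
depth[1, 1] = 2.0                                       # 2 m away
flow = np.zeros((4, 4, 2))
flow[1, 1] = (3.0, 0.0)                                 # 3 px/frame motion
scores = hazard_score(depth, flow)                      # peaks at pixel (1, 1)
```

In practice the paper's framework operates on annotated bounding boxes rather than raw pixels, but the same fusion principle applies: geometric proximity and motion anomaly jointly gate what is flagged for the vision-language model to describe.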