@InProceedings{Godbole_2025_ICCV,
    author    = {Godbole, Mihir and Gao, Xiangbo and Tu, Zhengzhong},
    title     = {DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {826-831}
}
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
Abstract
Understanding the short-term motion of vulnerable road users (VRUs) is critical for safe autonomous driving, especially in high-risk urban scenarios. While vision-language models (VLMs) have enabled open-vocabulary perception, their utility for fine-grained intent reasoning remains underexplored. Notably, no existing benchmark evaluates multi-class intent prediction in safety-critical situations. To address this gap, we introduce DRAMA-X, a fine-grained benchmark constructed from the DRAMA dataset via an automated annotation pipeline. DRAMA-X contains 5,686 accident-prone frames labeled with object bounding boxes, a nine-class directional intent taxonomy, binary risk scores, and expert-generated action suggestions. As a reference baseline, we propose SGG-Intent, a lightweight, training-free framework that mirrors the reasoning pipeline of an ego vehicle (the autonomous vehicle under consideration) by sequentially generating a scene graph, inferring intent, and assessing risk. Our experiments demonstrate that scene-graph-based reasoning enhances intent prediction and risk assessment, and reveal that precise object localization is a critical bottleneck for current VLMs. Our code and dataset are available at: https://github.com/taco-group/DRAMA-X.
