GATEPose: A Graph Attention Transformer Enhanced with Pose and Orientation Angles for Pedestrian Crossing Intention Prediction

Ali K. AlShami, Terrance E. Boult, Jugal Kalita; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2026, pp. 1810-1819

Abstract


Predicting pedestrian crossing intention is essential for improving safety in autonomous driving and Advanced Driver Assistance Systems (ADAS), particularly in complex urban environments. In this work, we introduce GATEPose, a lightweight model for pedestrian crossing intention prediction that leverages pose, bounding box, and orientation angle information. The model integrates pose features using a novel ST-GAN+ block, while bounding box and orientation angle streams are modeled in parallel using GRU and Conv1D modules. These multimodal features are then fused and processed by transformer encoders to capture spatiotemporal patterns. Incorporating orientation angles significantly improves performance over methods that rely only on pose and bounding boxes, as angle-based representations provide more stable cues for capturing subtle pedestrian motions under ego-vehicle movement. The proposed architecture effectively models pedestrian dynamics while maintaining low inference latency. We evaluate GATEPose on large-scale subsets of the JAAD and PIE benchmarks and introduce two new datasets, JAAD_pose and PIE_pose, containing 25K and 72K sequences, respectively, with high-quality pose annotations extracted using ViTPose and orientation angles computed using our method. Experimental results demonstrate that GATEPose achieves state-of-the-art performance across multiple evaluation metrics while operating with significantly lower inference latency than recent approaches.
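
The claim that angle-based representations stay stable under ego-vehicle movement can be illustrated with a small sketch. The abstract does not describe how the orientation angles are computed, so the code below is only an assumption for illustration: it takes the angle of the line between two keypoints (a hypothetical shoulder pair) in each frame and shows that shifting every keypoint by the same image offset, as apparent motion from the ego vehicle might, leaves the angle sequence unchanged. The function names and the sample coordinates are hypothetical.

```python
import math

def segment_angle(p, q):
    """Angle in degrees of the vector p -> q relative to the image x-axis."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def angle_sequence(frames, i, j):
    """Per-frame angle between keypoints i and j for a sequence of frames."""
    return [segment_angle(frame[i], frame[j]) for frame in frames]

# Hypothetical frames, each holding two (x, y) keypoints in pixel coordinates.
# This is NOT the paper's data format or its angle definition.
frames = [
    [(100.0, 50.0), (120.0, 50.0)],  # shoulders level
    [(100.0, 50.0), (120.0, 60.0)],  # shoulder line tilted
]

# Simulate a uniform image-plane shift of the pedestrian between recordings.
shifted = [[(x + 35.0, y - 8.0) for (x, y) in frame] for frame in frames]

angles = angle_sequence(frames, 0, 1)
shifted_angles = angle_sequence(shifted, 0, 1)
```

Raw keypoint coordinates change under the shift, while `angles` and `shifted_angles` agree, which is one plausible reading of why an angle stream could complement the pose and bounding box streams.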

Related Material


@InProceedings{AlShami_2026_WACV,
    author    = {AlShami, Ali K. and Boult, Terrance E. and Kalita, Jugal},
    title     = {GATEPose: A Graph Attention Transformer Enhanced with Pose and Orientation Angles for Pedestrian Crossing Intention Prediction},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {March},
    year      = {2026},
    pages     = {1810-1819}
}