GSMNet: Towards Long-term Trajectory Prediction by Integrating Multi-Scale Information

Shaohua Liu, Yisu Wang, Yinglong Zhu, Pengfei Yao, Tianlu Mao, Zhaoqi Wang; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 2954-2969

Abstract


Predicting the future trajectories of pedestrians is a vital task for many applications, such as autonomous driving and robot navigation. Most existing methods predict only short-term trajectories. In this paper, we tackle the problem of long-term trajectory prediction. Unlike short-term prediction, which focuses mostly on local information, long-term prediction needs to model the future trajectory hierarchically with multi-scale information: from the multimodal global destination, to mid-distance scene layout constraints and the movement of other agents, and finally to the local history motion pattern. The destination reflects the pedestrian's long-term multimodal goal, the scene layout together with interaction constrains the possible path choices, and the history motion pattern guides the future movement. We propose GSMNet, which achieves effective long-term trajectory prediction by integrating multi-scale factors: multimodal goals, scene interaction, and motion patterns. We design separate modules to extract features at the different scales. A multi-layer perceptron extracts the local-scale feature from the history motion pattern. A U-Net with attention captures the mid-scale pedestrian-scene correlation feature and, at the global scale, the goal feature together with the scene layout. Finally, the multi-scale features are combined to predict future trajectories. Experiments on the SDD and ETH-UCY datasets show that the proposed GSMNet outperforms the previous state of the art on both long-term and short-term trajectory prediction tasks. Qualitative results show that GSMNet generates more reasonable trajectories.
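
The hierarchy described in the abstract (global goals, mid-scale scene constraints, local motion) can be illustrated with a toy sketch. This is not GSMNet and does not use an MLP or a U-Net: it is a hand-written stand-in under stated assumptions, namely that goals are given as candidate points, the scene is a binary walkability grid, and the local motion feature is a mean velocity. All function names here are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def local_motion_feature(history: List[Point]) -> Point:
    """Local scale (assumption): mean per-step velocity of the observed track."""
    steps = len(history) - 1
    vx = (history[-1][0] - history[0][0]) / steps
    vy = (history[-1][1] - history[0][1]) / steps
    return (vx, vy)


def is_walkable(scene: List[List[int]], p: Point) -> bool:
    """Mid scale (assumption): look up a point in a binary grid, 1 = free cell."""
    col, row = int(round(p[0])), int(round(p[1]))
    return 0 <= row < len(scene) and 0 <= col < len(scene[0]) and scene[row][col] == 1


def predict(history: List[Point], goals: List[Point],
            scene: List[List[int]], horizon: int) -> List[List[Point]]:
    """Return one candidate trajectory per goal that stays on walkable cells."""
    vx, vy = local_motion_feature(history)
    x0, y0 = history[-1]
    candidates = []
    for gx, gy in goals:  # global scale: one mode per candidate destination
        traj = []
        for t in range(1, horizon + 1):
            w = t / horizon  # weight on the goal grows with the horizon
            # Constant-velocity extrapolation from the local motion feature.
            cx, cy = x0 + vx * t, y0 + vy * t
            # Straight-line interpolation toward the goal.
            lx, ly = x0 + (gx - x0) * w, y0 + (gy - y0) * w
            traj.append(((1 - w) * cx + w * lx, (1 - w) * cy + w * ly))
        # Scene constraint: discard modes that cross a blocked cell.
        if all(is_walkable(scene, p) for p in traj):
            candidates.append(traj)
    return candidates


# Usage: a 7x5 open scene with one blocked cell at column 3, row 2.
scene = [[1] * 7 for _ in range(5)]
scene[2][3] = 0
modes = predict([(0, 0), (1, 0), (2, 0)], [(6, 0), (2, 4)], scene, horizon=4)
```

In this sketch the goal weight `w` increases with the time step, so early steps follow the observed motion while later steps are pulled toward the destination. That mirrors, in a very simplified way, the abstract's point that long-term prediction depends more on global information than short-term prediction does.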

Related Material


@InProceedings{Liu_2024_ACCV,
    author    = {Liu, Shaohua and Wang, Yisu and Zhu, Yinglong and Yao, Pengfei and Mao, Tianlu and Wang, Zhaoqi},
    title     = {GSMNet: Towards Long-term Trajectory Prediction by Integrating Multi-Scale Information},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {2954-2969}
}