Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis

Jianhua Sun, Yuxuan Li, Hao-Shu Fang, Cewu Lu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 13250-13259

Abstract


Multimodal prediction results are essential for the trajectory prediction task, as there is no single correct answer for the future. Previous frameworks can be divided into three categories: regression, generation, and classification frameworks. However, each of these frameworks has weaknesses in different aspects, so none of them can model the multimodal prediction task comprehensively. In this paper, we present a novel insight along with a brand-new prediction framework that formulates multimodal prediction as three steps (modality clustering, classification, and synthesis) and addresses the shortcomings of earlier frameworks. Exhaustive experiments on popular benchmarks demonstrate that our proposed method surpasses state-of-the-art works even without introducing social and map information. Specifically, we achieve 19.2% and 20.8% improvement on ADE and FDE, respectively, on the ETH/UCY dataset.
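
For readers unfamiliar with the two metrics named above: ADE (average displacement error) is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE (final displacement error) is that distance at the last step. The sketch below is illustrative only and is not taken from the paper. The function names, the toy trajectories, and the best-of-K protocol (taking the minimum error over K candidate trajectories, a common convention for multimodal predictors) are assumptions; the abstract does not state the evaluation protocol.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length lists of (x, y) positions.
    """
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def best_of_k(preds, truth):
    """Assumed best-of-K protocol: minimum ADE and minimum FDE,
    each taken independently over the K candidate trajectories."""
    errors = [displacement_errors(p, truth) for p in preds]
    return min(e[0] for e in errors), min(e[1] for e in errors)

# Toy data (hypothetical): a straight ground-truth path and two candidates.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # drifts away steadily
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],  # accurate until the last step
]

min_ade, min_fde = best_of_k(candidates, truth)
print(f"minADE = {min_ade:.3f}, minFDE = {min_fde:.3f}")
```

Under this convention, a predictor that covers several plausible futures is rewarded as long as one of its candidates lies close to what actually happened.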

Related Material


[bibtex]
@InProceedings{Sun_2021_ICCV,
    author    = {Sun, Jianhua and Li, Yuxuan and Fang, Hao-Shu and Lu, Cewu},
    title     = {Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {13250-13259}
}