Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, Zhaoxiang Zhang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 14749-14759

Abstract


In autonomous driving predicting future events in advance and evaluating the foreseeable risks empowers autonomous vehicles to plan their actions enhancing safety and efficiency on the road. To this end we propose Drive-WM the first driving world model compatible with existing end-to-end planning models. Through a joint spatial-temporal modeling facilitated by view factorization our model is the first to generate high-fidelity multiview videos. Building on its powerful generation ability we showcase the potential of applying the world model for safe driving planning for the first time. Our Drive-WM enables driving into multiple futures based on distinct driving maneuvers and determines the optimal trajectory according to the image-based rewards. Evaluation on real-world driving datasets verifies that our method could generate high-quality consistent and controllable multiview videos opening up possibilities for real-world simulations and safe planning.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Wang_2024_CVPR, author = {Wang, Yuqi and He, Jiawei and Fan, Lue and Li, Hongxin and Chen, Yuntao and Zhang, Zhaoxiang}, title = {Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {14749-14759} }