Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions

Joey Hong, Benjamin Sapp, James Philbin; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 8454-8462

Abstract


We focus on the problem of predicting future states of entities in complex, real-world driving scenarios. Previous research has approached this problem via low-level signals to predict short time horizons, and has not addressed how to leverage key assets relied upon heavily by industry self-driving systems: (1) large 3D perception efforts which provide highly accurate 3D states of agents with rich attributes, and (2) detailed and accurate semantic maps of the environment (lanes, traffic lights, crosswalks, etc.). We present a unified representation which encodes such high-level semantic information in a spatial grid, allowing the use of deep convolutional models to fuse complex scene context. This enables learning entity-entity and entity-environment interactions with simple, feed-forward computations at each timestep within an overall temporal model of an agent's behavior. We propose different ways of modelling the future as a distribution over future states using standard supervised learning. We introduce a novel dataset providing industry-grade rich perception and semantic inputs, and empirically show we can effectively learn fundamentals of driving behavior.
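
As a rough illustration of what encoding agent states in a spatial grid could look like, the sketch below rasterizes a few agents into a small multi-channel top-down grid that a convolutional model could take as input. It is not the authors' implementation: the `Agent` fields, the three channels (occupancy, x-velocity, y-velocity), the grid size, and the resolution are all assumptions chosen for illustration, and the abstract's semantic map layers (lanes, traffic lights, crosswalks) are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    # Hypothetical agent state; field names are illustrative, not from the paper.
    x: float   # metres, in a frame centred on the grid
    y: float   # metres
    vx: float  # metres per second
    vy: float  # metres per second


def rasterize(agents: List[Agent], size: int = 8,
              resolution: float = 1.0) -> List[List[List[float]]]:
    """Return a [3][size][size] grid with channels: occupancy, vx, vy."""
    grid = [[[0.0] * size for _ in range(size)] for _ in range(3)]
    half = size * resolution / 2.0  # half the grid extent in metres
    for a in agents:
        # Map metric coordinates to integer cell indices.
        col = int((a.x + half) // resolution)
        row = int((a.y + half) // resolution)
        # Agents outside the grid extent are skipped.
        if 0 <= row < size and 0 <= col < size:
            grid[0][row][col] = 1.0
            grid[1][row][col] = a.vx
            grid[2][row][col] = a.vy
    return grid


agents = [Agent(0.5, -1.5, 2.0, 0.0), Agent(100.0, 100.0, 1.0, 1.0)]
grid = rasterize(agents)
```

Under these assumptions, the first agent lands in row 2, column 4, and the second falls outside the grid and is ignored. Map elements could be added as further channels in the same way, so that a single tensor carries both entity and environment context.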

Related Material


[bibtex]
@InProceedings{Hong_2019_CVPR,
author = {Hong, Joey and Sapp, Benjamin and Philbin, James},
title = {Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}