Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting

Lafage, Adrien; Barbier, Mathieu; Franchi, Gianni; Filliat, David

Adrien Lafage, Mathieu Barbier, Gianni Franchi, David Filliat; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 1682-1691

Abstract

Accurate trajectory forecasting is crucial for the performance of various systems, such as advanced driver-assistance systems and self-driving vehicles. These forecasts allow us to anticipate events that lead to collisions and, therefore, to mitigate them. Deep Neural Networks have excelled in motion forecasting, but overconfidence and weak uncertainty quantification persist. Deep Ensembles address these concerns, yet applying them to multimodal distributions remains challenging. In this paper, we propose a novel approach named Hierarchical Light Transformer Ensembles (HLT-Ens), aimed at efficiently training an ensemble of Transformer architectures using a novel hierarchical loss function. HLT-Ens leverages grouped fully connected layers, inspired by grouped convolution techniques, to capture multimodal distributions effectively. Through extensive experimentation, we demonstrate that HLT-Ens achieves state-of-the-art performance, offering a promising avenue for improving trajectory forecasting techniques. We make our code available at github.com/alafage/hlt-ens.

Related Material

@InProceedings{Lafage_2025_WACV,
    author    = {Lafage, Adrien and Barbier, Mathieu and Franchi, Gianni and Filliat, David},
    title     = {Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {1682-1691}
}