Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking

Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 9553-9562

Abstract


In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene. This memory is utilized for finding matches between tracks and detections, and is updated based on the matching. Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory. This can be problematic when there are similar-looking objects in the scene. In this paper, we solve the problem of simultaneously considering all tracks during memory updating, with only a small spatial overhead, via a novel multi-track pooling module. We additionally propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online. We show that the combination of these innovations results in a strong discriminative appearance model under the bilinear LSTM tracking framework, enabling the use of greedy data association to achieve online tracking performance. Our experiments demonstrate real-time, state-of-the-art online tracking performance on public multi-object tracking (MOT) datasets.
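The greedy data association mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_associate`, the `min_score` threshold, and the assumption that a track-by-detection matrix of matching scores is already available (for example, from an appearance model) are all hypothetical choices made for illustration.

```python
def greedy_associate(scores, min_score=0.5):
    """Greedily match tracks (rows) to detections (columns).

    scores[t][d] is an assumed matching score between track t and
    detection d, where higher means a better match. min_score is a
    hypothetical acceptance threshold, not a value from the paper.
    """
    # Collect every candidate pair whose score passes the threshold.
    pairs = [
        (score, t, d)
        for t, row in enumerate(scores)
        for d, score in enumerate(row)
        if score >= min_score
    ]
    # Visit candidates from the highest score to the lowest.
    pairs.sort(key=lambda p: -p[0])

    used_tracks, used_dets = set(), set()
    matches = []
    for _, t, d in pairs:
        # Each track and each detection can be matched at most once.
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    return matches


if __name__ == "__main__":
    # Two tracks, two detections: the best pairs are (0, 1) and (1, 0).
    print(greedy_associate([[0.2, 0.9], [0.7, 0.3]]))
```

Greedy matching commits to the highest-scoring pair first and never revisits that choice, which keeps it cheap but makes it sensitive to ambiguous scores. That is one reason a discriminative appearance model matters when similar-looking objects are present.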

Related Material


[bibtex]
@InProceedings{Kim_2021_CVPR,
    author    = {Kim, Chanho and Fuxin, Li and Alotaibi, Mazen and Rehg, James M.},
    title     = {Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {9553-9562}
}