The MTA Dataset for Multi-Target Multi-Camera Pedestrian Tracking by Weighted Distance Aggregation

Philipp Kohl, Andreas Specker, Arne Schumann, Jürgen Beyerer; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 1042-1043

Abstract


Existing multi-target multi-camera tracking (MTMCT) datasets are small in terms of the number of identities and video length. Creating new real-world datasets is difficult because privacy has to be guaranteed and labeling is tedious. Therefore, in the scope of this work, a mod for GTA V has been developed to record MTMCT data, and it has been used to record a simulated MTMCT dataset called Multi Camera Track Auto (MTA). The MTA dataset contains over 2,400 identities, 6 cameras, and a video length of over 100 minutes per camera. Additionally, an MTMCT system has been implemented to provide a baseline for the created dataset. The system's pipeline consists of the following stages: person detection, person re-identification, single-camera multi-target tracking, track distance calculation, and track association. The track distance is a weighted sum of the following partial distances: a single-camera time constraint, a multi-camera time constraint using convex camera overlap areas, an appearance feature distance, a homography matching with pairwise camera homographies, and a linear prediction based on the velocity and the time difference of tracks. When using all partial distances, we were able to surpass the results of state-of-the-art single-camera trackers by +20% in IDF1 score.
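As a rough illustration of the weighted distance aggregation described in the abstract, the sketch below combines several partial distance terms into a single track distance via a weighted sum. All names, weights, and the Track structure are illustrative assumptions and do not reflect the authors' actual implementation.

from dataclasses import dataclass
import numpy as np

# Hypothetical track representation; the fields are assumptions for illustration.
@dataclass
class Track:
    cam_id: int
    start_frame: int
    end_frame: int
    appearance: np.ndarray  # mean re-identification feature vector
    positions: np.ndarray   # per-frame ground-plane coordinates

# Example of one partial distance: appearance (cosine) distance between mean re-id features.
def appearance_distance(t1: Track, t2: Track) -> float:
    a, b = t1.appearance, t2.appearance
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def aggregate_track_distance(t1: Track, t2: Track, weights, partial_distances) -> float:
    # Weighted sum over partial distance callables (time constraints, appearance
    # distance, homography matching, linear prediction, ...), as named in the abstract.
    return float(sum(w * d(t1, t2) for w, d in zip(weights, partial_distances)))

Pairs of tracks from different cameras would then be associated whenever their aggregated distance is low enough, for example by greedy matching or clustering on the resulting distance matrix.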

Related Material


@InProceedings{Kohl_2020_CVPR_Workshops,
author = {Kohl, Philipp and Specker, Andreas and Schumann, Arne and Beyerer, Jürgen},
title = {The MTA Dataset for Multi-Target Multi-Camera Pedestrian Tracking by Weighted Distance Aggregation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}