Multiple People Tracking by Lifted Multicut and Person Re-Identification

Siyu Tang, Mykhaylo Andriluka, Bjoern Andres, Bernt Schiele; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3539-3548


Tracking multiple persons in a monocular video of a crowded scene is a challenging task. Humans can master it even if they lose track of a person locally by re-identifying the same person based on their appearance. Care must be taken across long distances, as similar-looking persons need not be identical. In this work, we propose a novel graph-based formulation that links and clusters person hypotheses over time by solving an instance of a minimum cost lifted multicut problem. Our model generalizes previous work by introducing a mechanism for adding long-range attractive connections between nodes in the graph without modifying the original set of feasible solutions. This allows us to reward tracks that assign detections of similar appearance to the same person in a way that does not introduce implausible solutions. To effectively match hypotheses over longer temporal gaps we develop new deep architectures for re-identification of people. They combine holistic representations extracted with deep networks and body pose layout obtained with a state-of-the-art pose estimation model. We demonstrate the effectiveness of our formulation by reporting a new state of the art on the MOT16 benchmark.
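
The central idea of the abstract, that lifted edges change the cost of a clustering but not the set of allowed clusterings, can be illustrated on a toy graph. The following is a minimal brute-force sketch and not the authors' solver: the function names, the four-detection chain, and all edge costs are hypothetical and chosen only for illustration. It assumes the usual multicut convention that an edge's cost is paid when its endpoints land in different clusters, so a positive cost is attractive and a negative cost is repulsive.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or open a new block for it.
        yield [[first]] + partition


def is_connected(block, base_edges):
    """True if `block` induces a connected subgraph using base edges only."""
    block = set(block)
    start = next(iter(block))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for a, b in base_edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y in block and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == block


def solve_lifted_multicut(nodes, base_costs, lifted_costs):
    """Exhaustively minimize the total cost of cut edges (tiny graphs only).

    Feasibility is decided by the base edges alone: every cluster must be
    connected in the base graph. Lifted edges only contribute to the cost.
    """
    all_costs = {**base_costs, **lifted_costs}
    best_cost, best_partition = None, None
    for partition in set_partitions(list(nodes)):
        if not all(is_connected(block, base_costs) for block in partition):
            continue  # infeasible: some cluster is not connected in the base graph
        label = {v: i for i, block in enumerate(partition) for v in block}
        cost = sum(c for (u, v), c in all_costs.items() if label[u] != label[v])
        if best_cost is None or cost < best_cost:
            best_cost, best_partition = cost, partition
    return best_cost, best_partition


# Hypothetical example: four detections in temporal order, linked as a chain.
# The middle link is weakly repulsive (e.g. an occlusion between 1 and 2).
base = {(0, 1): 2.0, (1, 2): -1.0, (2, 3): 2.0}
# A long-range attractive edge, e.g. from an appearance match between 0 and 3.
lifted = {(0, 3): 3.0}

print(solve_lifted_multicut(range(4), base, {}))      # base edges only
print(solve_lifted_multicut(range(4), base, lifted))  # with the lifted edge
```

With base edges only, the cheapest clustering splits the chain at the repulsive link. Adding the lifted edge makes that split more expensive than keeping all four detections in one track, so the optimum changes. A clustering such as {0, 3}, {1, 2} stays infeasible in both cases, because detections 0 and 3 are not connected through base edges inside their cluster.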

Related Material

author = {Tang, Siyu and Andriluka, Mykhaylo and Andres, Bjoern and Schiele, Bernt},
title = {Multiple People Tracking by Lifted Multicut and Person Re-Identification},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}