Heterogeneous Diversity Driven Active Learning for Multi-Object Tracking
Existing one-stage multi-object tracking (MOT) algorithms achieve satisfactory performance by relying on large amounts of labeled data. However, acquiring many laboriously annotated frames is impractical in real applications. To reduce the cost of human annotation, we propose Heterogeneous Diversity driven Active Multi-Object Tracking (HD-AMOT), which infers the most informative frames for any MOT tracker by observing heterogeneous cues of the samples. HD-AMOT defines a diversified informative representation that encodes geometric and semantic information, and formulates frame inference as a Markov decision process in order to learn an optimal sampling policy based on this representation. Specifically, HD-AMOT consists of a diversified informative representation module and an informative frame selection network. The former produces a signal characterizing the diversity and distribution of frames; the latter receives this signal and conducts multi-frame cooperation to enable batch frame sampling. Extensive experiments on the MOT15, MOT17, MOT20, and DanceTrack datasets demonstrate the effectiveness of HD-AMOT. With a 50% annotation budget, HD-AMOT achieves performance comparable to or even higher than that of fully supervised learning.
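
To make the idea of diversity-driven batch frame sampling under an annotation budget concrete, the following is a minimal sketch, not the HD-AMOT method itself. It assumes each frame has already been summarized as a feature vector (here imagined as a concatenation of geometric and semantic descriptors), and it replaces the learned Markov decision process policy with a simple greedy farthest-point heuristic. The function name `select_frames` and the parameter `budget_ratio` are hypothetical and do not come from the paper.

```python
import math
import random


def select_frames(features, budget_ratio=0.5, seed=0):
    """Pick a diverse subset of frames to annotate (illustrative heuristic only).

    features: list of equal-length lists of floats, one vector per frame.
              Assumed here to combine geometric and semantic descriptors.
    budget_ratio: fraction of frames that may be annotated (hypothetical name).
    Returns the sorted indices of the selected frames.
    """
    n = len(features)
    if n == 0:
        return []
    budget = min(n, max(1, int(n * budget_ratio)))

    # Start from a random frame so the result does not depend on frame order.
    rng = random.Random(seed)
    first = rng.randrange(n)
    chosen = [first]
    chosen_set = {first}

    # Distance from every frame to its nearest already-selected frame.
    nearest = [math.dist(f, features[first]) for f in features]

    while len(chosen) < budget:
        # Greedy step: take the unselected frame farthest from the current set.
        idx = max(
            (i for i in range(n) if i not in chosen_set),
            key=lambda i: nearest[i],
        )
        chosen.append(idx)
        chosen_set.add(idx)
        for i, f in enumerate(features):
            nearest[i] = min(nearest[i], math.dist(f, features[idx]))

    return sorted(chosen)


if __name__ == "__main__":
    # Toy data: six frames forming two well-separated clusters.
    toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
    print(select_frames(toy, budget_ratio=0.5))
```

A learned policy, as described in the abstract, would decide which frames to sample from the informative representation rather than from raw pairwise distances; the greedy rule above only illustrates why a diversity signal helps avoid spending the budget on near-duplicate frames.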