DIOR: DIstill Observations to Representations for Multi-Object Tracking and Segmentation

Jiarui Cai, Yizhou Wang, Hung-Min Hsu, Haotian Zhang, Jenq-Neng Hwang; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2022, pp. 520-529

Abstract


Multi-object tracking (MOT) has long been a crucial topic in autonomous driving and security monitoring. With the saturation of bounding-box-based MOT algorithms in recent years, a new task that tracks objects with instance segmentation, called multi-object tracking and segmentation (MOTS), provides a finer level of scene understanding and introduces potential improvements in tracking accuracy. In this paper, we introduce a video-based MOTS framework named DIstill Observations to Representations (DIOR). A feature distiller is designed to extract and balance comprehensive object representations: 1) the temporal distiller aggregates context information across frames for consistent features and smooth predictions over time; 2) the spatial distiller focuses on the target of interest within each bounding box and removes ambiguous and irrelevant background from the learned features. The subsequent tracking steps start with Hungarian matching based on feature similarity and mask continuity, which is efficient and straightforward. In addition, we propose short-term retrieval (STR) and long-term re-identification (re-ID) modules to avoid missed associations caused by detection failures or occlusion. Our method achieves state-of-the-art performance on both the MOTS20 and KITTI-MOTS benchmarks.
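The association step named in the abstract, Hungarian matching on feature similarity and mask continuity, can be illustrated with a minimal sketch. The abstract does not say how the two cues are combined, so everything specific here is an assumption rather than the authors' formulation: cosine similarity for appearance, mask IoU as a stand-in for mask continuity, a weighted sum controlled by a hypothetical `alpha`, and a hypothetical `min_score` gate. The function names are illustrative as well.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mask_iou(a, b):
    """Intersection-over-union of two boolean masks of equal shape."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum()) / union if union else 0.0


def associate(track_feats, track_masks, det_feats, det_masks,
              alpha=0.5, min_score=0.3):
    """Match existing tracks to new detections.

    alpha and min_score are illustrative values, not taken from the paper.
    Feature vectors are assumed to be non-zero.
    Returns a list of (track_index, detection_index) pairs.
    """
    if len(track_feats) == 0 or len(det_feats) == 0:
        return []

    # Appearance cue: cosine similarity between L2-normalised features.
    t = np.asarray(track_feats, dtype=float)
    d = np.asarray(det_feats, dtype=float)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    feat_sim = t @ d.T

    # Continuity cue (assumed): IoU between each track's last mask
    # and each new detection mask.
    iou = np.array([[mask_iou(tm, dm) for dm in det_masks]
                    for tm in track_masks])

    # Assumed combination: weighted sum of the two cues.
    score = alpha * feat_sim + (1.0 - alpha) * iou

    # The Hungarian solver minimises cost, so negate the score.
    rows, cols = linear_sum_assignment(-score)

    # Drop weak pairs so they can be handled by later recovery steps.
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if score[r, c] >= min_score]
```

Pairs rejected by the gate are left unmatched in this sketch; in the framework described above, such cases are what the short-term retrieval and long-term re-ID modules are meant to recover.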

Related Material


@InProceedings{Cai_2022_WACV,
    author    = {Cai, Jiarui and Wang, Yizhou and Hsu, Hung-Min and Zhang, Haotian and Hwang, Jenq-Neng},
    title     = {DIOR: DIstill Observations to Representations for Multi-Object Tracking and Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2022},
    pages     = {520-529}
}