Limited Sampling Reference Frame for MaskTrack R-CNN

Zhuang Li, Leilei Cao, Hongbin Wang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 3854-3857


With the great progress on computer vision tasks such as image classification, object detection, and segmentation, researchers are turning to more complex vision tasks. Video instance segmentation is a new task that requires simultaneously detecting, segmenting, and tracking instances in a video. The Occluded Video Instance Segmentation (OVIS) dataset is built for this task and includes many heavily occluded scenes. In addition, video lengths in this dataset vary widely. To track instances in videos of different lengths, we make several improvements to MaskTrack R-CNN. With these optimizations, the refined model detects and segments instances well and achieves better tracking accuracy in long videos. Furthermore, we apply the Stochastic Weight Averaging training strategy to obtain a better result. Finally, the proposed method achieves an mAP of 28.9 on the validation set and 32.2 on the test set of the OVIS dataset.
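The abstract names Stochastic Weight Averaging but does not describe the schedule or the checkpoints used. As a minimal sketch of the averaging step only, assuming model weights are stored as plain lists of floats keyed by parameter name (the names `update_swa` and `average_checkpoints` are hypothetical and not taken from the paper), the running mean over checkpoints could look like this:

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def update_swa(swa: Weights, new: Weights, n_averaged: int) -> Weights:
    """Fold one more checkpoint into a running mean of n_averaged checkpoints."""
    return {
        name: [
            (s * n_averaged + w) / (n_averaged + 1)
            for s, w in zip(swa[name], new[name])
        ]
        for name in swa
    }


def average_checkpoints(checkpoints: List[Weights]) -> Weights:
    """Return the element-wise mean of all checkpoints (equal weighting)."""
    # Start from a copy of the first checkpoint so inputs are not mutated.
    swa = {name: list(values) for name, values in checkpoints[0].items()}
    for n, ckpt in enumerate(checkpoints[1:], start=1):
        swa = update_swa(swa, ckpt, n)
    return swa


if __name__ == "__main__":
    ckpts = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [5.0, 6.0]}]
    print(average_checkpoints(ckpts))
```

In practice the averaged weights would be loaded back into the detector before evaluation; whether the authors also recomputed normalization statistics afterwards is not stated in the abstract.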

Related Material

@InProceedings{Li_2021_ICCV,
    author    = {Li, Zhuang and Cao, Leilei and Wang, Hongbin},
    title     = {Limited Sampling Reference Frame for MaskTrack R-CNN},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {3854-3857}
}