Three-stage Training Pipeline with Patch Random Drop for Few-shot Object Detection

Shaobo Lin, Xingyu Zeng, Shilin Yan, Rui Zhao; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 1027-1043

Abstract


Self-supervised learning (SSL) designs pretext tasks that exploit the structural information of data without manual annotation, and it has been widely used in few-shot image classification to improve model generalization. However, few works explore the influence of SSL on few-shot object detection (FSOD), which is a more challenging task. Moreover, our experimental results demonstrate that, in FSOD, using a weighted sum of different self-supervised losses degrades performance compared with using a single self-supervised task. To address these problems, we first introduce SSL into FSOD by applying SSL tasks to the cropped positive samples. Second, we propose a novel self-supervised method, patch random drop, which predicts the location of a masked image patch. Finally, we design a three-stage training pipeline to associate two different self-supervised tasks. Extensive experiments on the few-shot object detection datasets Pascal VOC and MS COCO validate the effectiveness of our method.
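
As a rough illustration of the patch random drop idea described above, the following sketch masks one randomly chosen cell of a cropped sample and returns that cell's index as the pretext label. The abstract does not specify the grid size, the fill value, or how the location is encoded, so the function name `patch_random_drop`, the `grid` and `fill` parameters, and the row-major class label are all assumptions made for illustration, not the authors' implementation.

```python
import random


def patch_random_drop(crop, grid=3, fill=0, rng=None):
    """Mask one random cell of a grid x grid partition of `crop`.

    crop: 2D list (rows x columns) of pixel values, at least grid x grid.
    Returns (masked_crop, label); label is the row-major index of the
    dropped cell, in the range [0, grid * grid).
    NOTE: grid size, fill value, and label encoding are assumptions.
    """
    rng = rng or random.Random()
    height, width = len(crop), len(crop[0])

    # Pick which cell to drop; this index is the self-supervised target.
    label = rng.randrange(grid * grid)
    row, col = divmod(label, grid)

    # Cell boundaries via integer division, so the cells tile the crop exactly.
    top, bottom = row * height // grid, (row + 1) * height // grid
    left, right = col * width // grid, (col + 1) * width // grid

    # Copy the crop so the caller's data is left untouched.
    masked = [list(r) for r in crop]
    for y in range(top, bottom):
        for x in range(left, right):
            masked[y][x] = fill
    return masked, label
```

Under these assumptions, the masked crop would be fed to the network and the label used as the target of a classifier with `grid * grid` outputs, for example `masked, label = patch_random_drop(crop, grid=3)`.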

Related Material


@InProceedings{Lin_2022_ACCV,
    author    = {Lin, Shaobo and Zeng, Xingyu and Yan, Shilin and Zhao, Rui},
    title     = {Three-stage Training Pipeline with Patch Random Drop for Few-shot Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {1027-1043}
}