[bibtex]
@InProceedings{Shangguan_2024_ACCV,
  author    = {Shangguan, Zeyu and Huai, Lian and Liu, Tong and Liu, Yuyu and Jiang, Xingqun},
  title     = {Decoupled DETR For Few-shot Object Detection},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {286-302}
}
Decoupled DETR For Few-shot Object Detection
Abstract
Few-shot object detection (FSOD), an efficient technique for mitigating severe data scarcity in object detection, has been widely explored. However, FSOD faces notable challenges, such as the model's natural bias towards its pre-training data and inherent defects in existing architectures. In this paper, we introduce improved methods for the FSOD problem based on DETR structures: (i) To reduce bias from the pre-training classes (i.e. many-shot base classes), we investigate the impact of decoupling the parameters of pre-training classes and fine-tuning classes (i.e. few-shot novel classes) in various ways. As a result, we propose a "base-novel categories decoupled DETR (DeDETR)" network for FSOD. (ii) To further improve the efficiency of DETR's skip connection structure, we explore varied skip connection types in the DETR encoder and decoder. We then introduce a unified decoder module that dynamically blends the decoder layers to produce the output feature. We evaluate our model on the PASCAL VOC and MSCOCO datasets. The results indicate that our proposed modules consistently improve performance by 5% to 10% under both fine-tuning and meta-learning frameworks, surpassing the best scores reported in recent studies.
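The base-novel decoupling described in point (i) can be illustrated with a minimal sketch. Note this is a hypothetical NumPy illustration, not the authors' implementation: all names, dimensions, and the 60/20 class split are assumptions. The idea is that a shared query feature is scored by two independent weight matrices, one for the many-shot base classes (kept from pre-training) and one for the few-shot novel classes (the only part updated during fine-tuning).

```python
import numpy as np

# Hypothetical sketch of a base/novel decoupled classification head.
# Assumed dimensions: DETR-style 256-d queries, a 60-base / 20-novel split.
rng = np.random.default_rng(0)
d_model, n_base, n_novel = 256, 60, 20

W_base = rng.standard_normal((n_base, d_model))    # frozen after pre-training
W_novel = rng.standard_normal((n_novel, d_model))  # trained on few-shot data

def decoupled_logits(query_feat):
    """Score one query against base and novel classes with separate weights."""
    base_logits = W_base @ query_feat
    novel_logits = W_novel @ query_feat
    # Concatenate so the loss and inference see a single class axis.
    return np.concatenate([base_logits, novel_logits])

q = rng.standard_normal(d_model)
logits = decoupled_logits(q)
print(logits.shape)  # one score per base + novel class: (80,)
```

Because the two weight matrices are separate parameters, gradient updates on novel-class examples cannot perturb the base-class scores, which is one way the bias toward pre-training classes can be contained.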