ActiveDC: Distribution Calibration for Active Finetuning

Wenshuai Xu, Zhenghui Hu, Yu Lu, Jinzhou Meng, Qingjie Liu, Yunhong Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 16996-17005

Abstract


The pretraining-finetuning paradigm has gained popularity in various computer vision tasks. In this paradigm, the emergence of active finetuning arises due to the abundance of large-scale data and costly annotation requirements. Active finetuning involves selecting a subset of data from an unlabeled pool for annotation, facilitating subsequent finetuning. However, the use of a limited number of training samples can lead to a biased distribution, potentially resulting in model overfitting. In this paper, we propose a new method called ActiveDC for active finetuning tasks. Firstly, we select samples for annotation by optimizing the distribution similarity between the subset to be selected and the entire unlabeled pool in continuous space. Secondly, we calibrate the distribution of the selected samples by exploiting implicit category information in the unlabeled pool. The feature visualization provides an intuitive sense of the effectiveness of our approach to distribution calibration. We conducted extensive experiments on three image classification datasets with different sampling ratios. The results indicate that ActiveDC consistently outperforms the baselines in all image classification tasks. The improvement is particularly significant when the sampling ratio is low, with performance gains of up to 10%. Our code will be released.
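
The two-step procedure outlined above (distribution-matching sample selection, then calibration using structure in the unlabeled pool) can be illustrated with a minimal sketch. This is only a simplified, hypothetical illustration under assumed choices: greedy feature-mean matching stands in for the distribution-similarity objective, and k-means pseudo-clusters stand in for the implicit category information. It is not the ActiveDC algorithm itself; all function names and parameters below are made up for illustration.

# Illustrative sketch only: a simplified stand-in for the two steps described in
# the abstract. The greedy mean-matching selection and the k-means pseudo-cluster
# calibration are assumptions, not the authors' ActiveDC implementation.
import numpy as np
from sklearn.cluster import KMeans

def select_by_distribution_matching(feats, budget):
    """Greedily pick samples whose running feature mean best matches the pool mean."""
    pool_mean = feats.mean(axis=0)
    selected, running_sum = [], np.zeros_like(pool_mean)
    remaining = set(range(len(feats)))
    for k in range(budget):
        best_i, best_gap = None, np.inf
        for i in remaining:
            cand_mean = (running_sum + feats[i]) / (k + 1)
            gap = np.linalg.norm(cand_mean - pool_mean)
            if gap < best_gap:
                best_i, best_gap = i, gap
        selected.append(best_i)
        running_sum += feats[best_i]
        remaining.remove(best_i)
    return selected

def calibrate_with_pool_clusters(feats, selected, n_clusters=10):
    """Attach each selected sample's pool-cluster statistics (mean/covariance)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(feats)
    calibrated = []
    for i in selected:
        members = feats[km.labels_ == km.labels_[i]]
        calibrated.append({
            "index": i,
            "cluster_mean": members.mean(axis=0),
            "cluster_cov": np.cov(members, rowvar=False),
        })
    return calibrated

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool_feats = rng.normal(size=(500, 32))   # e.g., frozen pretrained features
    picked = select_by_distribution_matching(pool_feats, budget=20)
    stats = calibrate_with_pool_clusters(pool_feats, picked, n_clusters=5)
    print(f"selected {len(picked)} samples; first entry keys: {list(stats[0].keys())}")

In this toy version, selection tries to keep the chosen subset's empirical feature mean close to that of the whole unlabeled pool, and calibration enriches each selected sample with statistics from its pool cluster, mimicking the idea of borrowing implicit category information from unlabeled data.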

Related Material


BibTeX
@InProceedings{Xu_2024_CVPR,
    author    = {Xu, Wenshuai and Hu, Zhenghui and Lu, Yu and Meng, Jinzhou and Liu, Qingjie and Wang, Yunhong},
    title     = {ActiveDC: Distribution Calibration for Active Finetuning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {16996-17005}
}