Towards Inadequately Pre-trained Models in Transfer Learning

Andong Deng, Xingjian Li, Di Hu, Tianyang Wang, Haoyi Xiong, Cheng-Zhong Xu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 19397-19408

Abstract


Transfer learning has been a popular learning paradigm in the deep learning era, especially in annotation-insufficient scenarios. Previous research has demonstrated, from the architectural perspective, that better ImageNet pre-trained models transfer better to downstream tasks. However, in this paper, we find that within the same pre-training process, models at intermediate epochs, which are inadequately pre-trained, can outperform fully trained models when used as feature extractors (FE), while fine-tuning (FT) performance still grows with the source performance. This reveals that there is no solid positive correlation between top-1 accuracy on ImageNet and transfer performance on target data. Based on this contradictory phenomenon between FE and FT, in which a better feature extractor does not necessarily fine-tune better, we conduct comprehensive analyses of the features before the softmax layer to provide insightful explanations. Our discoveries suggest that, during pre-training, models tend to first learn the spectral components corresponding to large singular values, while the residual components contribute more during fine-tuning.
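
The spectral decomposition mentioned in the abstract can be made concrete with a minimal sketch (not the authors' released code): take a matrix of penultimate-layer features, compute its SVD, and split it into the component spanned by the top-k singular values and the residual component. The feature dimensions, the value of k, and the function name below are illustrative assumptions, not details from the paper.

    # Minimal sketch of splitting features into a large-singular-value
    # component and a residual component via SVD (illustrative only).
    import numpy as np

    def split_spectral_components(features: np.ndarray, k: int):
        """Split features (n_samples x n_dims) into a top-k spectral part
        and the residual part using the singular value decomposition."""
        U, S, Vt = np.linalg.svd(features, full_matrices=False)
        top = (U[:, :k] * S[:k]) @ Vt[:k, :]   # component with large singular values
        residual = features - top              # remaining spectral components
        return top, residual

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        feats = rng.standard_normal((512, 2048))  # e.g. features before the softmax layer
        top, residual = split_spectral_components(feats, k=50)
        # Compare the energy captured by the top-k component vs. the residual
        print(np.linalg.norm(top) ** 2, np.linalg.norm(residual) ** 2)

Under this reading, the paper's claim is that the top-k part is learned early in pre-training, while the residual part carries more of the benefit observed during fine-tuning.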

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Deng_2023_ICCV,
    author    = {Deng, Andong and Li, Xingjian and Hu, Di and Wang, Tianyang and Xiong, Haoyi and Xu, Cheng-Zhong},
    title     = {Towards Inadequately Pre-trained Models in Transfer Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {19397-19408}
}