Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning

Sicong Shen, Yang Zhou, Bingzheng Wei, Eric I-Chao Chang, Yan Xu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 28474-28484

Abstract


Existing fine-tuning methods for computer vision tasks primarily focus on re-weighting the knowledge learned from the source domain during pre-training. They aim to retain beneficial knowledge for the target domain while suppressing unfavorable knowledge. During the pre-training and fine-tuning stages, there is a notable disparity in the data scale. Consequently, it is theoretically necessary to employ a model with reduced complexity to mitigate the potential structural risk. However, our empirical investigation in this paper reveals that models fine-tuned using existing methods still manifest a high level of model complexity inherited from the pre-training stage, leading to suboptimal stability and generalization ability. This phenomenon indicates an issue that has been overlooked in fine-tuning: Structural Risk Minimization. To address this issue caused by data scale disparity during the fine-tuning stage, we propose a simple yet effective approach called Tuning Stable Rank Shrinkage (TSRS). TSRS mitigates the structural risk during the fine-tuning stage by constraining the noise sensitivity of the target model based on stable rank theories. Through extensive experiments, we demonstrate that incorporating TSRS into fine-tuning methods leads to improved generalization ability on various tasks, regardless of whether the neural networks are based on convolution or transformer architectures. Additionally, empirical analysis reveals that TSRS enhances the robustness, convexity, and smoothness of the loss landscapes in fine-tuned models. Code is available at https://github.com/WitGotFlg/TSRS.
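For readers unfamiliar with the quantity the method is named after, the stable rank of a weight matrix W is commonly defined as the squared Frobenius norm divided by the squared spectral norm, ||W||_F^2 / ||W||_2^2. The sketch below only computes that standard quantity with NumPy; it is not the TSRS procedure, which the abstract does not spell out. The function name `stable_rank` and the choice to flatten convolution kernels into a 2-D matrix are assumptions made for illustration.

```python
import numpy as np


def stable_rank(weight: np.ndarray) -> float:
    """Return ||W||_F^2 / ||W||_2^2 for a weight tensor.

    Tensors with more than two dimensions (e.g. conv kernels) are flattened
    to shape (out_channels, -1); this flattening is an illustrative
    assumption, not something taken from the paper.
    """
    matrix = weight.reshape(weight.shape[0], -1)
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    top = singular_values[0]
    if top == 0.0:
        raise ValueError("stable rank is undefined for an all-zero matrix")
    return float(np.sum(singular_values ** 2) / top ** 2)


if __name__ == "__main__":
    # An identity matrix spreads its energy evenly over all directions,
    # while a rank-one matrix concentrates it in a single direction.
    print(stable_rank(np.eye(4)))
    print(stable_rank(np.ones((4, 4))))
```

The stable rank never exceeds the ordinary rank and changes smoothly under small perturbations of the weights, which is why it is often used as a continuous proxy for model complexity.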

Related Material


@InProceedings{Shen_2024_CVPR,
    author    = {Shen, Sicong and Zhou, Yang and Wei, Bingzheng and Chang, Eric I-Chao and Xu, Yan},
    title     = {Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {28474-28484}
}