Analyzing Deep Neural Network's Transferability via Fréchet Distance
Transfer learning has become the de facto practice for reusing a deep neural network (DNN) pre-trained with abundant training data on a source task to improve model training on target tasks with smaller-scale training data. In this paper, we first investigate the correlation between a DNN's pre-training performance on the source task and its transfer results on downstream tasks. We find that high performance of a pre-trained model does not necessarily imply high transferability. We then propose a metric, named Fréchet Pre-train Distance, to estimate the transferability of a deep neural network. By applying the proposed Fréchet Pre-train Distance, we are able to identify the optimal pre-trained checkpoint and thereby achieve high transferability on downstream tasks. Finally, we investigate several factors that impact a DNN's transferability, including normalization, network architectures, and learning rates. The results consistently support our conclusions.
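The abstract does not define the proposed metric in detail, but the Fréchet distance between two Gaussians N(μ₁, Σ₁) and N(μ₂, Σ₂) has the closed form d² = ‖μ₁ − μ₂‖² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^{1/2}), the same quantity used in the Fréchet Inception Distance. The sketch below computes this closed form between two sets of feature vectors summarized as Gaussians; treating checkpoint features this way is an illustrative assumption, not the paper's exact procedure, and all names here (`frechet_distance`, the toy feature arrays) are hypothetical.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        # sqrtm can return tiny imaginary components due to numerical noise
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy example: features extracted by two hypothetical checkpoints,
# each summarized by an empirical mean and covariance.
rng = np.random.default_rng(0)
feats_a = rng.normal(0.0, 1.0, size=(1000, 8))
feats_b = rng.normal(0.5, 1.0, size=(1000, 8))
mu_a, cov_a = feats_a.mean(axis=0), np.cov(feats_a, rowvar=False)
mu_b, cov_b = feats_b.mean(axis=0), np.cov(feats_b, rowvar=False)
d = frechet_distance(mu_a, cov_a, mu_b, cov_b)
```

Under this reading, a checkpoint whose feature distribution is closer (in Fréchet distance) to the target-task distribution would be preferred for transfer, which is how one could rank pre-trained checkpoints without fine-tuning each of them.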