Self-Supervised Pretraining Improves Self-Supervised Pretraining

Colorado J Reed, Xiangyu Yue, Ani Nrusimha, Sayna Ebrahimi, Vivek Vijaykumar, Richard Mao, Bo Li, Shanghang Zhang, Devin Guillory, Sean Metzger, Kurt Keutzer, Trevor Darrell; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 2584-2594


While self-supervised pretraining has proven beneficial for many computer vision tasks, it requires expensive and lengthy computation and large amounts of data, and it is sensitive to data augmentation. Prior work demonstrates that models pretrained on datasets dissimilar to their target data, such as chest X-ray models trained on ImageNet, underperform models trained from scratch. Users who lack the resources to pretrain must use existing models with lower performance. This paper explores Hierarchical PreTraining (HPT), which decreases convergence time and improves accuracy by initializing the pretraining process with an existing pretrained model. Through experimentation on 16 diverse vision datasets, we show that HPT converges up to 80x faster, improves accuracy across tasks, and improves the robustness of the self-supervised pretraining process to changes in the image augmentation policy or the amount of pretraining data. Taken together, these results show that HPT provides a simple framework for obtaining better pretrained representations with fewer computational resources.
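
The core idea in the abstract, starting pretraining from an existing pretrained model instead of from scratch, can be illustrated with a toy sketch. This is not the paper's implementation: the one-parameter "model", the quadratic loss, the function name `pretrain`, and the constants `SCRATCH_INIT`, `EXISTING_MODEL`, and `TARGET_OPTIMUM` are all hypothetical and chosen only to show why a nearby initialization can reach convergence in fewer steps.

```python
from typing import Tuple


def pretrain(w_init: float, target_optimum: float, lr: float = 0.1,
             tol: float = 1e-3, max_steps: int = 10_000) -> Tuple[float, int]:
    """Gradient descent on the toy loss (w - target_optimum)**2.

    Returns the final weight and the number of update steps taken
    before the weight is within `tol` of the optimum.
    """
    w = w_init
    for step in range(max_steps):
        if abs(w - target_optimum) < tol:
            return w, step
        # Gradient of (w - t)^2 with respect to w is 2 * (w - t).
        w -= lr * 2.0 * (w - target_optimum)
    return w, max_steps


# Hypothetical values: a from-scratch initialization, an existing model
# whose weight already lies near the target optimum, and the optimum
# of the target-data objective.
SCRATCH_INIT = 0.0
EXISTING_MODEL = 9.0
TARGET_OPTIMUM = 10.0

scratch_w, scratch_steps = pretrain(SCRATCH_INIT, TARGET_OPTIMUM)
hpt_w, hpt_steps = pretrain(EXISTING_MODEL, TARGET_OPTIMUM)

print(f"from scratch:        {scratch_steps} steps")
print(f"from existing model: {hpt_steps} steps")
```

In this toy setting the run initialized from the existing model needs fewer steps only because its starting point is assumed to be closer to the target optimum; the sketch says nothing about the size of the speedup on real vision datasets.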

Related Material

@InProceedings{Reed_2022_WACV,
    author    = {Reed, Colorado J and Yue, Xiangyu and Nrusimha, Ani and Ebrahimi, Sayna and Vijaykumar, Vivek and Mao, Richard and Li, Bo and Zhang, Shanghang and Guillory, Devin and Metzger, Sean and Keutzer, Kurt and Darrell, Trevor},
    title     = {Self-Supervised Pretraining Improves Self-Supervised Pretraining},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2022},
    pages     = {2584-2594}
}