Quality Diversity for Visual Pre-Training

Ruchika Chavhan, Henry Gouk, Da Li, Timothy Hospedales; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 5384-5394


Models pre-trained on large datasets such as ImageNet provide the de-facto standard for transfer learning, with both supervised and self-supervised approaches proving effective. However, emerging evidence suggests that any single pre-trained feature will not perform well on diverse downstream tasks. Each pre-training strategy encodes a certain inductive bias, which may suit some downstream tasks but not others. Notably, the augmentations used in both supervised and self-supervised training lead to features with high invariance to spatial and appearance transformations. This renders them sub-optimal for tasks that demand sensitivity to these factors. In this paper we develop a feature that better supports diverse downstream tasks by providing a diverse set of sensitivities and invariances. In particular, we are inspired by Quality-Diversity in evolution, to define a pre-training objective that requires high quality yet diverse features -- where diversity is defined in terms of transformation (in)variances. Our framework plugs in to both supervised and self-supervised pre-training, and produces a small ensemble of features. We further show how downstream tasks can easily and efficiently select their preferred (in)variances. Both empirical and theoretical analysis show the efficacy of our representation and transfer learning approach for diverse downstream tasks.

Related Material

[pdf] [supp]
@InProceedings{Chavhan_2023_ICCV, author = {Chavhan, Ruchika and Gouk, Henry and Li, Da and Hospedales, Timothy}, title = {Quality Diversity for Visual Pre-Training}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {5384-5394} }