Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers

Grigore, Diana-Nicoleta; Georgescu, Mariana-Iuliana; Justo, Jon Alvarez; Johansen, Tor; Ionescu, Andreea Iuliana; Ionescu, Radu Tudor

Diana-Nicoleta Grigore, Mariana-Iuliana Georgescu, Jon Alvarez Justo, Tor Johansen, Andreea Iuliana Ionescu, Radu Tudor Ionescu; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 7368-7378

Abstract

Few-shot knowledge distillation recently emerged as a viable approach to harness the knowledge of large-scale pre-trained models, using limited data and computational resources. In this paper, we propose a novel few-shot feature distillation approach for vision transformers. Our approach is based on two key steps. Leveraging the fact that vision transformers have a consistent depth-wise structure, we first copy the weights from intermittent layers of existing pre-trained vision transformers (teachers) into shallower architectures (students), where the intermittence factor controls the complexity of the student transformer with respect to its teacher. Next, we employ an enhanced version of Low-Rank Adaptation (LoRA) to distill knowledge into the student in a few-shot scenario, aiming to recover the information processing carried out by the skipped teacher layers. We present comprehensive experiments with supervised and self-supervised transformers as teachers, on six data sets from various domains (natural, medical and satellite images) and tasks (classification and segmentation). The empirical results confirm the superiority of our approach over state-of-the-art competitors. Moreover, the ablation results demonstrate the usefulness of each component of the proposed pipeline. We release our code at https://github.com/dianagrigore/WeCoLoRA.
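The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the released code for that). The function names, the choice to start copying from block 0, and the use of the standard LoRA update W + (alpha / r) * B A instead of the paper's enhanced variant are all assumptions made for illustration.

```python
import random


def select_teacher_layers(num_teacher_layers, skip_factor):
    """Indices of the teacher blocks whose weights are copied into the student.

    Assumption: every `skip_factor`-th block is kept, starting from block 0.
    """
    return list(range(0, num_teacher_layers, skip_factor))


def matmul(a, b):
    """Plain-Python matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def init_lora(d_out, d_in, rank, seed=0):
    """Standard LoRA initialisation: A is small random, B is zero."""
    rng = random.Random(seed)
    lora_a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    lora_b = [[0.0] * rank for _ in range(d_out)]
    return lora_b, lora_a


def lora_effective_weight(w, lora_b, lora_a, alpha):
    """Effective weight W + (alpha / r) * B @ A.

    W holds the copied (frozen) teacher weights; only A and B would be trained.
    This is plain LoRA, not the enhanced version proposed in the paper.
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# A 12-block teacher with a skip factor of 3 gives a 4-block student.
kept = select_teacher_layers(12, 3)  # [0, 3, 6, 9]
```

Because B starts at zero, the student initially computes exactly what the copied teacher weights compute. Distillation then only has to learn the low-rank correction that compensates for the skipped blocks.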

Related Material

[bibtex]

@InProceedings{Grigore_2025_WACV,
    author    = {Grigore, Diana-Nicoleta and Georgescu, Mariana-Iuliana and Justo, Jon Alvarez and Johansen, Tor and Ionescu, Andreea Iuliana and Ionescu, Radu Tudor},
    title     = {Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {7368-7378}
}