Comparative Knowledge Distillation

Alex Tianyi Xu, Alex Wilf, Paul Pu Liang, Alexander Obolenskiy, Daniel Fried, Louis-Philippe Morency; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 7690-7699

Abstract


In the era of large-scale pretrained models, Knowledge Distillation (KD) serves an important role in transferring the wisdom of computationally heavy teacher models to lightweight, efficient student models while preserving performance. Yet KD settings often assume readily available access to teacher models capable of performing many inferences -- a notion increasingly at odds with the realities of costly large-scale models. Addressing this gap, we study an important question: how do KD algorithms fare as the number of teacher inferences decreases, a setting we term Reduced-Teacher-Inference Knowledge Distillation (RTI-KD)? We observe that the performance of prevalent KD techniques and state-of-the-art data augmentation strategies suffers considerably as the number of teacher inferences is reduced. One class of approaches, termed "relational" knowledge distillation, underperforms the rest, yet we hypothesize that it holds promise for reduced dependency on teacher models because it can augment the effective dataset size without additional teacher calls. We find that a simple change -- performing high-dimensional comparisons instead of low-dimensional relations, which we term "Comparative Knowledge Distillation" -- vaults performance well above existing KD approaches. We perform empirical evaluation across varied experimental settings and rigorous analysis to understand the learning outcomes of our method. All code is made publicly available.
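
The core distinction the abstract draws -- high-dimensional comparisons versus low-dimensional relations -- can be illustrated with a short sketch. The PyTorch snippet below is a minimal reading of that idea, assuming the "comparison" is the elementwise difference between paired sample representations; the function names, the all-pairs pairing scheme, and the MSE / smooth-L1 loss choices are illustrative assumptions, not the authors' exact formulation (see the paper for that).

```python
# Hedged sketch: a high-dimensional comparative objective vs. a
# low-dimensional relational one, assuming cached teacher features.
import torch
import torch.nn.functional as F

def ckd_loss(teacher_feats: torch.Tensor, student_feats: torch.Tensor) -> torch.Tensor:
    """Comparative-style loss (illustrative): match the *vector* difference
    between every pair of samples' representations across teacher and
    student. Inputs are (n, dim) features for the same batch."""
    # diff[i, j] = feat[i] - feat[j], shape (n, n, dim): each pair keeps
    # a full dim-dimensional comparison rather than a single scalar.
    t_diff = teacher_feats.unsqueeze(1) - teacher_feats.unsqueeze(0)
    s_diff = student_feats.unsqueeze(1) - student_feats.unsqueeze(0)
    return F.mse_loss(s_diff, t_diff)

def relational_loss(teacher_feats: torch.Tensor, student_feats: torch.Tensor) -> torch.Tensor:
    """Relational-style baseline (illustrative, in the spirit of relational
    KD): reduce each pair to one scalar distance before matching -- the
    "low-dimensional relations" the abstract refers to."""
    t_dist = torch.cdist(teacher_feats, teacher_feats)  # (n, n) scalars
    s_dist = torch.cdist(student_feats, student_feats)
    return F.smooth_l1_loss(s_dist, t_dist)
```

Under this reading, every pair drawn from n cached teacher outputs yields a training signal, so n teacher inferences supply on the order of n^2 comparisons -- one way to interpret the abstract's claim that such objectives can augment the effective dataset size without additional teacher calls.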

Related Material


[pdf] [supp] [arXiv]
@InProceedings{Xu_2025_WACV,
    author    = {Xu, Alex Tianyi and Wilf, Alex and Liang, Paul Pu and Obolenskiy, Alexander and Fried, Daniel and Morency, Louis-Philippe},
    title     = {Comparative Knowledge Distillation},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {7690-7699}
}