Probabilistic Knowledge Distillation of Face Ensembles

Xu, Jianqing; Li, Shen; Deng, Ailin; Xiong, Miao; Wu, Jiaying; Wu, Jiaxiang; Ding, Shouhong; Hooi, Bryan

Jianqing Xu, Shen Li, Ailin Deng, Miao Xiong, Jiaying Wu, Jiaxiang Wu, Shouhong Ding, Bryan Hooi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 3489-3498

Abstract

Mean ensemble (i.e., averaging predictions from multiple models) is a commonly used technique in machine learning that improves on the performance of each individual model. We formalize it as feature alignment for ensembles in open-set face recognition and generalize it into Bayesian Ensemble Averaging (BEA) through the lens of probabilistic modeling. This generalization brings two practical benefits that existing methods could not provide: (1) the uncertainty of a face image can be evaluated and further decomposed into aleatoric uncertainty and epistemic uncertainty, the latter of which can be used as a measure for out-of-distribution detection of faceness; (2) a BEA statistic provably reflects the aleatoric uncertainty of a face image, acting as a measure of face image quality to improve recognition performance. To inherit the uncertainty estimation capability of BEA without loss of inference efficiency, we propose BEA-KD, a student model that distills knowledge from BEA. BEA-KD mimics the overall behavior of ensemble members and consistently outperforms state-of-the-art (SOTA) knowledge distillation methods on various challenging benchmarks.
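
As a rough illustration of the plain mean ensemble that the abstract takes as its starting point, the sketch below averages face embeddings produced by several models. It is not the paper's BEA or BEA-KD: the function names, array shapes, and the "agreement" score (the length of the averaged unit vectors) are assumptions made for this example, and the score is only a generic heuristic for how closely the members agree, not the uncertainty estimate described above.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def mean_ensemble(member_embeddings):
    """Fuse per-model embeddings by averaging (hypothetical helper).

    member_embeddings: array of shape (n_members, n_images, dim), one
    embedding per ensemble member and image.
    Returns the fused unit-length embeddings, shape (n_images, dim), and a
    per-image agreement score in [0, 1], shape (n_images,).
    """
    unit = l2_normalize(np.asarray(member_embeddings, dtype=float))
    mean = unit.mean(axis=0)                    # average over ensemble members
    agreement = np.linalg.norm(mean, axis=-1)   # 1.0 when all members coincide
    return l2_normalize(mean), agreement


# Synthetic stand-in for three models embedding the same four images.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 8))
members = np.stack(
    [base + 0.01 * rng.normal(size=base.shape) for _ in range(3)]
)
fused, agreement = mean_ensemble(members)
```

The fused embeddings can then be compared with cosine similarity as usual. Running every ensemble member at inference time is the cost that motivates distilling the ensemble into a single student model.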

Related Material

[bibtex]

@InProceedings{Xu_2023_CVPR,
    author    = {Xu, Jianqing and Li, Shen and Deng, Ailin and Xiong, Miao and Wu, Jiaying and Wu, Jiaxiang and Ding, Shouhong and Hooi, Bryan},
    title     = {Probabilistic Knowledge Distillation of Face Ensembles},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {3489-3498}
}