Student-Teacher Oneness: A Storage-Efficient Approach That Improves Facial Expression Recognition

Zhenzhu Zheng, Christopher Rasmussen, Xi Peng; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 4077-4086

Abstract


We present Student-Teacher Oneness (STO), a simple but effective approach to online knowledge distillation that improves facial expression recognition without introducing any extra model parameters. Stochastic sub-networks replace the multi-branch architecture component found in current online distillation methods, yielding a simplified architecture with competitive performance. Under the "teacher-student" framework, we construct both teacher and student within the same target network. The student network is a sub-network formed by randomly skipping some portions of the full (target) network; the teacher network is the full network, which can be considered the ensemble of all possible student networks. Training proceeds in a closed loop: (1) forward prediction consists of two passes that generate the student and teacher predictions; (2) backward distillation transfers knowledge from the teacher back to the students. Comprehensive evaluations show that STO improves the generalization ability of a variety of deep neural networks by a significant margin. The results demonstrate superior performance on the facial expression recognition task on FER-2013 and RAF.
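The abstract does not include code, but the closed-loop training it describes can be illustrated with a minimal PyTorch sketch. The sketch below assumes stochastic-depth-style block skipping for the student pass, a detached full-network teacher pass, and a standard temperature-smoothed KL distillation loss combined with cross-entropy; all names (StochasticBlock, STONet, sto_step, keep_prob, T, alpha) and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticBlock(nn.Module):
    """Residual block that may be randomly skipped during the student pass."""
    def __init__(self, channels, keep_prob=0.8):
        super().__init__()
        self.keep_prob = keep_prob  # assumed skip rate, per batch for simplicity
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, stochastic=False):
        # Student pass: skip this block with probability 1 - keep_prob.
        if stochastic and torch.rand(1).item() > self.keep_prob:
            return x
        return x + self.body(x)

class STONet(nn.Module):
    """Target network; student = stochastic sub-network, teacher = full network."""
    def __init__(self, num_classes=7, channels=32, depth=8):
        super().__init__()
        self.stem = nn.Conv2d(1, channels, 3, padding=1)  # grayscale input, as in FER-2013
        self.blocks = nn.ModuleList(StochasticBlock(channels) for _ in range(depth))
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x, stochastic=False):
        x = self.stem(x)
        for block in self.blocks:
            x = block(x, stochastic=stochastic)
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)

def sto_step(model, images, labels, optimizer, T=3.0, alpha=0.5):
    """One closed-loop step: two forward passes, then backward distillation."""
    optimizer.zero_grad()
    student_logits = model(images, stochastic=True)       # pass 1: random sub-network
    with torch.no_grad():
        teacher_logits = model(images, stochastic=False)  # pass 2: full network (ensemble of students)
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    loss = (1 - alpha) * ce + alpha * kd                  # distill teacher back into the student
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because teacher and student share all weights of the single target network, this loop stores no extra model parameters, matching the storage-efficiency claim; only the forward computation is doubled per step.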

Related Material


[pdf]
[bibtex]
@InProceedings{Zheng_2021_ICCV,
    author    = {Zheng, Zhenzhu and Rasmussen, Christopher and Peng, Xi},
    title     = {Student-Teacher Oneness: A Storage-Efficient Approach That Improves Facial Expression Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {4077-4086}
}