Online Knowledge Distillation for Multi-Task Learning

Geethu Miriam Jacob, Vishal Agarwal, Björn Stenger; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 2359-2368

Abstract


Multi-task learning (MTL) has found wide application in computer vision tasks. It uses a common backbone network that allows shared feature computation for different tasks such as semantic segmentation, depth estimation, and surface normal estimation. In many cases, negative transfer, i.e., impaired performance in the target domain, causes MTL accuracy to fall below that of simply training the corresponding single-task networks. To mitigate this issue, we propose an online knowledge distillation method for MTL, in which single-task networks are trained simultaneously with the MTL network and guide its optimization. We propose selectively training layers for each task using an adaptive feature distillation (AFD) loss with an online task weighting (OTW) scheme. This task-wise feature distillation enables the MTL network to be trained in a manner similar to the single-task networks. On the NYUv2 and Cityscapes datasets we show improvements over a baseline MTL model of 6.22% and 9.19%, respectively, and better performance than recent MTL methods. We validate our design choices, including the online task weighting and the adaptive feature distillation loss, in ablative experiments.
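To illustrate the training setup described in the abstract, the following PyTorch-style sketch shows the general idea of training single-task networks alongside a shared-backbone MTL network and adding a feature distillation term that pulls the MTL backbone features towards each single-task network's features. The toy networks, the hypothetical task set, the fixed weights, and the plain L2 feature loss are simplifying assumptions for illustration; they do not reproduce the paper's exact AFD loss or OTW scheme.

```python
# Minimal sketch: online knowledge distillation for multi-task learning.
# Assumptions (not from the paper): toy conv backbone, two hypothetical tasks,
# fixed scalar weights in place of adaptive/online weighting, L2 feature loss.

import torch
import torch.nn as nn
import torch.nn.functional as F


def make_backbone():
    # Toy convolutional encoder standing in for a real backbone (e.g. ResNet).
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    )


class SingleTaskNet(nn.Module):
    def __init__(self, out_channels):
        super().__init__()
        self.backbone = make_backbone()
        self.head = nn.Conv2d(32, out_channels, 1)

    def forward(self, x):
        feat = self.backbone(x)
        return feat, self.head(feat)


class MultiTaskNet(nn.Module):
    def __init__(self, task_channels):
        super().__init__()
        self.backbone = make_backbone()  # shared feature computation
        self.heads = nn.ModuleDict(
            {t: nn.Conv2d(32, c, 1) for t, c in task_channels.items()}
        )

    def forward(self, x):
        feat = self.backbone(x)
        return feat, {t: head(feat) for t, head in self.heads.items()}


tasks = {"segmentation": 13, "depth": 1}       # hypothetical task set
single_nets = {t: SingleTaskNet(c) for t, c in tasks.items()}
mtl_net = MultiTaskNet(tasks)

params = list(mtl_net.parameters())
for net in single_nets.values():
    params += list(net.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)

task_weights = {t: 1.0 for t in tasks}         # placeholder for online task weighting
distill_weight = 0.5                            # placeholder for adaptive feature weighting

# One illustrative training step on random data.
x = torch.randn(2, 3, 64, 64)
targets = {"segmentation": torch.randint(0, 13, (2, 64, 64)),
           "depth": torch.randn(2, 1, 64, 64)}

optimizer.zero_grad()
mtl_feat, mtl_preds = mtl_net(x)

loss = 0.0
for t in tasks:
    st_feat, st_pred = single_nets[t](x)
    # Supervised losses for both the single-task and the MTL predictions.
    if t == "segmentation":
        task_loss = (F.cross_entropy(mtl_preds[t], targets[t]) +
                     F.cross_entropy(st_pred, targets[t]))
    else:
        task_loss = (F.l1_loss(mtl_preds[t], targets[t]) +
                     F.l1_loss(st_pred, targets[t]))
    # Feature distillation: the MTL backbone imitates the single-task features.
    # Detaching makes the single-task network act as an online teacher.
    distill_loss = F.mse_loss(mtl_feat, st_feat.detach())
    loss = loss + task_weights[t] * (task_loss + distill_weight * distill_loss)

loss.backward()
optimizer.step()
```

In the paper, the fixed `task_weights` and `distill_weight` placeholders above are replaced by the online task weighting scheme and the adaptive, layer-wise feature distillation loss, respectively.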

Related Material


[bibtex]
@InProceedings{Jacob_2023_WACV,
    author    = {Jacob, Geethu Miriam and Agarwal, Vishal and Stenger, Bj\"orn},
    title     = {Online Knowledge Distillation for Multi-Task Learning},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {2359-2368}
}