KEM: SGW-based Multi-Task Learning in Vision Tasks

Ruiyuan Zhang, Yuyao Chen, Jiaxiang Liu, Dianbing Xi, Yuchi Huo, Jie Liu, Chao Wu; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 1688-1705

Abstract


Multi-task learning (MTL) is a multi-objective optimization problem: within MTL, neural networks attempt to realize every objective through a shared representation space. However, as datasets grow and tasks become more complex, knowledge sharing becomes increasingly challenging. In this paper, we first re-examine previous cross-attention MTL methods from the perspective of noise. We analyze this issue theoretically and identify it as a flaw in the cross-attention mechanism. To address it, we propose an information bottleneck knowledge extraction module (KEM). This module aims to reduce inter-task interference by constraining the flow of information, thereby also reducing computational complexity. Furthermore, we employ neural collapse to stabilize the knowledge-selection process: before feeding features into KEM, we project them into an ETF (equiangular tight frame) space. This mapping makes our method more robust. We implement our method and conduct comparative experiments on multiple datasets. The results demonstrate that our approach significantly outperforms existing methods in multi-task learning. All code will be made publicly available upon the paper's acceptance.
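The abstract does not give the details of the ETF projection, but the simplex ETF construction is standard in the neural-collapse literature. Below is a minimal, hypothetical sketch of what "projecting features into ETF space" could look like; the function names and the cosine-similarity projection are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def simplex_etf(num_classes, feat_dim, seed=0):
    """Build a simplex ETF of K vertices in feat_dim dimensions (feat_dim >= K).

    Standard neural-collapse construction:
        M = sqrt(K / (K - 1)) * U @ (I_K - (1/K) * 1 1^T)
    where U has orthonormal columns. Columns of M are unit-norm with
    pairwise inner product -1/(K - 1).
    """
    rng = np.random.default_rng(seed)
    # Orthonormal basis U (feat_dim x K) via QR of a random Gaussian matrix.
    U, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))
    K = num_classes
    centering = np.eye(K) - np.ones((K, K)) / K
    return np.sqrt(K / (K - 1)) * U @ centering  # columns = ETF vertices

def project_to_etf(features, M):
    """Hypothetical projection: cosine similarity of each feature to each vertex."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return f @ M  # (N, feat_dim) @ (feat_dim, K) -> (N, K)
```

Because the ETF directions are fixed, maximally separated, and equiangular, coordinates in this frame are insensitive to within-class feature drift, which is one plausible reading of why the paper reports the mapping stabilizes knowledge selection.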

Related Material


[bibtex]
@InProceedings{Zhang_2024_ACCV,
    author    = {Zhang, Ruiyuan and Chen, Yuyao and Liu, Jiaxiang and Xi, Dianbing and Huo, Yuchi and Liu, Jie and Wu, Chao},
    title     = {KEM: SGW-based Multi-Task Learning in Vision Tasks},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {1688-1705}
}