VIFA: An Efficient Visible and Infrared Image Fusion Architecture for Multi-task Applications via Continual Learning

Jiaxing Shi, Ao Ren, Wei Zhuang, Yang Hua, ZhiYong Qin, Zhenyu Wang, Yang Song, Yujuan Tan, Duo Liu; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 2872-2888

Abstract


Visible-infrared image fusion has attracted great attention across a range of computer vision applications. To improve task-specific performance, recent studies employ a cascading approach in which the fusion network is trained with feedback from a specific downstream task network. However, this training strategy causes the fusion network to overfit to that task, and deploying a separate fusion network for each downstream task is inefficient in multi-task scenarios. To address this challenge, we propose VIFA, a visible-infrared image fusion architecture for multi-task applications. VIFA mitigates catastrophic forgetting by partitioning the fusion network into a knowledge-sharing backbone and task-specific components. To facilitate knowledge sharing, we introduce a key channel-constrained distillation strategy that identifies and retains informative features while allowing non-critical channels to learn new knowledge. In addition, we propose a reference model-guided distillation to compress the task-specific components while maintaining model performance. Evaluations on multiple representative fusion networks show that VIFA significantly improves both task performance and inference speed.
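The abstract does not specify how key channels are selected or constrained; the sketch below is only a minimal illustration of the general idea behind channel-constrained distillation, under our own assumptions (function names and the importance measure are hypothetical): rank channels by their mean absolute activation in the frozen reference features, then penalize the new model's drift on the top-ranked channels only, leaving the remaining channels free to adapt to new tasks.

```python
def channel_importance(features):
    # Hypothetical importance score: mean absolute activation per channel.
    # features: list of channels, each a list of activation values.
    return [sum(abs(v) for v in ch) / len(ch) for ch in features]

def key_channel_distillation_loss(old_feats, new_feats, keep_ratio=0.5):
    # Rank channels by importance computed on the frozen (old) features,
    # keep the top fraction as "key" channels.
    imp = channel_importance(old_feats)
    k = max(1, int(len(imp) * keep_ratio))
    key_idx = sorted(range(len(imp)), key=lambda i: imp[i], reverse=True)[:k]
    # Mean-squared distillation penalty on key channels only;
    # non-key channels incur no penalty and may learn new knowledge.
    loss = 0.0
    for i in key_idx:
        loss += sum((o - n) ** 2
                    for o, n in zip(old_feats[i], new_feats[i])) / len(old_feats[i])
    return loss / k
```

In a real continual-learning setup this loss term would be added to the new task's training objective, so gradients push the backbone to preserve key-channel responses while the rest of the network adapts.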

Related Material


[bibtex]
@InProceedings{Shi_2024_ACCV,
    author    = {Shi, Jiaxing and Ren, Ao and Zhuang, Wei and Hua, Yang and Qin, ZhiYong and Wang, Zhenyu and Song, Yang and Tan, Yujuan and Liu, Duo},
    title     = {VIFA: An Efficient Visible and Infrared Image Fusion Architecture for Multi-task Applications via Continual Learning},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {2872-2888}
}