AdaMT-Net: An Adaptive Weight Learning Based Multi-Task Learning Model for Scene Understanding

Ankit Jha, Awanish Kumar, Biplab Banerjee, Subhasis Chaudhuri; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 706-707

Abstract


We tackle the problem of deep end-to-end multi-task learning (MTL) for visual scene understanding from monocular images. It is well established that jointly learning several related tasks yields better per-task performance than training them independently, because related tasks share important feature properties that MTL techniques can exploit for improved joint training. Accordingly, we judiciously separate the task-specific feature learning stage from a learnable task-generic feature space. To this end, we propose AdaMT-Net, a U-Net style encoder-decoder architecture in which a densely connected deep convolutional neural network (CNN) encoder is shared among the tasks, while soft-attention based task-specific decoder modules produce the desired outputs. A major issue in MTL is selecting the weights of the task-specific loss terms in the final optimization objective. Instead of manual weight selection, we propose a novel adaptive weight learning strategy that carefully examines the per-task loss gradients across training iterations. We validate AdaMT-Net on the challenging CityScapes, NYUv2, and ISPRS datasets, where it consistently improves performance.
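The adaptive weighting idea described above can be illustrated with a minimal sketch. This is not the authors' exact formulation: the `adaptive_task_weights` helper and its inverse-gradient-norm rule are illustrative assumptions, standing in for the general principle of damping tasks whose loss gradients currently dominate so that no single task overwhelms the shared encoder.

```python
def adaptive_task_weights(grad_norms, eps=1e-8):
    """Sketch of gradient-driven loss weighting (assumed rule, not the paper's).

    Given the L2 norm of each task's loss gradient w.r.t. the shared
    encoder parameters, assign each task a weight inversely proportional
    to its gradient norm, so dominant tasks are damped.
    """
    # Inverse-proportional rule: larger gradient norm -> smaller weight.
    inv = [1.0 / (g + eps) for g in grad_norms]
    total = sum(inv)
    n = len(grad_norms)
    # Normalize so the weights sum to the number of tasks, keeping the
    # overall loss scale comparable to uniform (all-ones) weighting.
    return [n * v / total for v in inv]


# Hypothetical usage inside a training iteration: combine per-task losses
# (e.g. segmentation, depth, normals) with the adaptive weights.
grad_norms = [1.0, 2.0, 4.0]          # measured per-task gradient norms
weights = adaptive_task_weights(grad_norms)
task_losses = [0.9, 0.5, 0.3]          # placeholder per-task loss values
total_loss = sum(w * l for w, l in zip(weights, task_losses))
```

The normalization choice (weights summing to the task count) simply keeps the combined loss on the same scale as an unweighted sum; other normalizations, such as a softmax over negative gradient norms, would serve the same purpose.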

Related Material


[bibtex]
@InProceedings{Jha_2020_CVPR_Workshops,
author = {Jha, Ankit and Kumar, Awanish and Banerjee, Biplab and Chaudhuri, Subhasis},
title = {AdaMT-Net: An Adaptive Weight Learning Based Multi-Task Learning Model for Scene Understanding},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}