Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 5828-5839

Abstract


The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming to improve a model's generalizability, performance, and training/inference memory footprint. These benefits become all the more indispensable when training for vision-related dense prediction tasks. In this work, we tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM), which facilitates effective feature sharing along each channel between the two tasks, leading to mutual performance gain with a negligible increase in trainable parameters. In a symbiotic spirit, we also formulate two novel data augmentations: AffineMix, which uses predicted depth to augment the semantic segmentation task, and ColorAug, which uses predicted semantics to augment the depth estimation task. Finally, we validate the performance gain of the proposed method on the Cityscapes and ScanNet datasets, achieving state-of-the-art results for a semi-supervised joint model based on depth estimation and semantic segmentation.

Related Material


[bibtex]
@InProceedings{Bansal_2023_WACV,
    author    = {Bansal, Nitin and Ji, Pan and Yuan, Junsong and Xu, Yi},
    title     = {Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {5828-5839}
}