MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation

Fei Pan, Xu Yin, Seokju Lee, Axi Niu, Sungeui Yoon, In So Kweon; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2649-2658

Abstract


Unsupervised domain adaptation (UDA) has been a potent technique for handling the lack of annotations in the target domain, particularly for the semantic segmentation task. This study introduces a different UDA scenario in which the target domain contains unlabeled video frames. Drawing upon recent advancements in self-supervised learning of object motion from unlabeled videos with geometric constraints, we design a Motion-guided Domain Adaptive semantic segmentation framework (MoDA). MoDA harnesses self-supervised object motion cues to facilitate cross-domain alignment for the segmentation task. First, we present an object discovery module that localizes and segments target moving objects using object motion information. Then, we propose a semantic mining module that takes the object masks to refine the pseudo labels in the target domain. Subsequently, these high-quality pseudo labels are used in the self-training loop to bridge the cross-domain gap. On domain adaptive video and image segmentation experiments, MoDA demonstrates the effectiveness of using object motion as guidance for domain alignment, compared with optical flow information. Moreover, MoDA exhibits versatility, as it can complement existing state-of-the-art UDA approaches.
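The abstract outlines a pipeline in which motion-derived object masks are used to refine target-domain pseudo labels before self-training. The sketch below is a minimal, hypothetical illustration of that refinement step, not the paper's implementation; the function name refine_pseudo_labels, the confidence threshold, and the choice of 255 as the ignore index are assumptions made for illustration.

```python
import numpy as np

def refine_pseudo_labels(pseudo_label, confidence, object_mask,
                         moving_class_id, conf_thresh=0.9, ignore_index=255):
    """Refine per-pixel pseudo labels with a motion-derived object mask.

    pseudo_label:    (H, W) int array of predicted class ids.
    confidence:      (H, W) float array of prediction confidences in [0, 1].
    object_mask:     (H, W) bool array marking pixels of a discovered moving object.
    moving_class_id: class id assumed for the moving object (hypothetical input).
    """
    refined = pseudo_label.copy()
    low_conf = confidence < conf_thresh
    # Drop low-confidence predictions so they do not pollute self-training.
    refined[low_conf] = ignore_index
    # Inside the motion-derived mask, trust the motion cue and keep a label.
    refined[object_mask & low_conf] = moving_class_id
    return refined

# Toy usage on a 4x4 "image" with uniformly low confidence.
labels = np.zeros((4, 4), dtype=np.int64)
conf = np.full((4, 4), 0.5)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
print(refine_pseudo_labels(labels, conf, mask, moving_class_id=13))
```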

Related Material


[pdf] [arXiv]
[bibtex]
@InProceedings{Pan_2024_CVPR,
    author    = {Pan, Fei and Yin, Xu and Lee, Seokju and Niu, Axi and Yoon, Sungeui and Kweon, In So},
    title     = {MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2649-2658}
}