-
[pdf]
[supp]
[arXiv]
[bibtex]@InProceedings{Yang_2024_CVPR, author = {Yang, Senqiao and Tian, Zhuotao and Jiang, Li and Jia, Jiaya}, title = {Unified Language-driven Zero-shot Domain Adaptation}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {23407-23415} }
Unified Language-driven Zero-shot Domain Adaptation
Abstract
This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA) a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge. We identify the constraints in the existing language-driven zero-shot domain adaptation task particularly the requirement for domain IDs and domain-specific models which may restrict flexibility and scalability. To overcome these issues we propose a new framework for ULDA consisting of Hierarchical Context Alignment (HCA) Domain Consistent Representation Learning (DCRL) and Text-Driven Rectifier (TDR). These components work synergistically to align simulated features with target text across multiple visual levels retain semantic correlations between different regional representations and rectify biases between simulated and real target visual features respectively. Our extensive empirical evaluations demonstrate that this framework achieves competitive performance in both settings surpassing even the model that requires domain-ID showcasing its superiority and generalization ability. The proposed method is not only effective but also maintains practicality and efficiency as it does not introduce additional computational costs during inference. The code is available on the project website.
Related Material