Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
Cross-domain weakly supervised object detection aims to adapt object-level knowledge from a fully labeled source domain dataset (i.e. with object bounding boxes) to train object detectors for target domains that are weakly labeled (i.e. with image-level tags). Instead of domain-level distribution matching, as popularly adopted in the literature, we propose to learn pixel-wise cross-domain correspondences for more precise knowledge transfer. It is realized through a novel cross-domain co-attention scheme trained as region competition. In this scheme, the cross-domain correspondence module seeks for informative features on the target domain image, which after being warped to the source domain image, could best explain its annotations. Meanwhile, a collaborative mask generator competes to mask out the relevant target image region to make the remaining features uninformative. Such competitive learning strives to correlate the full foreground in cross-domain image pairs, revealing the accurate object extent in target domain. To alleviate the ambiguity of inter-domain correspondence learning, a domain-cycle consistency regularizer is futher proposed to leverage the more reliable intra-domain correspondence. The proposed approach achieves consistent improvements over existing approaches by a considerable margin, demonstrated by the experiments on various datasets.