Distortion-Disentangled Contrastive Learning

Jinfeng Wang, Sifan Song, Jionglong Su, S. Kevin Zhou; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 75-85

Abstract


Self-supervised learning is well known for its remarkable performance in representation learning and various downstream computer vision tasks. Recently, Positive-pair-Only Contrastive Learning (POCL) has achieved reliable performance without the need to construct positive-negative training sets. It reduces memory requirements by lessening the dependency on the batch size. The POCL method typically uses a single objective function to extract the distortion invariant representation (DIR), which describes the proximity of positive-pair representations affected by different distortions. This objective function implicitly enables the model to filter out or ignore the distortion variant representation (DVR) affected by different distortions. However, some recent studies have shown that proper use of DVR in contrastive can optimize the performance of models in some downstream domain-specific tasks. In addition, these POCL methods have been observed to be sensitive to augmentation strategies. To address these limitations, we propose a novel POCL framework named Distortion-Disentangled Contrastive Learning (DDCL) and a Distortion-Disentangled Loss (DDL). Our approach is the first to explicitly and adaptively disentangle and exploit the DVR inside the model and feature stream to improve the overall representation utilization efficiency, robustness, and representation ability. Experiments demonstrate our framework's superiority to Barlow Twins and Simsiam in terms of convergence, representation quality (Including transferability and generality), and robustness on several benchmark datasets.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Wang_2024_WACV, author = {Wang, Jinfeng and Song, Sifan and Su, Jionglong and Zhou, S. Kevin}, title = {Distortion-Disentangled Contrastive Learning}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2024}, pages = {75-85} }