Group-Wise Contrastive Bottleneck for Weakly-Supervised Visual Representation Learning

Boon Peng Yap, Beng Koon Ng; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 2246-2255


Coarse or weak labels can serve as a cost-effective solution to the problem of visual representation learning. When fine-grained labels are unavailable, weak labels can still provide a supervisory signal to guide the representation learning process. Examples of weak labels include image captions, visual attributes and coarse-grained object categories. In this work, we consider the semantic grouping relationship that exists within certain types of weak labels and propose a group-wise contrastive bottleneck module to leverage this relationship. A semantic group contains labels that relate to a general concept, such as the colour or shape of objects. Using the group-wise bottleneck module, we disentangle the global image features into multiple group features and apply contrastive learning in a group-wise manner to maximize the similarity of positive pairs within each semantic group. The positive pairs are defined based on the similarity of the labels captured by each group. To learn a more robust representation, we introduce a reconstruction objective in which an image feature is reconstructed from the disentangled features, and this reconstruction is encouraged to be consistent with the feature obtained from a different augmented view of the same image. We empirically verify the efficacy of the proposed method on several datasets in the context of visual attribute learning, fair representation learning and hierarchical label learning. The experimental results indicate that our proposed method outperforms prior weakly-supervised methods and is flexible in adapting to different representation learning settings.
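The pipeline described in the abstract (split a global feature into group features, contrast within each group, reconstruct the global feature) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The linear projections `w_enc` and `w_dec`, the SupCon-style loss form, the exact-match definition of positive pairs, the cosine consistency term, and all function names are assumptions made for illustration only; the abstract does not specify them.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def split_into_groups(feats, w_enc):
    """Hypothetical bottleneck: one linear map per semantic group.
    feats: (N, D) global features; w_enc: (G, D, d) -> returns (G, N, d)."""
    return np.einsum("nd,gdk->gnk", feats, w_enc)

def group_contrastive_loss(group_feats, group_labels, temperature=0.1):
    """SupCon-style loss averaged over groups (assumed form).
    group_feats: (G, N, d); group_labels: (G, N) integer labels per group.
    Assumption: two samples are positives in a group if their labels match."""
    losses = []
    for feats, labels in zip(group_feats, group_labels):
        z = l2_normalize(feats)
        n = len(labels)
        not_self = ~np.eye(n, dtype=bool)
        positives = (labels[:, None] == labels[None, :]) & not_self
        # Log-softmax over all other samples in the batch (self excluded).
        logits = np.where(not_self, z @ z.T / temperature, -np.inf)
        peak = logits.max(axis=1, keepdims=True)
        log_prob = logits - peak - np.log(
            np.exp(logits - peak).sum(axis=1, keepdims=True))
        n_pos = positives.sum(axis=1)
        has_pos = n_pos > 0
        if not has_pos.any():
            continue  # no positive pairs in this group for this batch
        pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
        losses.append(np.mean(-pos_log_prob[has_pos] / n_pos[has_pos]))
    return float(np.mean(losses)) if losses else 0.0

def reconstruction_loss(group_feats, w_dec, target_feats):
    """Rebuild a global feature from concatenated group features and compare
    it (1 - cosine similarity) with the feature of another augmented view.
    group_feats: (G, N, d); w_dec: (G*d, D); target_feats: (N, D)."""
    n = group_feats.shape[1]
    concat = np.transpose(group_feats, (1, 0, 2)).reshape(n, -1)
    recon = l2_normalize(concat @ w_dec)
    target = l2_normalize(target_feats)
    return float(np.mean(1.0 - np.sum(recon * target, axis=1)))

# Toy batch: 8 images, 16-dim features, 2 semantic groups (e.g. colour, shape).
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))                  # features of augmented view A
view_b = view_a + 0.05 * rng.normal(size=(8, 16))  # stand-in for view B
w_enc = rng.normal(size=(2, 16, 4))
w_dec = rng.normal(size=(8, 16))
labels = np.array([[0, 0, 1, 1, 2, 2, 0, 1],       # group 1 labels
                   [0, 1, 0, 1, 0, 1, 0, 1]])      # group 2 labels

groups_a = split_into_groups(view_a, w_enc)
total = (group_contrastive_loss(groups_a, labels)
         + reconstruction_loss(groups_a, w_dec, view_b))
```

In a real training setup the projections would be learned jointly with the backbone, and the relative weighting of the two terms would be a tunable hyperparameter; both are left out here to keep the sketch short.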

Related Material

@InProceedings{Yap_2024_WACV,
    author    = {Yap, Boon Peng and Ng, Beng Koon},
    title     = {Group-Wise Contrastive Bottleneck for Weakly-Supervised Visual Representation Learning},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {2246-2255}
}