@InProceedings{Zhang_2026_CVPR,
    author    = {Zhang, Bo and Xu, Xinan and Yan, Shuo and Bai, Yu and Zhang, Zheng and Wang, Wufan and Gao, Hui and Wang, Wendong},
    title     = {Contrastive Cross-Bag Augmentation for Multiple Instance Learning-based Whole Slide Image Classification},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {21089-21098}
}
Contrastive Cross-Bag Augmentation for Multiple Instance Learning-based Whole Slide Image Classification
Abstract
Recent pseudo-bag augmentation methods for Multiple Instance Learning (MIL)-based Whole Slide Image (WSI) classification sample instances from a limited number of bags, resulting in constrained diversity. To address this issue, we propose Contrastive Cross-Bag Augmentation (C2Aug) to sample instances from all bags with the same class to increase the diversity of pseudo-bags. However, introducing new instances into the pseudo-bag increases the number of critical instances (e.g., tumor instances). This increase results in a reduced occurrence of pseudo-bags containing few critical instances, thereby limiting model performance, particularly on test slides with small tumor areas. To address this, we introduce a bag-level and group-level contrastive learning framework to enhance the discrimination of features with distinct semantic meanings, thereby improving model performance. C2Aug samples instances with consistent bag-level labels across the entire dataset, which simultaneously enhances the diversity of pseudo-bags and mitigates label noise. Experimental results demonstrate that C2Aug consistently outperforms state-of-the-art approaches across multiple evaluation metrics. Our code is publicly available at: https://github.com/weiaicunzai/mixup.
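The cross-bag sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it only shows one plausible way to build a pseudo-bag by pooling instances from every bag that shares a bag-level label. The function name `cross_bag_pseudo_bag`, its parameters, and the list-of-feature-vectors data layout are assumptions made for illustration, and the bag-level and group-level contrastive objectives are not covered here.

```python
import random


def cross_bag_pseudo_bag(bags, labels, target_label, n_instances, rng):
    """Build one pseudo-bag from all bags that carry target_label.

    Hypothetical helper for illustration only.
    bags:   list of bags, each a list of instance feature vectors
    labels: list of bag-level labels, one per bag
    rng:    a random.Random instance, passed in for reproducibility
    """
    # Pool the instances of every bag whose bag-level label matches,
    # rather than drawing from only one or two source bags.
    pool = [
        instance
        for bag, label in zip(bags, labels)
        if label == target_label
        for instance in bag
    ]
    if not pool:
        raise ValueError(f"no bags with label {target_label!r}")

    # Sample without replacement, capped at the size of the pool.
    k = min(n_instances, len(pool))
    return rng.sample(pool, k), target_label


# Toy usage: three bags with 2-dimensional instance features.
bags = [
    [[0.0, 0.0], [0.1, 0.0]],              # label 0
    [[1.0, 1.0], [1.1, 1.0], [1.2, 1.0]],  # label 1
    [[2.0, 2.0], [2.1, 2.0]],              # label 1
]
labels = [0, 1, 1]
pseudo_bag, pseudo_label = cross_bag_pseudo_bag(
    bags, labels, target_label=1, n_instances=4, rng=random.Random(0)
)
```

Because every pooled instance comes from a bag with the same bag-level label, the pseudo-bag inherits that label. As the abstract notes, pooling this way also tends to raise the number of critical instances per pseudo-bag, which is the issue the contrastive framework is introduced to address.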