conSAMme: Achieving Consistent Segmentations with SAM
Abstract
Multi-output interactive segmentation methods generate multiple binary masks when given user guidance, such as clicks. However, the ordering of those masks is unpredictable, and the masks themselves can change when given slightly different user guidance. To address these issues, we propose conSAMme, a contrastive learning framework that conditions on explicit hierarchical semantics and leverages weakly supervised part segmentation data together with a novel episodic click sampling strategy. Evaluations of conSAMme's performance, click robustness, and mask ordering show substantial improvements over baselines while using less than 1% extra training data relative to the amount used to train the baseline.
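The abstract names an episodic click sampling strategy but does not describe it here. Purely as a hedged illustration of the general idea, not the paper's actual method, the sketch below samples one positive click per episode by jittering a base click inside the ground-truth mask, so that a consistency objective could compare the masks predicted for near-identical guidance. All names and parameters (episodic_click_sampling, num_episodes, jitter) are hypothetical.

```python
import numpy as np

def episodic_click_sampling(gt_mask, num_episodes=4, jitter=5, rng=None):
    """Hypothetical sketch: sample one positive click per episode, each a
    small perturbation of a base click inside the ground-truth mask, so a
    consistency loss can compare masks predicted for near-identical clicks."""
    rng = rng or np.random.default_rng()
    ys, xs = np.nonzero(gt_mask)            # foreground pixel coordinates
    i = rng.integers(len(ys))
    base = np.array([ys[i], xs[i]])         # base click inside the object
    h, w = gt_mask.shape
    clicks = []
    for _ in range(num_episodes):
        y, x = base
        for _ in range(20):                 # retry so the click stays on the object
            cand = base + rng.integers(-jitter, jitter + 1, size=2)
            cand = np.clip(cand, 0, [h - 1, w - 1])
            if gt_mask[cand[0], cand[1]]:
                y, x = cand
                break
        clicks.append((int(y), int(x)))
    return clicks

# Toy usage: a 64x64 image with a square object.
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 20:40] = True
print(episodic_click_sampling(mask))
```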
Related Material
[pdf]
[bibtex]
@InProceedings{Myers-Dean_2025_CVPR,
  author    = {Myers-Dean, Josh and Liu, Kangning and Price, Brian and Fan, Yifei and Kuen, Jason and Gurari, Danna},
  title     = {conSAMme: Achieving Consistent Segmentations with SAM},
  booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops},
  month     = {June},
  year      = {2025},
  pages     = {759-768}
}