Instance-Dependent Noise Refinement in Segment Anything Model for Weakly Supervised Object Detection

Taherkhani, Fariborz; Kazemi, Ehsan

Fariborz Taherkhani, Ehsan Kazemi; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 4505-4525

Abstract

We propose a new framework for Weakly Supervised Object Detection (WSOD), a domain that traditionally relies on image-level labels. To address the limitations of current WSOD methods, particularly their reliance on image-level annotations, which leads to inaccurate bounding box selection, we develop a framework that iteratively uses weak supervision and refines it, progressively strengthening the supervision of the object detector throughout training. Specifically, we employ the Segment Anything Model (SAM) to generate initial pseudo-label bounding boxes from point prompts produced by Class Activation Mapping (CAM). Our approach tackles the challenge of label noise, where pseudo-label bounding boxes may capture only parts of objects. We improve the ability to distinguish between completely and partially detected objects by leveraging an instance-dependent, specifically part-based, noise correction model. Our method is inspired by learning methods that focus on part-based representations for object detection and recognition, as well as by human perception, which typically decomposes complex visual information into simpler constituent parts. Our experiments, conducted in settings beyond WSOD, including Semi-Supervised Object Detection (SSOD) and Weakly Supervised Instance Segmentation (WSIS), validate the efficacy of our approach.
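
The pseudo-label step described in the abstract (a CAM-derived point prompt, a mask, then a bounding box) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `cam_to_point_prompt` and `mask_to_box` are hypothetical, the point prompt is assumed to be the single strongest CAM activation, and a simple threshold on a toy CAM stands in for the mask that SAM would return for that prompt.

```python
import numpy as np


def cam_to_point_prompt(cam):
    """Return the (x, y) pixel of the strongest activation in a 2D CAM.

    Assumption: one peak per class map is used as the point prompt.
    """
    row, col = np.unravel_index(np.argmax(cam), cam.shape)
    return int(col), int(row)


def mask_to_box(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around a binary mask.

    Returns None when the mask is empty.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max())


# Toy CAM: a 3x3 active region with a single peak at row 3, column 4.
cam = np.zeros((6, 8))
cam[2:5, 3:6] = 0.5
cam[3, 4] = 1.0

point = cam_to_point_prompt(cam)

# Stand-in for the segmenter: in the described pipeline, SAM would be
# prompted with `point` and would return the mask. Here the CAM is
# thresholded purely so the example stays self-contained.
mask = cam >= 0.5

box = mask_to_box(mask)
print(point, box)
```

A box obtained this way may cover only part of an object, which is the label noise that the part-based correction model in the abstract is meant to address.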

Related Material

[bibtex]

@InProceedings{Taherkhani_2024_ACCV,
    author    = {Taherkhani, Fariborz and Kazemi, Ehsan},
    title     = {Instance-Dependent Noise Refinement in Segment Anything Model for Weakly Supervised Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {4505-4525}
}