Salient Object Detection for Images Taken by People With Vision Impairments

Jarek Reynolds, Chandra Kanth Nagesh, Danna Gurari; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 8522-8531

Abstract


Salient object detection is the task of producing a binary mask for an image that indicates which pixels belong to the foreground object versus the background. We introduce a new salient object detection dataset, which we call VizWiz-SalientObject, built from images taken by people with vision impairments who were seeking to better understand their surroundings. Compared to seven existing datasets, VizWiz-SalientObject is the largest (i.e., 32,000 human-annotated images) and contains unique characteristics, including a higher prevalence of text in the salient objects (i.e., in 68% of images) and salient objects that occupy a larger ratio of the images (i.e., on average, 50% coverage). We benchmarked ten modern models on our dataset. Most methods fall below human performance, struggling most on images with salient objects that are large, have less complex boundaries, and lack text, as well as on lower-quality images; one method, however, comes very close. To facilitate future extensions of this work, we publicly share the dataset at https://vizwiz.org/tasks-and-datasets/salient-object-detection.
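As a minimal illustration of the coverage statistic mentioned in the abstract, the sketch below computes the fraction of an image occupied by the salient object from a binary mask. It assumes the mask is given as rows of 0/1 values; the function name `coverage_ratio` and the toy mask are hypothetical and are not taken from the paper or the dataset tooling.

```python
def coverage_ratio(mask):
    """Return the fraction of pixels marked salient (non-zero) in a binary mask.

    `mask` is assumed to be a list of rows, each a list of 0/1 values.
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask has no pixels")
    salient = sum(1 for row in mask for value in row if value)
    return salient / total


# Toy 4x4 mask (hypothetical): the salient object fills the two middle columns.
example_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

print(coverage_ratio(example_mask))  # 8 of 16 pixels are salient -> 0.5
```

Averaging this value over all annotated masks would give a dataset-level coverage figure of the kind the abstract reports, under the assumption that coverage is measured per image in this way.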

Related Material


@InProceedings{Reynolds_2024_WACV,
    author    = {Reynolds, Jarek and Nagesh, Chandra Kanth and Gurari, Danna},
    title     = {Salient Object Detection for Images Taken by People With Vision Impairments},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {8522-8531}
}