Benchmarking Gaze Prediction for Categorical Visual Search

Gregory Zelinsky, Zhibo Yang, Lihan Huang, Yupei Chen, Seoyoung Ahn, Zijun Wei, Hossein Adeli, Dimitris Samaras, Minh Hoai; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019


Movements of human attention during free viewing have received wide interest in the computer vision community. Search behavior, in which fixation scanpaths depend strongly on the viewer's goals, has received much less attention, even though visual search makes up much of everyday human behavior. One reason is the absence of real-world image datasets on which models of search can be trained. In this paper we present a carefully created dataset for two target categories, microwaves and clocks, curated from the COCO2014 dataset. A total of 2183 images were presented to multiple participants, who were asked to search for one of the two categories. This yielded 16184 validated fixations for training, making our microwave-clock dataset currently one of the largest datasets of eye fixations in categorical search. A second contribution is a 40-image test set in which every image contains both a microwave and a clock. Distinct fixation patterns emerged depending on whether participants searched for a microwave (n=30) or a clock (n=30) in the same images, so models had to predict different search scanpaths from the same pixel inputs. This dataset will provide a useful testbed for methods that generate category-specific priority maps for modeling visual search behavior. We have implemented a number of state-of-the-art models, which will be made available with the dataset together with a protocol for quantitative and qualitative evaluation.

