Anytime Recognition of Objects and Scenes

Sergey Karayev, Mario Fritz, Trevor Darrell; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 572-579


Humans can perceive a scene at a glance and obtain a deeper understanding with additional time. Similarly, visual recognition deployments should be robust to varying computational budgets. Such situations require Anytime recognition ability, which is rarely considered in computer vision research. We present a method for learning dynamic policies to optimize Anytime performance in visual architectures. Our model sequentially orders feature computation and performs subsequent classification. Crucially, decisions are made at test time and depend on observed data and intermediate results. We show the applicability of this system to standard problems in scene and object recognition. On suitable datasets, we can incorporate a semantic back-off strategy that gives maximally specific predictions for a desired level of accuracy; this provides a new view on the time course of human visual perception.
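
The idea of ordering feature computation at test time under a budget can be illustrated with a small sketch. This is not the authors' learned policy or classifier: the names `anytime_classify`, `FEATURES`, `policy`, and `classify`, the three toy features, their costs, and the hand-written decision rule are all assumptions made for illustration. The sketch only shows the control flow: a prediction is available after every step, and the next feature chosen depends on values already observed.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical feature table: name -> (cost, extractor). Costs are illustrative.
FEATURES: Dict[str, Tuple[float, Callable[[Sequence[float]], float]]] = {
    "mean": (1.0, lambda x: sum(x) / len(x)),
    "max": (2.0, max),
    "min": (2.0, min),
}


def policy(observed: Dict[str, float], affordable: List[str]) -> str:
    """Hand-written stand-in for a learned policy (assumption)."""
    if "mean" in affordable:
        return "mean"  # cheapest feature first
    # Dynamic choice: the next feature depends on what was observed so far.
    preferred = "max" if observed.get("mean", 0.0) >= 0.5 else "min"
    return preferred if preferred in affordable else affordable[0]


def classify(observed: Dict[str, float]) -> str:
    """Toy classifier that works with any subset of features."""
    if not observed:
        return "unknown"
    score = sum(observed.values()) / len(observed)
    return "bright" if score >= 0.5 else "dark"


def anytime_classify(x, features, policy, classify, budget):
    """Compute features one at a time until none fits the remaining budget.

    Returns a list of (cost_spent, prediction) pairs, so a caller stopped
    at any point can use the most recent prediction.
    """
    observed: Dict[str, float] = {}
    spent = 0.0
    trace = [(spent, classify(observed))]
    while True:
        affordable = [
            name for name, (cost, _) in features.items()
            if name not in observed and spent + cost <= budget
        ]
        if not affordable:
            break
        name = policy(observed, affordable)
        cost, extract = features[name]
        observed[name] = extract(x)
        spent += cost
        trace.append((spent, classify(observed)))
    return trace


if __name__ == "__main__":
    for step in anytime_classify([0.25, 1.0, 0.25], FEATURES, policy, classify, budget=3.0):
        print(step)
```

In this toy setup the policy is fixed by hand; in the paper the policy is learned, which the sketch does not attempt to reproduce.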

Related Material

@InProceedings{Karayev_2014_CVPR,
author = {Karayev, Sergey and Fritz, Mario and Darrell, Trevor},
title = {Anytime Recognition of Objects and Scenes},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2014}
}