BubbLeNet: Foveated Imaging for Visual Discovery

Kevin Matzen, Noah Snavely; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1931-1939


We propose a new method for turning an Internet-scale corpus of categorized images into a small set of human-interpretable discriminative visual elements using powerful tools based on deep learning. A key challenge with deep learning methods is generating human-interpretable models. To address this, we propose a new technique that uses bubble images -- images where most of the content has been obscured -- to identify spatially localized, discriminative content in each image. By modifying the model training procedure to use both the source imagery and these bubble images, we arrive at final models that retain much of the original classification performance but are much more amenable to identifying interpretable visual elements. We apply our algorithm to a wide variety of datasets, including two new Internet-scale datasets of people and places, and show applications to visual mining and discovery. Our method is simple, scalable, and produces visual elements that are highly representative compared to prior work.
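
To make the idea of a bubble image concrete, the sketch below obscures everything in a small grayscale image except a neighborhood around one point. It is only an illustration under stated assumptions: the abstract does not say how bubbles are constructed, so the Gaussian window, the function name `make_bubble_image`, and the parameters `center`, `sigma`, and `fill` are all hypothetical choices, not the authors' implementation.

```python
import math


def make_bubble_image(image, center, sigma, fill=0.0):
    """Blend a grayscale image toward `fill` everywhere except near `center`.

    Assumptions (not taken from the paper): the image is a list of rows of
    floats, and the bubble is a Gaussian window with scale `sigma` in pixels.
    """
    cy, cx = center
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            # Gaussian weight: 1 at the bubble center, near 0 far from it.
            d2 = (y - cy) ** 2 + (x - cx) ** 2
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            # Keep the pixel where w is high, replace it with `fill` elsewhere.
            out_row.append(w * value + (1.0 - w) * fill)
        out.append(out_row)
    return out


# A uniform 9x9 image with a single bubble at its center.
image = [[1.0] * 9 for _ in range(9)]
bubble = make_bubble_image(image, center=(4, 4), sigma=1.0)
```

In this sketch the pixel at the bubble center is left unchanged while distant pixels fall to the fill value, so a classifier shown only `bubble` can rely on nothing but the content inside the window.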

Related Material

author = {Matzen, Kevin and Snavely, Noah},
title = {BubbLeNet: Foveated Imaging for Visual Discovery},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}