Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees

Aastha Jain, Shuanak Chatterjee, Rene Vidal; The IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1865-1872


We propose an exact, general, and efficient coarse-to-fine energy minimization strategy for semantic video segmentation. Our strategy is based on a hierarchical abstraction of the supervoxel graph that allows us to minimize an energy defined at the finest level of the hierarchy by minimizing a series of simpler energies defined over coarser graphs. The strategy is exact, i.e., it produces the same solution as minimizing over the finest graph. It is general, i.e., it can be used to minimize any energy function (e.g., one with unary, pairwise, and higher-order terms) with any existing energy minimization algorithm (e.g., graph cuts or belief propagation). It is efficient, i.e., it gives significant speedups in inference on several datasets with varying degrees of spatio-temporal continuity. We also discuss the strengths and weaknesses of our strategy relative to existing hierarchical approaches, and the kinds of image and video data that yield the best speedups.
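To make the idea of "simpler energies defined over coarser graphs" concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a hypothetical toy problem (four supervoxels, two labels, unary costs plus Potts pairwise costs, and a hand-picked grouping `parent`), and it uses brute-force search in place of graph cuts or belief propagation. It shows only one property: when child unaries are summed and cross-group edge weights are accumulated, the coarse energy equals the fine energy for every labeling that is constant within each coarse node.

```python
from itertools import product

# Hypothetical toy data (an assumption for illustration, not from the paper).
unary = [            # unary[i][l]: cost of giving supervoxel i the label l
    [0.0, 2.0],
    [0.2, 1.5],
    [1.8, 0.1],
    [2.0, 0.0],
]
edges = {(0, 1): 1.0, (1, 2): 0.5, (2, 3): 1.0}  # Potts weights on a chain
parent = [0, 0, 1, 1]  # fine supervoxel -> coarse node (one level of a tree)


def energy(labels, unary, edges):
    """Unary costs plus a Potts penalty for each edge whose ends disagree."""
    e = sum(unary[i][l] for i, l in enumerate(labels))
    e += sum(w for (i, j), w in edges.items() if labels[i] != labels[j])
    return e


def coarsen(unary, edges, parent):
    """Build a coarse energy: sum child unaries, accumulate cross-group edges."""
    n_coarse = max(parent) + 1
    n_labels = len(unary[0])
    c_unary = [[0.0] * n_labels for _ in range(n_coarse)]
    for i, p in enumerate(parent):
        for l in range(n_labels):
            c_unary[p][l] += unary[i][l]
    c_edges = {}
    for (i, j), w in edges.items():
        a, b = sorted((parent[i], parent[j]))
        if a != b:  # edges inside a group cost nothing when the group is uniform
            c_edges[(a, b)] = c_edges.get((a, b), 0.0) + w
    return c_unary, c_edges


def brute_force_min(unary, edges):
    """Exhaustive minimizer; a stand-in for any real inference algorithm."""
    n_labels = len(unary[0])
    return min(product(range(n_labels), repeat=len(unary)),
               key=lambda x: energy(x, unary, edges))


def expand(coarse_labels, parent):
    """Copy each coarse node's label down to its children."""
    return tuple(coarse_labels[p] for p in parent)


c_unary, c_edges = coarsen(unary, edges, parent)
fine_best = brute_force_min(unary, edges)
coarse_best = brute_force_min(c_unary, c_edges)
print("fine optimum:  ", fine_best)
print("coarse optimum:", coarse_best, "->", expand(coarse_best, parent))
```

In this toy instance the coarse optimum expands to the fine optimum, but that is a property of the chosen numbers. In general the fine optimum need not be constant within a coarse node, so aggregation alone does not give the exactness the abstract claims; how the paper decides where to refine is not reproduced here.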

Related Material

author = {Jain, Aastha and Chatterjee, Shuanak and Vidal, Rene},
title = {Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013}