Pyramid Coding for Functional Scene Element Recognition in Video Scenes

Eran Swears, Anthony Hoogs, Kim Boyer; The IEEE International Conference on Computer Vision (ICCV), 2013, pp. 345-352

Abstract


Recognizing functional scene elements in video scenes based on the behaviors of moving objects that interact with them is an emerging problem of interest. Existing approaches have a limited ability to characterize elements such as crosswalks, intersections, and buildings that have low activity, are multi-modal, or have indirect evidence. Our approach recognizes the low-activity and multi-modal elements (crosswalks/intersections) by introducing a hierarchy of descriptive clusters to form a pyramid of codebooks that is sparse in the number of clusters and dense in content. The incorporation of local behavioral context such as person-enter-building and vehicle-parking nearby enables the detection of elements that do not have direct motion-based evidence, e.g., buildings. These two contributions significantly improve scene element recognition when compared against three state-of-the-art approaches. Results are shown on typical ground-level surveillance video and, for the first time, on the more complex Wide Area Motion Imagery.

Related Material


[bibtex]
@InProceedings{Swears_2013_ICCV,
author = {Swears, Eran and Hoogs, Anthony and Boyer, Kim},
title = {Pyramid Coding for Functional Scene Element Recognition in Video Scenes},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2013}
}