Audio-Visual Feature Fusion for Vehicles Classification in a Surveillance System

Tao Wang, Zhigang Zhu, Riad Hammoud; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2013, pp. 381-386

Abstract


In this paper, we tackle the challenging problem of multimodal feature selection and fusion for vehicle categorization. Our proposed framework uses a boosting-based feature learning technique to learn the optimal combinations of feature modalities. New multimodal features are learned from existing unimodal features, which are initially extracted from data acquired by a novel audio-visual sensing system under different sensing conditions (long range, moving vehicles, and various environments). Experiments on a challenging dataset collected with our long-range sensing system demonstrate that the proposed technique is robust to noise and that, in terms of classification performance, it finds a better choice among multiple good feature modalities during training than feature modality selection with a sequential technique, which tends to get stuck at a local maximum.

Related Material


[bibtex]
@InProceedings{Wang_2013_CVPR_Workshops,
author = {Wang, Tao and Zhu, Zhigang and Hammoud, Riad},
title = {Audio-Visual Feature Fusion for Vehicles Classification in a Surveillance System},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2013},
pages = {381-386}
}