Per-Sample Kernel Adaptation for Visual Recognition and Grouping

Borislav Antic, Bjorn Ommer; The IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1251-1259


Object, action, or scene representations that are corrupted by noise significantly impair the performance of visual recognition. Typically, partial occlusion, clutter, or excessive articulation affects only a subset of all feature dimensions and, most importantly, different dimensions are corrupted in different samples. Nevertheless, the common approach to this problem in feature selection and kernel methods is to down-weight or eliminate entire training samples or the same dimensions of all samples. Thus, valuable signal is lost, resulting in suboptimal classification. Our goal is, therefore, to adjust the contribution of individual feature dimensions when comparing any two samples and computing their similarity. Consequently, per-sample selection of informative dimensions is directly integrated into kernel computation. The interrelated problems of learning the parameters of a kernel classifier and determining the informative components of each sample are then addressed in a joint objective function. The approach can be integrated into the learning stage of any kernel-based visual recognition problem and it does not affect the computational performance in the retrieval phase. Experiments on diverse challenges of action recognition in videos and indoor scene classification show the general applicability of the approach and its ability to improve learning of visual representations.
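The idea of adjusting the contribution of individual feature dimensions when comparing two samples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the formulation from the paper: the function name `adapted_rbf_kernel`, the RBF form, the rule that a dimension's contribution is scaled by the product of the two samples' weights, and the hand-set weights. In the approach described above the weights would be learned jointly with the classifier parameters rather than fixed by hand.

```python
import math


def adapted_rbf_kernel(x, y, w_x, w_y, gamma=0.1):
    """RBF-style similarity with per-sample dimension weights.

    Hypothetical form: each squared difference is scaled by the product
    of the two samples' weights for that dimension, so a dimension counts
    only as much as both samples consider it informative.
    """
    if not (len(x) == len(y) == len(w_x) == len(w_y)):
        raise ValueError("all inputs must have the same length")
    dist = sum(wx * wy * (a - b) ** 2 for a, b, wx, wy in zip(x, y, w_x, w_y))
    return math.exp(-gamma * dist)


# Two samples that agree except in the third dimension, which we assume
# is corrupted in y (for example by partial occlusion).
x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 9.0]

ones = [1.0, 1.0, 1.0]   # all dimensions trusted
w_y = [1.0, 1.0, 0.0]    # hand-set: third dimension of y switched off

plain = adapted_rbf_kernel(x, y, ones, ones)    # corrupted dimension dominates
adapted = adapted_rbf_kernel(x, y, ones, w_y)   # corrupted dimension ignored
```

With uniform weights the single corrupted dimension drives the similarity close to zero, while switching that dimension off for `y` alone leaves the two samples maximally similar and leaves the same dimension in every other sample untouched.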

Related Material

author = {Antic, Borislav and Ommer, Bjorn},
title = {Per-Sample Kernel Adaptation for Visual Recognition and Grouping},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}