Recurrent Filter Learning for Visual Tracking

Tianyu Yang, Antoni B. Chan; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2017, pp. 2010-2019

Abstract


In this paper, we propose a recurrent filter generation method for visual tracking. We directly feed the target's image patch to a recurrent neural network (RNN) to estimate an object-specific filter for tracking. Since a video sequence is spatiotemporal data, we extend the matrix multiplications of the RNN's fully-connected layers to convolution operations on feature maps, which preserves the target's spatial structure and is also memory-efficient. The tracked object in subsequent frames is fed into the RNN to adapt the generated filters to appearance variations of the target. Note that once the off-line training of our network is finished, there is no need to fine-tune the network for specific objects, which makes our approach more efficient than methods that learn the target online via iterative fine-tuning. Extensive experiments on the widely used OTB and VOT benchmarks demonstrate encouraging results compared to other recent methods.
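
The core idea in the abstract, a convolutional recurrence that generates a target-specific filter which is then correlated with search-region features, can be illustrated with a minimal PyTorch sketch. This is a hypothetical illustration only: the class names, layer sizes, and the depthwise-correlation readout are assumptions made for the example and are not the authors' exact architecture.

# Minimal sketch (hypothetical shapes and names) of a convolutional RNN that
# emits a target-specific filter, which is then correlated with search-region
# features to produce a tracking response map. Not the paper's exact network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvRNNFilterGenerator(nn.Module):
    def __init__(self, channels=64, filter_size=3):
        super().__init__()
        # Convolutional recurrence: the fully-connected matrix multiplications
        # of a vanilla RNN are replaced by convolutions on feature maps, so the
        # hidden state keeps the target's spatial structure.
        self.input_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.hidden_conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Maps the hidden state to the target-specific filter.
        self.filter_head = nn.Conv2d(channels, channels, 3, padding=1)
        self.filter_size = filter_size

    def forward(self, target_feat, hidden):
        # target_feat, hidden: (1, C, H, W) feature maps of the target patch.
        hidden = torch.tanh(self.input_conv(target_feat) + self.hidden_conv(hidden))
        # Pool the hidden state down to small per-channel kernels (1, C, k, k).
        filt = F.adaptive_avg_pool2d(self.filter_head(hidden), self.filter_size)
        return filt, hidden

def track_step(generator, target_feat, search_feat, hidden):
    # One tracking step: update the filter from the current target patch,
    # then correlate it with the search-region features.
    filt, hidden = generator(target_feat, hidden)
    # Depthwise cross-correlation: each filter channel slides over the
    # corresponding channel of the search-region feature map.
    weight = filt.squeeze(0).unsqueeze(1)                  # (C, 1, k, k)
    response = F.conv2d(search_feat, weight,
                        groups=search_feat.size(1),
                        padding=generator.filter_size // 2)
    # The peak of the summed response indicates the target location.
    return response.sum(dim=1, keepdim=True), hidden

# Example usage with assumed feature-map sizes:
#   gen = ConvRNNFilterGenerator()
#   hidden = torch.zeros(1, 64, 6, 6)
#   response, hidden = track_step(gen, torch.randn(1, 64, 6, 6),
#                                 torch.randn(1, 64, 22, 22), hidden)

The convolutional recurrence mirrors the abstract's point about replacing fully-connected RNN updates with convolutions: the hidden state stays a spatial feature map rather than a flat vector, and feeding each new tracked patch through the recurrence adapts the generated filter without any online fine-tuning of network weights.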

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Yang_2017_ICCV,
author = {Yang, Tianyu and Chan, Antoni B.},
title = {Recurrent Filter Learning for Visual Tracking},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}