UCT: Learning Unified Convolutional Networks for Real-Time Visual Tracking

Zheng Zhu, Guan Huang, Wei Zou, Dalong Du, Chang Huang; The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 1973-1982

Abstract


In this paper, we propose an end-to-end framework, the unified convolutional tracker (UCT), that learns convolutional features and performs the tracking process simultaneously. Specifically, the UCT treats both the feature extractor and the tracking process (ridge regression) as convolution operations and trains them jointly, so that the learned CNN features are tightly coupled to the tracking process. In online tracking, an efficient updating method is proposed by introducing a peak-versus-noise ratio (PNR) criterion, and scale changes are handled efficiently by incorporating a scale branch into the network. The proposed approach achieves superior tracking performance while maintaining real-time speed. Experiments are performed on four challenging tracking benchmarks: OTB2013, OTB2015, VOT2014 and VOT2015, and our method achieves state-of-the-art results on these benchmarks compared with other real-time trackers.
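The abstract names the PNR criterion but does not define it. As a minimal sketch only, the code below assumes PNR is the gap between the response map's peak and its minimum, divided by the mean of all non-peak values, and assumes the model is updated only when both the PNR and the peak are at least as large as their historical averages. The function names `peak_versus_noise_ratio` and `should_update`, the history lists, and the update rule itself are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def peak_versus_noise_ratio(response: np.ndarray) -> float:
    """Assumed PNR: (peak - minimum) / mean of the non-peak responses."""
    flat = np.asarray(response, dtype=float).ravel()
    if flat.size < 2:
        raise ValueError("response map needs at least two values")
    peak_idx = int(flat.argmax())
    # Treat everything except the single peak location as "noise".
    noise_mean = np.delete(flat, peak_idx).mean()
    if noise_mean == 0.0:
        return float("inf")
    return float((flat[peak_idx] - flat.min()) / noise_mean)


def should_update(response, pnr_history, peak_history) -> bool:
    """Hypothetical update gate: update only on confident frames."""
    if not pnr_history or not peak_history:
        return True  # nothing to compare against yet
    pnr = peak_versus_noise_ratio(response)
    peak = float(np.max(response))
    return pnr >= np.mean(pnr_history) and peak >= np.mean(peak_history)


# Toy response map: flat background of 0.1 with a single peak of 1.0.
response = np.full((5, 5), 0.1)
response[2, 2] = 1.0
pnr = peak_versus_noise_ratio(response)
```

A sharp, isolated peak gives a high PNR, while a flat or multi-modal response (for example under occlusion) gives a low one, so under these assumptions a gate of this kind would skip model updates on unreliable frames.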

Related Material


[bibtex]
@InProceedings{Zhu_2017_ICCV,
author = {Zhu, Zheng and Huang, Guan and Zou, Wei and Du, Dalong and Huang, Chang},
title = {UCT: Learning Unified Convolutional Networks for Real-Time Visual Tracking},
booktitle = {The IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}
}