Robust Object Tracking Based on Temporal and Spatial Deep Networks

Zhu Teng, Junliang Xing, Qiang Wang, Congyan Lang, Songhe Feng, Yi Jin; The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 1144-1153

Abstract


Recently, deep neural networks have been widely employed to address the visual tracking problem. In this work, we present a new deep architecture that incorporates temporal and spatial information to boost tracking performance. Our architecture contains three networks: a Feature Net, a Temporal Net, and a Spatial Net. The Feature Net extracts general feature representations of the target. With these feature representations, the Temporal Net encodes the trajectory of the target and directly learns temporal correspondences to estimate the object state from a global perspective. Based on the learning results of the Temporal Net, the Spatial Net further refines the object tracking state using local spatial object information. Extensive experiments on four of the largest tracking benchmarks, including VOT2014, VOT2016, OTB50, and OTB100, demonstrate the competitive performance of the proposed tracker against a number of state-of-the-art algorithms.
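The three-stage pipeline described in the abstract can be sketched as follows. This is an illustrative NumPy mock-up only, not the paper's implementation: the function names, feature dimensions, and the stand-in operations (a fixed random projection for the Feature Net, a trajectory mean for the Temporal Net, a similarity-based refinement for the Spatial Net) are all placeholder assumptions standing in for the learned deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)
W_FEAT = rng.standard_normal((256, 64)) * 0.01  # placeholder weights


def feature_net(patches):
    """Stand-in for the Feature Net: map image patches to general
    feature representations (here, a fixed random projection)."""
    return np.tanh(patches @ W_FEAT)


def temporal_net(feature_history):
    """Stand-in for the Temporal Net: aggregate per-frame target
    features along the trajectory into a global state estimate
    (here, a simple mean over past frames)."""
    return feature_history.mean(axis=0)


def spatial_net(global_state, candidate_features):
    """Stand-in for the Spatial Net: refine the global estimate with
    local spatial information by scoring candidate regions against
    the temporally aggregated state."""
    scores = candidate_features @ global_state
    return int(np.argmax(scores)), scores


# Toy run: a 5-frame trajectory of target features, then 10 candidate
# regions in the current frame, each described by a 256-dim patch.
history = np.stack(
    [feature_net(rng.standard_normal((1, 256)))[0] for _ in range(5)]
)
state = temporal_net(history)                       # global estimate
candidates = feature_net(rng.standard_normal((10, 256)))
best, scores = spatial_net(state, candidates)       # refined choice
print(best, state.shape, scores.shape)
```

The sketch only illustrates the data flow (features → temporal aggregation → spatial refinement); in the paper each stage is a trained deep network rather than these toy operations.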

Related Material


[pdf]
[bibtex]
@InProceedings{Teng_2017_ICCV,
author = {Teng, Zhu and Xing, Junliang and Wang, Qiang and Lang, Congyan and Feng, Songhe and Jin, Yi},
title = {Robust Object Tracking Based on Temporal and Spatial Deep Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}