Learning Correspondence From the Cycle-Consistency of Time

Xiaolong Wang, Allan Jabri, Alexei A. Efros; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 2566-2576

Abstract


We introduce a self-supervised method for learning visual correspondence from unlabeled video. The main idea is to use cycle-consistency in time as a free supervisory signal for learning visual representations from scratch. At training time, our model learns a feature map representation that is useful for cycle-consistent tracking. At test time, we use the acquired representation to find nearest neighbors across space and time. We demonstrate the generalizability of the representation, without finetuning, across a range of visual correspondence tasks, including video object segmentation, keypoint tracking, and optical flow. Our approach outperforms previous self-supervised methods and performs competitively with strongly supervised methods.
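
As a rough illustration of the idea, the sketch below tracks a patch forward through a short sequence by nearest-neighbor matching in feature space, then tracks it backward, and checks whether it returns to its starting position. This is not the training procedure from the paper: the abstract does not specify the objective, and the hard nearest-neighbor matching, the function names (`nearest`, `cycle_consistent`), and the toy two-dimensional features are all assumptions made for illustration. Feature vectors are assumed to be nonzero.

```python
import math

def normalize(v):
    """Scale a (nonzero) feature vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def nearest(query, candidates):
    """Return the index of the candidate with the highest cosine similarity to query."""
    q = normalize(query)
    sims = [sum(a * b for a, b in zip(q, normalize(c))) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)

def cycle_consistent(start, frames):
    """Track patch `start` forward through `frames`, then backward.

    `frames` is a list of frames; each frame is a list of patch feature vectors.
    Returns True if the backward track lands on the starting patch index.
    """
    idx = start
    # Forward pass: frame 0 -> frame 1 -> ... -> last frame.
    for prev, nxt in zip(frames, frames[1:]):
        idx = nearest(prev[idx], nxt)
    # Backward pass: last frame -> ... -> frame 0.
    rev = frames[::-1]
    for prev, nxt in zip(rev, rev[1:]):
        idx = nearest(prev[idx], nxt)
    return idx == start

# Toy example (hypothetical features): the two patches swap positions between frames.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.1, 0.9], [0.9, 0.1]],
]
closed = cycle_consistent(0, frames)
```

In this toy case patch 0 is matched to patch 1 in the second frame and back to patch 0 in the first, so the cycle closes. A learned representation would be judged by how often such cycles close on real video.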

Related Material


[bibtex]
@InProceedings{Wang_2019_CVPR,
author = {Wang, Xiaolong and Jabri, Allan and Efros, Alexei A.},
title = {Learning Correspondence From the Cycle-Consistency of Time},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}