Multi-Video Temporal Synchronization by Matching Pose Features of Shared Moving Subjects

Xinyi Wu, Zhenyao Wu, Yujun Zhang, Lili Ju, Song Wang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019

Abstract


Collaborative analysis of videos taken by multiple motion cameras from different and time-varying views can help solve many computer vision problems. However, such collaborative analysis usually requires the videos to be temporally synchronized, which can be inaccurate if we rely solely on the camera clocks. In this paper, we propose to address this problem based on video content. More specifically, if multiple videos cover the same moving persons, these subjects should exhibit identical poses and pose changes at each aligned time point across the videos. Based on this idea, we develop a new Synchronization Network (SynNet), which includes a feature aggregation module, a matching cost volume and several classification layers, to infer the time offset between different videos by exploiting view-invariant human pose features. We conduct comprehensive experiments on the SYN, SPVideo and MPVideo datasets. The results show that the proposed method can accurately synchronize multiple motion-camera videos collected in the real world.
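
The abstract describes the SynNet pipeline only at a high level: aggregate per-frame pose features, build a matching cost volume over candidate time offsets, and classify the offset. The following is a minimal PyTorch sketch of one plausible reading of that pipeline; the layer sizes, the correlation-based cost, the offset range and all module names are assumptions made for illustration and do not reproduce the authors' implementation.

# Minimal sketch (not the authors' code): a plausible reading of the SynNet
# pipeline described in the abstract. All dimensions, the offset range, and
# the correlation-based matching cost are illustrative assumptions.
import torch
import torch.nn as nn

class SynNetSketch(nn.Module):
    def __init__(self, pose_dim=34, feat_dim=128, max_offset=15):
        super().__init__()
        self.max_offset = max_offset
        # Feature aggregation: embed per-frame pose vectors and aggregate
        # temporal context with a 1D convolution over the frame axis.
        self.embed = nn.Sequential(nn.Linear(pose_dim, feat_dim), nn.ReLU())
        self.aggregate = nn.Conv1d(feat_dim, feat_dim, kernel_size=3, padding=1)
        # Classification layers: map per-offset matching costs to a
        # distribution over candidate offsets in [-max_offset, max_offset].
        self.classifier = nn.Sequential(
            nn.Linear(2 * max_offset + 1, 64), nn.ReLU(),
            nn.Linear(64, 2 * max_offset + 1),
        )

    def forward(self, poses_a, poses_b):
        # poses_*: (batch, frames, pose_dim) pose features from the two videos
        fa = self.aggregate(self.embed(poses_a).transpose(1, 2)).transpose(1, 2)
        fb = self.aggregate(self.embed(poses_b).transpose(1, 2)).transpose(1, 2)
        # Matching cost volume: mean feature correlation at every candidate offset.
        costs = []
        for d in range(-self.max_offset, self.max_offset + 1):
            shifted = torch.roll(fb, shifts=d, dims=1)
            costs.append((fa * shifted).mean(dim=(1, 2)))
        cost_volume = torch.stack(costs, dim=1)  # (batch, 2*max_offset+1)
        return self.classifier(cost_volume)      # logits over time offsets

Given two pose sequences of equal length, the predicted offset would be the argmax of the output logits minus max_offset; this usage, like the sketch itself, is an assumption based solely on the abstract.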

Related Material


[pdf]
[bibtex]
@InProceedings{Wu_2019_ICCV,
author = {Wu, Xinyi and Wu, Zhenyao and Zhang, Yujun and Ju, Lili and Wang, Song},
title = {Multi-Video Temporal Synchronization by Matching Pose Features of Shared Moving Subjects},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}
}