Instance-Based Video Search via Multi-Task Retrieval and Re-Ranking

Zhicheng Zhao, Guanyu Chen, Chong Chen, Xinyu Li, Xuanlu Xiang, Yanyun Zhao, Fei Su; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019


With the rapid growth of video data, instance-based video search (INS), i.e., retrieving videos that contain specific objects, places, actions, etc., has become increasingly practical and important. In this paper, a novel INS framework based on multi-task retrieval and re-ranking is proposed to retrieve a particular person performing a specific action. First, a face matching scheme is designed to find the target persons in videos. Second, an object detection network and an improved two-pathway key-pose estimation network (IECO) are introduced to explore the semantic dependencies between static visual objects and a person's behavior. Based on these dependencies, an initial INS ranklist is obtained. Third, a new relative pose representation (RPR) method is presented, which encodes the absolute and relative positions of a person's poses. Finally, taking the RPR as input, a lightweight action recognition network is constructed to re-rank the INS results. Experimental results on the HMDB, UCF101, JHMDB and BBC Eastenders datasets demonstrate the effectiveness of the proposed INS framework.
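
The retrieve-then-re-rank flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the RPR formula, the score fusion rule, or the re-ranking depth, so everything here is an assumption made for illustration: the names `pose_features`, `Shot`, `initial_ranklist` and `rerank` are hypothetical, the pose feature simply concatenates joint coordinates with pairwise joint offsets, the initial ranking multiplies a face score by a context score, and only the top `top_k` shots are re-ordered by an action score.

```python
from dataclasses import dataclass
from itertools import combinations


def pose_features(joints):
    """Hypothetical stand-in for a relative pose representation.

    joints: list of (x, y) keypoints for one person in one frame.
    Returns the absolute coordinates followed by the pairwise
    offsets between joints (an assumed encoding, not the paper's).
    """
    absolute = [c for xy in joints for c in xy]
    relative = []
    for (xa, ya), (xb, yb) in combinations(joints, 2):
        relative.extend([xb - xa, yb - ya])
    return absolute + relative


@dataclass
class Shot:
    shot_id: str
    face_score: float     # assumed: similarity to the target person
    context_score: float  # assumed: object / key-pose evidence for the action
    action_score: float   # assumed: output of a pose-based action recognizer


def initial_ranklist(shots):
    # Assumed fusion rule: product of person and context evidence.
    return sorted(shots, key=lambda s: s.face_score * s.context_score,
                  reverse=True)


def rerank(ranked, top_k):
    # Re-order only the head of the list by action score; keep the tail as is.
    head = sorted(ranked[:top_k], key=lambda s: s.action_score, reverse=True)
    return head + ranked[top_k:]
```

In this sketch the re-ranking stage only touches the first `top_k` results, which keeps the more expensive action scoring confined to candidates that already passed the person and context checks; whether the paper restricts re-ranking in this way is not stated in the abstract.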

Related Material

author = {Zhao, Zhicheng and Chen, Guanyu and Chen, Chong and Li, Xinyu and Xiang, Xuanlu and Zhao, Yanyun and Su, Fei},
title = {Instance-Based Video Search via Multi-Task Retrieval and Re-Ranking},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}