Joint Detection and Identification Feature Learning for Person Search

Tong Xiao, Shuang Li, Bochao Wang, Liang Lin, Xiaogang Wang; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 3415-3424

Abstract


Existing person re-identification benchmarks and methods mainly focus on matching cropped pedestrian images between queries and candidates. However, this setting differs from real-world scenarios, where annotations of pedestrian bounding boxes are unavailable and the target person must be searched for in a gallery of whole scene images. To close the gap, we propose a new deep learning framework for person search. Instead of breaking it down into two separate tasks, pedestrian detection and person re-identification, we jointly handle both aspects in a single convolutional neural network. An Online Instance Matching (OIM) loss function is proposed to train the network effectively, and it scales to datasets with numerous identities. To validate our approach, we collect and annotate a large-scale benchmark dataset for person search. It contains 18,184 images, 8,432 identities, and 96,143 pedestrian bounding boxes. Experiments show that our framework outperforms approaches that handle detection and re-identification separately, and that the proposed OIM loss function converges much faster and to better results than the conventional Softmax loss.
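The abstract names the OIM loss without describing its form. As a rough illustration only, the sketch below shows one way an instance-matching loss over stored identity features could be written in NumPy: each feature is compared with a lookup table of labeled-identity features and a queue of unlabeled-identity features, and the loss is the negative log-probability of the correct identity. The function names, the temperature of 0.1, the momentum of 0.5, and the toy sizes are all assumptions made for this sketch; none of them is stated in the abstract, and this is not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def instance_matching_loss(feats, labels, lookup, queue, temperature=0.1):
    """Hypothetical instance-matching loss.

    feats:  (N, D) unit-length features of labeled person boxes
    labels: (N,)   integer identity ids in [0, L)
    lookup: (L, D) one stored feature per labeled identity (assumed structure)
    queue:  (Q, D) stored features of unlabeled identities (assumed structure)
    """
    # Similarity of every feature to every stored labeled and unlabeled entry.
    logits = np.concatenate([feats @ lookup.T, feats @ queue.T], axis=1) / temperature
    # Numerically stable log-softmax over all L + Q entries.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Only labeled identities can be targets; queue entries act as negatives.
    return -log_prob[np.arange(len(labels)), labels].mean()

def update_lookup(lookup, feats, labels, momentum=0.5):
    """Blend new features into the stored ones, then re-normalize (in place)."""
    for f, t in zip(feats, labels):
        lookup[t] = l2_normalize(momentum * lookup[t] + (1.0 - momentum) * f)
    return lookup

# Toy usage: 5 labeled identities, 3 unlabeled entries, 8-dimensional features.
rng = np.random.default_rng(0)
lookup = l2_normalize(rng.normal(size=(5, 8)))
queue = l2_normalize(rng.normal(size=(3, 8)))
feats = lookup[[1, 3]].copy()          # features identical to identities 1 and 3
loss_correct = instance_matching_loss(feats, np.array([1, 3]), lookup, queue)
loss_wrong = instance_matching_loss(feats, np.array([0, 2]), lookup, queue)
```

In this toy setup the loss for the correct labels is lower than for the wrong ones, because each feature has its highest similarity with its own stored entry. Keeping the stored features outside the set of learned parameters is one plausible reason such a loss could scale to many identities, since no per-identity classifier weights would need to be trained.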

Related Material


[bibtex]
@InProceedings{Xiao_2017_CVPR,
author = {Xiao, Tong and Li, Shuang and Wang, Bochao and Lin, Liang and Wang, Xiaogang},
title = {Joint Detection and Identification Feature Learning for Person Search},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}