Ground Truth for Pedestrian Analysis and Application to Camera Calibration

Clement Creusot, Nicolas Courty; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2013, pp. 712-718

Abstract


This paper investigates the use of synthetic 3D scenes to generate pedestrian segmentation ground truth for 2D crowd video data. Manual segmentation of objects in video is one of the most time-consuming forms of assisted labeling, and the resulting lack of temporally dense, precise segmentation ground truth on large video samples leaves a significant gap in computer vision research. Such data is essential for applying machine learning techniques to automatic pedestrian segmentation, as well as to many other applications involving occluded people. We present a new dataset of 1.8 million pedestrian silhouettes exhibiting the human-to-human occlusion patterns likely to be seen in real crowd video data. To our knowledge, it is the first publicly available large dataset of pedestrian-in-crowd silhouettes. We detail solutions for generating and representing this data, discuss how this ground truth can be used in a wide range of computer vision applications, and demonstrate it on a camera calibration toy problem.
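As a minimal sketch of how such silhouette ground truth might be consumed, the snippet below scores a predicted pedestrian mask against a ground-truth silhouette with intersection-over-union, a standard segmentation metric. The mask layout, file names, and helper functions shown are illustrative assumptions, not the dataset's actual format or the paper's evaluation protocol.

```python
import numpy as np

def silhouette_iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection-over-union between a predicted pedestrian mask and a
    ground-truth silhouette. Both are binary arrays of identical shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    inter = np.logical_and(pred, gt).sum()
    return float(inter) / float(union)

# Hypothetical usage (data layout assumed, not the released dataset's):
# gt = imageio.imread("frame_000123_pedestrian_07.png") > 0
# pred = my_segmenter(frame) > 0.5
# print(silhouette_iou(pred, gt))
```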

Related Material


[pdf]
[bibtex]
@InProceedings{Creusot_2013_CVPR_Workshops,
author = {Creusot, Clement and Courty, Nicolas},
title = {Ground Truth for Pedestrian Analysis and Application to Camera Calibration},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2013}
}