Joint Deep Learning for Pedestrian Detection

Wanli Ouyang, Xiaogang Wang; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 2056-2063


Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture. By establishing automatic, mutual interaction among components, the deep model achieves a 9% reduction in the average miss rate compared with the current best-performing pedestrian detection approaches on the largest Caltech benchmark dataset.


