Bayesian Loss for Crowd Count Estimation With Point Supervision

Zhiheng Ma, Xing Wei, Xiaopeng Hong, Yihong Gong; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 6142-6151


In crowd counting datasets, each person is annotated by a point, usually at the center of the head, and the task is to estimate the total count in a crowd scene. Most state-of-the-art methods are based on density map estimation: they convert the sparse point annotations into a "ground-truth" density map through a Gaussian kernel and then use it as the learning target to train a density map estimator. However, such a "ground-truth" density map is imperfect due to occlusions, perspective effects, variations in object shapes, etc. In contrast, we propose Bayesian loss, a novel loss function that constructs a density contribution probability model from the point annotations. Instead of constraining the value at every pixel in the density map, the proposed training loss adopts a more reliable supervision on the count expectation at each annotated point. Without bells and whistles, the loss function makes substantial improvements over the baseline loss on all tested datasets. Moreover, our proposed loss function, equipped with a standard backbone network and without any external detectors or multi-scale architectures, performs favourably against the state of the art. Our method outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset.
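To make the idea of supervising the count expectation at each annotated point more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an isotropic Gaussian likelihood around each annotation, a uniform prior over annotations, a target expected count of 1 per point, and an absolute-difference penalty. The function name `bayesian_loss`, the parameter `sigma`, its default value, and the handling of images without annotations are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def bayesian_loss(density, points, sigma=8.0):
    """Sketch of a point-supervised loss on expected counts.

    density: (H, W) array, the predicted density map.
    points:  (N, 2) array of (row, col) head annotations.
    sigma:   assumed Gaussian bandwidth in pixels (hypothetical default).
    """
    h, w = density.shape
    points = np.asarray(points, dtype=float).reshape(-1, 2)

    # Assumption: with no annotations, penalise any predicted mass.
    if len(points) == 0:
        return float(np.abs(density).sum())

    # Coordinates of every pixel, flattened to shape (M, 2) with M = H * W.
    rows, cols = np.mgrid[0:h, 0:w]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)

    # Squared distance between each annotation and each pixel: shape (N, M).
    diff = points[:, None, :] - pixels[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)

    # Posterior probability that a pixel's density belongs to annotation n:
    # a softmax over annotations of the Gaussian log-likelihood.
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    likelihood = np.exp(logits)
    posterior = likelihood / likelihood.sum(axis=0, keepdims=True)

    # Expected count attributed to each annotation: shape (N,).
    expected_count = posterior @ density.ravel()

    # Each annotated person should account for exactly one count.
    return float(np.abs(1.0 - expected_count).sum())


if __name__ == "__main__":
    pred = np.zeros((32, 32))
    pred[2, 2] = 1.0
    pred[30, 30] = 1.0
    heads = np.array([[2, 2], [30, 30]])
    print(bayesian_loss(pred, heads, sigma=2.0))
```

Because the posterior sums to 1 over annotations at every pixel, the expected counts always add up to the total predicted count, so the loss constrains the count without fixing the value of any individual pixel.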

Related Material

author = {Ma, Zhiheng and Wei, Xing and Hong, Xiaopeng and Gong, Yihong},
title = {Bayesian Loss for Crowd Count Estimation With Point Supervision},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}