Dense-Scale Feature Learning in Person Re-Identification

Li Wang, Baoyu Fan, Zhenhua Guo, Yaqian Zhao, Runze Zhang, Rengang Li, Weifeng Gong; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


For mass pedestrian re-identification (Re-ID), models must be able to represent extremely complex and diverse multi-scale features. However, existing models learn only a limited set of multi-scale features in a multi-branch manner, and directly adding scale branches to cover more scales confuses discrimination and degrades performance, because only a few scale features are critical for any specific input image. To provide a vast scale representation for person Re-ID while resolving the contradiction that excessive scales reduce performance, we propose a novel Dense-Scale Feature Learning Network (DSLNet), which consists of two core components: a Dense Connection Group (DCG) that provides abundant scale features, and a Channel-Wise Scale Selection (CSS) module that dynamically selects the most discriminative scale features for each input image. DCG is composed of a densely connected convolutional stream. The receptive field gradually increases as features flow along the convolutional stream, and dense shortcut connections provide many more fused multi-scale features than existing methods. CSS is a novel attention module that, unlike existing models, calculates attention along the branch direction. By enhancing or suppressing specific scale branches, it realizes truly channel-wise multi-scale selection. To the best of our knowledge, DSLNet is the most lightweight model and achieves state-of-the-art performance among lightweight models on four commonly used Re-ID datasets, surpassing most large-scale models.
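The idea of computing attention along the branch direction, per channel, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how CSS scores each branch, so the sketch assumes the score is simply the globally average-pooled activation of each channel, followed by a softmax across branches. The function name `channel_wise_scale_selection` and the toy inputs are hypothetical, and a real module would most likely use learned layers to produce the scores.

```python
import math


def channel_wise_scale_selection(branches):
    """Fuse scale branches with per-channel softmax weights (illustrative only).

    branches: list of S branches; each branch is a list of C channels;
              each channel is an H x W nested list of floats.
    Returns (fused, weights), where fused has C channels and
    weights[s][c] is the weight of branch s for channel c.
    """
    num_branches = len(branches)
    num_channels = len(branches[0])

    # Assumption: the score of a (branch, channel) pair is its global average.
    pooled = [
        [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in branch]
        for branch in branches
    ]

    # Softmax across branches, computed separately for every channel.
    weights = [[0.0] * num_channels for _ in range(num_branches)]
    for c in range(num_channels):
        scores = [pooled[s][c] for s in range(num_branches)]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in scores]
        total = sum(exps)
        for s in range(num_branches):
            weights[s][c] = exps[s] / total

    # Weighted sum over branches: each channel picks its own scale mix.
    fused = []
    for c in range(num_channels):
        height = len(branches[0][c])
        width = len(branches[0][c][0])
        fused.append([
            [
                sum(weights[s][c] * branches[s][c][i][j]
                    for s in range(num_branches))
                for j in range(width)
            ]
            for i in range(height)
        ])
    return fused, weights


# Toy input: 2 scale branches, 2 channels, 1 x 2 feature maps.
toy_branches = [
    [[[2.0, 2.0]], [[0.0, 0.0]]],  # branch 0 responds on channel 0
    [[[0.0, 0.0]], [[2.0, 2.0]]],  # branch 1 responds on channel 1
]
fused, weights = channel_wise_scale_selection(toy_branches)
```

In this toy input, channel 0 receives most of its weight from branch 0 and channel 1 from branch 1, which shows how a selection made per channel differs from applying a single weight to a whole branch.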

Related Material

@InProceedings{Wang_2020_ACCV,
    author    = {Wang, Li and Fan, Baoyu and Guo, Zhenhua and Zhao, Yaqian and Zhang, Runze and Li, Rengang and Gong, Weifeng},
    title     = {Dense-Scale Feature Learning in Person Re-Identification},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}