Enhancing Multi-View Pedestrian Detection Through Generalized 3D Feature Pulling

Sithu Aung, Haesol Park, Hyungjoo Jung, Junghyun Cho; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 1196-1205

Abstract


The main challenge in multi-view pedestrian detection is integrating view-specific features into a unified space for comprehensive end-to-end perception. Prior multi-view detection methods have focused on projecting perspective-view features onto the ground plane, creating a "bird's eye view" (BEV) representation of the scene. This paper proposes a simple but effective architecture that utilizes a non-parametric 3D feature-pulling strategy. This strategy directly extracts the corresponding 2D features for each valid voxel within the 3D feature volume, addressing the feature loss that may arise in previous methods. The proposed framework introduces three novel modules, each designed to improve the generalization capabilities of multi-view detection systems. Extensive experiments demonstrate the efficacy of the proposed model, which achieves new state-of-the-art accuracy both in conventional scenarios and, in particular, on scene generalization benchmarks.
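The feature-pulling idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `pull_features` and `aggregate_views`, the 3x4 pinhole projection matrix `P`, the bilinear sampling, and the averaging over views are all assumptions made for illustration. The abstract only states that each valid voxel pulls its corresponding 2D features without learned parameters.

```python
import numpy as np

def pull_features(feat, P, voxels):
    """Pull 2D features for 3D voxel centers (hypothetical helper).

    feat:   (C, H, W) feature map of one camera view, with H, W >= 2
    P:      (3, 4) projection matrix mapping world points to pixels (assumed)
    voxels: (N, 3) voxel centers in world coordinates
    Returns (N, C) sampled features and an (N,) boolean validity mask.
    """
    C, H, W = feat.shape
    homo = np.concatenate([voxels, np.ones((len(voxels), 1))], axis=1)
    proj = homo @ P.T                                  # (N, 3)
    depth = proj[:, 2]
    in_front = depth > 1e-6                            # voxel lies in front of the camera
    safe_depth = np.where(in_front, depth, 1.0)        # avoid division by zero
    u = proj[:, 0] / safe_depth
    v = proj[:, 1] / safe_depth
    # A voxel is valid only if it projects inside the image.
    valid = in_front & (u >= 0) & (u <= W - 1) & (v >= 0) & (v <= H - 1)

    # Bilinear interpolation at (u, v); invalid voxels are clamped, then zeroed.
    u = np.clip(u, 0, W - 1)
    v = np.clip(v, 0, H - 1)
    u0 = np.clip(np.floor(u).astype(int), 0, W - 2)
    v0 = np.clip(np.floor(v).astype(int), 0, H - 2)
    du, dv = u - u0, v - v0
    sampled = (feat[:, v0, u0] * (1 - du) * (1 - dv)
               + feat[:, v0, u0 + 1] * du * (1 - dv)
               + feat[:, v0 + 1, u0] * (1 - du) * dv
               + feat[:, v0 + 1, u0 + 1] * du * dv)    # (C, N)
    return sampled.T * valid[:, None], valid

def aggregate_views(feats, projections, voxels):
    """Average pulled features over the views that see each voxel.

    Mean fusion is an assumption of this sketch, not a detail from the paper.
    """
    pulled = [pull_features(f, P, voxels) for f, P in zip(feats, projections)]
    total = sum(p for p, _ in pulled)
    count = sum(m.astype(float) for _, m in pulled)
    fused = total / np.maximum(count, 1.0)[:, None]
    return fused, count > 0
```

Because the sampling has no learnable weights, the same routine applies unchanged to a new camera layout, which is one plausible reading of why such a strategy would help scene generalization.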

Related Material


@InProceedings{Aung_2024_WACV,
    author    = {Aung, Sithu and Park, Haesol and Jung, Hyungjoo and Cho, Junghyun},
    title     = {Enhancing Multi-View Pedestrian Detection Through Generalized 3D Feature Pulling},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {1196-1205}
}