@InProceedings{Yu_2026_CVPR,
    author    = {Yu, Jian and Feng, Yujian and You, Shuai and Zhou, Zhongkai and Wu, Fei and Jing, Zhengjun and Ji, Yimu},
    title     = {Spatial-Frequency Collaborative Learning for Occluded Visible-Infrared Person Re-Identification},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {4343-4352}
}
Spatial-Frequency Collaborative Learning for Occluded Visible-Infrared Person Re-Identification
Abstract
Occluded visible-infrared person re-identification (Occluded VI-ReID) remains difficult due to modality heterogeneity and occlusions, both of which break structural consistency and weaken cross-modality feature alignment. Existing methods rely mainly on spatial-domain cues (such as local body parts and salient patches), but their discriminability degrades severely under varying imaging conditions or partial visibility. To address these issues, we introduce a spatial-frequency collaborative perspective that offers global perception and cross-location consistency. Specifically, we propose a Spatial-Frequency Collaborative Learning (SFCL) framework that uses frequency information to complement spatial representations. SFCL comprises a Cross-Modality Frequency Alignment Module (CFAM), a Spatial-Frequency Interaction Module (SFIM), and a Frequency-Aware Discriminative (FAD) loss. The CFAM models the spectral features of visible/infrared images in the frequency domain, establishing modality-consistent spectral priors. The SFIM injects these priors into spatial features, promoting dual-domain interaction and complementary representations of spatial and frequency semantics. In addition, the FAD loss jointly enforces cross-modality frequency alignment and semantic consistency, thus enhancing robustness and discriminability under occlusions. For real-occlusion evaluation, we construct two occluded datasets, Occ-SYSU-MM01 and Occ-RegDB, on which SFCL outperforms the state-of-the-art.
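The abstract does not say how the spectral features are computed, so the following is only an illustrative sketch and not the authors' CFAM or FAD loss. It assumes a plain 2D discrete Fourier transform on a single-channel image and shows one reason frequency information can offer cross-location consistency: the amplitude spectrum does not change when the content is circularly shifted, whereas two unrelated images show a clear spectral gap. The names `dft2_amplitude`, `spectral_gap`, and `roll_columns` are hypothetical helpers introduced for this example.

```python
import cmath
import random


def dft2_amplitude(img):
    """Amplitude of the naive 2D DFT of an H x W list-of-lists image."""
    h, w = len(img), len(img[0])
    amp = []
    for u in range(h):
        row = []
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    phase = -2j * cmath.pi * (u * y / h + v * x / w)
                    s += img[y][x] * cmath.exp(phase)
            row.append(abs(s))
        amp.append(row)
    return amp


def spectral_gap(a, b):
    """Mean absolute difference between two amplitude spectra (assumed metric)."""
    fa, fb = dft2_amplitude(a), dft2_amplitude(b)
    n = len(fa) * len(fa[0])
    return sum(abs(p - q) for ra, rb in zip(fa, fb) for p, q in zip(ra, rb)) / n


def roll_columns(img, k):
    """Circularly shift every row of the image k pixels to the right."""
    return [row[-k:] + row[:-k] for row in img]


# Random stand-ins for a visible and an infrared image (not real data).
random.seed(0)
visible = [[random.random() for _ in range(6)] for _ in range(6)]
infrared = [[random.random() for _ in range(6)] for _ in range(6)]

# Same content at a different location: amplitude spectra agree up to rounding.
gap_shifted = spectral_gap(visible, roll_columns(visible, 2))
# Different content: amplitude spectra differ noticeably.
gap_cross = spectral_gap(visible, infrared)
```

In a learned model, a gap of this kind would be computed on feature maps with a fast Fourier transform and minimized during training; the naive transform here only keeps the example free of dependencies.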