BAMG: Text-based Person Re-identification via Bottlenecks Attention and Masked Graph Modeling

Keyang Cheng, Wenxuan Zou, Hongjian Gu, Anxiang Ouyang; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 1809-1826

Abstract


Traditional person re-identification (ReID) methods in computer vision have primarily focused on matching pedestrian identities across different cameras and points in time. Text-based Person Re-identification (TBPReID) extends this work by using textual descriptions alongside images to support retrieval applications such as tracking suspects or locating missing children. This paper introduces BAMG, a text-based person re-identification framework based on bottleneck attention and masked graph modeling, which builds on CLIP's pre-trained models. BAMG features a bottleneck fusion module for integrating the two modalities, a Masked Graph Modeling (MGM) component for enhanced feature extraction, and additional supporting modules that refine the processing of multimodal data. Together, these components improve the alignment and interaction between text and image data and increase the accuracy and robustness of identification. In evaluations on the CUHK-PEDES dataset, BAMG achieves a rank-1 accuracy of 79% and a mean average precision (mAP) of 68%. These results establish BAMG as a leading framework for text-based person re-identification.
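For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how rank-1 accuracy and mAP are commonly computed for text-to-image retrieval. It is not the authors' evaluation code: the function name `rank1_and_map`, the input layout (one row of similarity scores per text query), and the choice to skip queries with no matching gallery image are all assumptions made for illustration.

```python
def rank1_and_map(sim, query_ids, gallery_ids):
    """Illustrative rank-1 / mAP computation (hypothetical helper).

    sim[i][j]   : similarity between text query i and gallery image j
    query_ids   : identity label of each text query
    gallery_ids : identity label of each gallery image
    Queries with no matching gallery image are skipped (an assumption).
    """
    rank1_hits = 0
    average_precisions = []
    for scores, qid in zip(sim, query_ids):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue
        # Rank-1: is the top-ranked image the correct identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each correct hit.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        return 0.0, 0.0
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: two text queries against a gallery of three images.
toy_sim = [[0.9, 0.1, 0.5],   # query for identity 1
           [0.9, 0.3, 0.7]]   # query for identity 2
toy_rank1, toy_map = rank1_and_map(toy_sim, [1, 2], [1, 2, 1])
print(toy_rank1, round(toy_map, 4))  # 0.5 0.6667
```

In the toy example, the first query ranks both of its matching images at the top (average precision 1), while the second query finds its only match at rank 3 (average precision 1/3), giving a rank-1 accuracy of 0.5 and an mAP of 2/3.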

Related Material


@InProceedings{Cheng_2024_ACCV,
    author    = {Cheng, Keyang and Zou, Wenxuan and Gu, Hongjian and Ouyang, Anxiang},
    title     = {BAMG: Text-based Person Re-identification via Bottlenecks Attention and Masked Graph Modeling},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {1809-1826}
}