@InProceedings{Kim_2023_ICCV,
  author    = {Kim, Dongkyun},
  title     = {CheXFusion: Effective Fusion of Multi-View Features Using Transformers for Long-Tailed Chest X-Ray Classification},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  month     = {October},
  year      = {2023},
  pages     = {2702-2710}
}
CheXFusion: Effective Fusion of Multi-View Features Using Transformers for Long-Tailed Chest X-Ray Classification
Abstract
Medical image classification poses unique challenges due to the long-tailed distribution of diseases, the co-occurrence of diagnostic findings, and the multiple views available for each study or patient. This paper introduces our solution to the ICCV CVAMD 2023 Shared Task on CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays. Our approach introduces CheXFusion, a transformer-based fusion module incorporating multi-view images. The fusion module, guided by self-attention and cross-attention mechanisms, efficiently aggregates multi-view features while considering label co-occurrence. Furthermore, we explore data balancing and self-training methods to optimize the model's performance. Our solution achieves state-of-the-art results with 0.372 mAP on the MIMIC-CXR test set, securing 1st place in the competition. Our success in the task underscores the significance of considering multi-view settings, class imbalance, and label co-occurrence in medical image classification. Public code is available at https://github.com/dongkyuk/CXR-LT-public-solution.
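The abstract's idea of a transformer fusion module, self-attention over concatenated multi-view features followed by cross-attention from label queries to capture co-occurrence, can be sketched as below. This is a minimal illustrative sketch, not the paper's actual implementation (see the linked repository for that); all class names, dimensions, and layer counts here are assumptions.

```python
import torch
import torch.nn as nn

class MultiViewFusionSketch(nn.Module):
    """Hypothetical sketch of transformer-based multi-view fusion.

    Per-view backbone features (already flattened into token sequences) are
    concatenated and mixed with self-attention; learned per-label queries
    then cross-attend to the fused tokens, so each label embedding can see
    evidence from every view and model co-occurrence with other labels.
    Sizes and layer counts are illustrative, not from the paper.
    """

    def __init__(self, dim=256, num_labels=26, num_heads=8, num_layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.self_attn = nn.TransformerEncoder(enc_layer, num_layers)
        self.label_queries = nn.Parameter(torch.randn(num_labels, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, 1)

    def forward(self, view_tokens):
        # view_tokens: list of (B, N_i, dim) token sequences, one per view
        tokens = torch.cat(view_tokens, dim=1)     # (B, sum N_i, dim)
        fused = self.self_attn(tokens)             # joint multi-view context
        batch = fused.size(0)
        queries = self.label_queries.unsqueeze(0).expand(batch, -1, -1)
        out, _ = self.cross_attn(queries, fused, fused)  # (B, num_labels, dim)
        return self.classifier(out).squeeze(-1)    # (B, num_labels) logits
```

A multi-label loss such as `nn.BCEWithLogitsLoss` would then be applied to the logits, since findings are not mutually exclusive.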