MedSkip: Medical Report Generation Using Skip Connections and Integrated Attention
Medical scans are essential for accurate diagnosis and treatment. A computer vision model that processes a medical image and generates a report can assist clinical staff in these tasks. Such a system can support professionals and reduce the errors that may arise when staff members are inexperienced. However, previous studies have paid little attention to the visual extractor, even though it is a critical component of the pipeline. We therefore propose a novel architecture based on a modified HRNet with added skip connections and convolutional block attention modules (CBAM). The architecture has two components. The first is the visual extractor, in which the pre-processed image is fed into the HRNet convolutional layers; the output of each down-sampled layer passes through an attention module, and the attended outputs are then concatenated. The second is a memory-driven transformer that generates the report. We evaluate our model on two publicly available datasets, PEIR Gross and IU X-Ray, establishing a new state of the art on PEIR Gross and achieving competitive results on IU X-Ray.
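The visual extractor described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the random weights stand in for learned parameters, the 3 x 3 spatial kernel, the hidden width of the channel MLP, and the use of average pooling to bring every scale to the coarsest resolution before concatenation are all assumptions made for illustration, and the helper names (`cbam`, `fuse`, `avg_pool_to`) are hypothetical. HRNet itself is replaced by three random feature maps at different resolutions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rows, cols, rng):
    # Random values standing in for learned weights or for feature activations.
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def channel_attention(fmap, w1, w2):
    # CBAM channel gate: a shared two-layer MLP over avg- and max-pooled descriptors.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]
    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    gate = [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]
    return [[[g * x for x in row] for row in ch] for g, ch in zip(gate, fmap)]

def spatial_attention(fmap, kernel):
    # CBAM spatial gate: zero-padded k x k convolution over channel-wise avg and max maps.
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    gate = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[p][di][dj] * plane[y][x]
            gate[i][j] = sigmoid(s)
    return [[[gate[i][j] * fmap[c][i][j] for j in range(W)] for i in range(H)]
            for c in range(C)]

def cbam(fmap, rng):
    # Channel attention followed by spatial attention, with assumed sizes.
    C = len(fmap)
    hidden = max(1, C // 2)
    w1, w2 = rand_matrix(hidden, C, rng), rand_matrix(C, hidden, rng)
    kernel = [rand_matrix(3, 3, rng), rand_matrix(3, 3, rng)]
    return spatial_attention(channel_attention(fmap, w1, w2), kernel)

def avg_pool_to(fmap, size):
    # Average-pool each channel to size x size (assumes H and W are multiples of size).
    out = []
    for ch in fmap:
        s = len(ch) // size
        out.append([[sum(ch[i * s + a][j * s + b] for a in range(s) for b in range(s)) / (s * s)
                     for j in range(size)] for i in range(size)])
    return out

def fuse(stages, rng):
    # Apply attention to every scale, then concatenate along the channel axis.
    size = min(len(f[0]) for f in stages)
    fused = []
    for f in stages:
        fused.extend(avg_pool_to(cbam(f, rng), size))
    return fused

rng = random.Random(0)
# Three stand-in feature maps: (channels, spatial size) = (2, 8), (4, 4), (8, 2).
stages = [[rand_matrix(s, s, rng) for _ in range(c)] for c, s in ((2, 8), (4, 4), (8, 2))]
fused = fuse(stages, rng)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 14 2 2
```

In a real model the fused tensor would be flattened into a sequence of patch features and passed to the report-generating transformer; a learned strided convolution is a common alternative to the average pooling assumed here.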