Hierarchical X-Ray Report Generation via Pathology tags and Multi Head Attention

Preethi Srinivasan, Daksh Thapar, Arnav Bhavsar, Aditya Nigam; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020

Abstract


Examining radiology images, such as X-Ray images, as accurately as possible is a crucial step in providing the best healthcare. However, this requires high expertise and clinical experience, and even for experienced radiologists it is a time-consuming task. Hence, the automated generation of accurate radiology reports from chest X-Ray images is gaining popularity. Compared to other image captioning tasks, where coherence is the key criterion, medical image captioning requires, in addition to coherence, high accuracy in detecting anomalies and extracting information. That is, the report must be easy to read and convey medical facts accurately. We propose a deep neural network to achieve this. Given a set of chest X-Ray images of a patient, the proposed network predicts the medical tags and generates a readable radiology report. To generate the report and tags, the proposed network learns to extract salient image features with a deep CNN and generates tag embeddings for each patient's X-Ray images. We use transformers to learn self and cross attention. We encode the image and tag features with self-attention to get a finer representation. Both sets of encoded features are then used in cross attention with the input sequence to generate the report's Findings. Then, cross attention is applied between the generated Findings and the input sequence to generate the report's Impressions. We use a publicly available dataset to evaluate the proposed network. The results indicate that the network generates readable radiology reports with a higher BLEU score than the state of the art (SOTA). The code and trained models are available at https://medicalcaption.github.io
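
The attention hierarchy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention with no learned projections, random stand-in features, and simple concatenation of the encoded image and tag features as the cross-attention memory. All names and sizes (`image_feats`, `tag_embeds`, `d_model`, the token counts) are hypothetical and chosen only to show the data flow.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def random_features(n, d, rng):
    """Stand-in for learned features: n vectors of dimension d in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
d_model = 8  # hypothetical feature dimension

# Stand-ins for CNN image features, tag embeddings, and report token embeddings.
image_feats = random_features(49, d_model, rng)       # e.g. a flattened 7x7 feature map
tag_embeds = random_features(5, d_model, rng)         # one vector per predicted tag
findings_tokens = random_features(12, d_model, rng)   # input sequence for Findings
impression_tokens = random_features(6, d_model, rng)  # input sequence for Impressions

# Step 1: self-attention refines the image and tag features separately.
image_enc = attention(image_feats, image_feats, image_feats)
tag_enc = attention(tag_embeds, tag_embeds, tag_embeds)

# Step 2: the Findings sequence cross-attends to both encoded feature sets.
# Concatenating them into one memory is an assumption of this sketch.
memory = image_enc + tag_enc
findings_states = attention(findings_tokens, memory, memory)

# Step 3: the Impressions sequence cross-attends to the Findings states.
impression_states = attention(impression_tokens, findings_states, findings_states)

print(len(findings_states), len(impression_states), len(impression_states[0]))
```

A real model would add learned query, key, and value projections, multiple heads, positional encodings, and a decoder that emits tokens autoregressively; the sketch only shows which sequence attends to which.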

Related Material


@InProceedings{Srinivasan_2020_ACCV,
    author    = {Srinivasan, Preethi and Thapar, Daksh and Bhavsar, Arnav and Nigam, Aditya},
    title     = {Hierarchical X-Ray Report Generation via Pathology tags and Multi Head Attention},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}