Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles

Jevgenij Gamper, Nasir Rajpoot; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 16549-16559

Abstract


We present ARCH, a computational pathology (CP) multiple instance captioning dataset to facilitate dense supervision of CP tasks. Existing CP datasets focus on narrow tasks; ARCH, on the other hand, contains dense diagnostic and morphological descriptions for a range of stains, tissue types and pathologies. Using intrinsic dimensionality estimation, we show that ARCH is the only CP dataset to (ARCH-)rival its computer vision analog MS-COCO Captions. We conjecture that an encoder pre-trained on dense image captions learns transferable representations for most CP tasks. We support the conjecture with evidence that the ARCH representation transfers to a variety of pathology sub-tasks better than ImageNet features or representations obtained via self-supervised or multi-task learning on pathology images alone. We release our best model and invite other researchers to test it on their CP tasks.

Related Material


@InProceedings{Gamper_2021_CVPR,
    author    = {Gamper, Jevgenij and Rajpoot, Nasir},
    title     = {Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {16549-16559}
}