Clinical Scene Segmentation with Tiny Datasets

Thomas J. Smith, Michel Valstar, Don Sharkey, John Crowe; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019


Many clinical procedures could benefit from automatic scene segmentation and subsequent action recognition. Using Convolutional Neural Networks (CNNs) to semantically segment meaningful parts of an image or video is still an unsolved problem, and this becomes even more apparent when only a small dataset is available. Whilst RGB input is sufficient when a large labelled dataset is available, achieving high accuracy on a small dataset directly from RGB is difficult. This is because the ratio of free image dimensions to the number of training images is very high, resulting in unavoidable underfitting. We show that adding superpixels to the image representation used by our network improves semantic segmentation, and that CNNs can learn to detect superpixels if those superpixels are appropriately represented. To this end, we present a novel representation for superpixels, multichannel connected graphs (MCGs). We show that pre-trained deep-learned superpixels, used in an end-to-end manner, achieve good semantic segmentation results without the need for large quantities of labelled data, training with only 20 instances for 23 classes.
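
To make the dimension-to-sample ratio mentioned in the abstract concrete, the following is a minimal sketch of the arithmetic only. The 256x256 input resolution and the 100,000-image comparison set are assumptions chosen for illustration and do not come from the paper; only the figure of 20 training instances is taken from the abstract. The function name is likewise hypothetical.

```python
def dims_per_training_image(height, width, channels, n_images):
    """Ratio of free input dimensions (one per pixel per channel) to training-set size."""
    if n_images <= 0:
        raise ValueError("n_images must be positive")
    return (height * width * channels) / n_images


# Assumed 256x256 RGB input; only the count of 20 instances is from the abstract.
tiny = dims_per_training_image(256, 256, 3, 20)
# Assumed size for a "large" labelled dataset, purely for comparison.
large = dims_per_training_image(256, 256, 3, 100_000)

print(f"tiny dataset:  {tiny:.1f} input dimensions per training image")
print(f"large dataset: {large:.2f} input dimensions per training image")
```

Under these assumed sizes the tiny dataset has several thousand times more raw input dimensions per training image than the large one, which is the imbalance that a more compact representation such as superpixels is meant to reduce.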

Related Material

author = {Smith, Thomas J. and Valstar, Michel and Sharkey, Don and Crowe, John},
title = {Clinical Scene Segmentation with Tiny Datasets},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}