Deep Learning for Semantic Segmentation of Coral Reef Images Using Multi-View Information

Andrew King, Suchendra M.Bhandarkar, Brian M. Hopkinson; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019, pp. 1-10

Abstract


Two major deep learning architectures, i.e., patch-based convolutional neural networks (CNNs) and fully convolutional neural networks (FCNNs), are studied in the context of semantic segmentation of underwater images of coral reef ecosystems. Patch-based CNNs are typically used to enable single-entity classification whereas FCNNs are used to generate a semantically segmented output from an input image. In coral reef mapping tasks, one typically obtains multiple images of a coral reef from varying viewpoints either using stereoscopic image acquisition or while conducting underwater video surveys. We propose and compare patch-based CNN and FCNN architectures capable of exploiting multi-view image information to improve the accuracy of classification and semantic segmentation of the input images. We investigate extensions of the conventional FCNN architecture to incorporate stereoscopic input image data and extensions of patch-based CNN architectures to incorporate multi-view input image data. Experimental results show the proposed TwinNet architecture to be the best performing FCNN architecture, performing comparably with its baseline Dilation8 architecture when using just a left-perspective input image, but markedly improving over Dilation8 when using a stereo pair of input images. Likewise, the proposed nViewNet-8 architecture is shown to be the best performing patch-based CNN architecture, outperforming its single-image ResNet152 baseline architecture in terms of classification accuracy.

Related Material


[pdf]
[bibtex]
@InProceedings{King_2019_CVPR_Workshops,
author = {King, Andrew and M.Bhandarkar, Suchendra and Hopkinson, Brian M.},
title = {Deep Learning for Semantic Segmentation of Coral Reef Images Using Multi-View Information},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}
}