Multimodal Material Segmentation

Yupeng Liang, Ryosuke Wakaki, Shohei Nobuhara, Ko Nishino; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 19800-19808


Recognition of materials from their visual appearance is essential for computer vision tasks, especially those that involve interaction with the real world. Material segmentation, i.e., dense per-pixel recognition of materials, remains challenging as, unlike objects, materials do not exhibit clearly discernible visual signatures in their regular RGB appearances. Different materials, however, do lead to different radiometric behaviors, which can often be captured with non-RGB imaging modalities. We realize multimodal material segmentation from RGB, polarization, and near-infrared images. We introduce the MCubeS dataset (from MultiModal Material Segmentation) which contains 500 sets of multimodal images capturing 42 street scenes. Ground truth material segmentation as well as semantic segmentation are annotated for every image and pixel. We also derive a novel deep neural network, MCubeSNet, which learns to focus on the most informative combinations of imaging modalities for each material class with a newly derived region-guided filter selection (RGFS) layer. We use semantic segmentation, as a prior to "guide" this filter selection. To the best of our knowledge, our work is the first comprehensive study on truly multimodal material segmentation. We believe our work opens new avenues of practical use of material information in safety critical applications.

Related Material

[pdf] [supp]
@InProceedings{Liang_2022_CVPR, author = {Liang, Yupeng and Wakaki, Ryosuke and Nobuhara, Shohei and Nishino, Ko}, title = {Multimodal Material Segmentation}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {19800-19808} }