MS-UMLP: Medical Image Segmentation via Multi-Scale U-shape MLP-Mixer

Bin Xie, Hao Tang, Dawen Cai, Yan Yan; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 1793-1808

Abstract


The rapid development of Transformers has revolutionized medical image segmentation, owing to their ability to encode long-range dependencies. Despite these advantages, Transformers have drawbacks: models keep growing larger and introduce more parameters, and in some cases a several-fold increase in parameters yields only marginal improvement. Additionally, medical segmentation images typically contain multiple classes, with significant differences in size among classes and minimal differences within each class, a property that a multi-scale model can address. In this paper, we propose a novel Multi-Scale U-shape MLP-Mixer network named MS-UMLP, which aims to achieve multi-scale receptive fields while using fewer parameters. Unlike the prevailing Transformer-based trend of building models with ever more parameters, MS-UMLP redesigns MLP-Mixer into dimension-wise multi-scale MLP-Mixer blocks, which reduce model parameters and computational complexity, retain the ability to exploit long-range dependencies, and capture information at different scales within each block. Extensive experiments show that MS-UMLP not only has the fewest parameters (only 48% of the parameters of a pure convolutional network) but also outperforms existing methods on the popular ACDC and Synapse medical image segmentation datasets.
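For readers unfamiliar with the building block the abstract refers to, the following is a minimal sketch of a generic MLP-Mixer block (token mixing followed by channel mixing, each with a residual connection), written in plain Python. It is not the dimension-wise multi-scale block of MS-UMLP, whose design the abstract does not specify. All function names, weight shapes, the omission of biases and learnable normalization parameters, and the tanh approximation of GELU are assumptions made for illustration.

```python
import math


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def gelu(v):
    # tanh approximation of GELU (an assumption; any smooth activation would do here)
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


def layer_norm(m, eps=1e-5):
    # Normalize each row (one token) across its channels; no learnable scale or shift.
    out = []
    for row in m:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def mlp(m, w1, w2):
    # Two-layer perceptron applied to every row of m (biases omitted for brevity).
    hidden = [[gelu(v) for v in row] for row in matmul(m, w1)]
    return matmul(hidden, w2)


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mixer_block(x, tok_w1, tok_w2, ch_w1, ch_w2):
    """Generic MLP-Mixer block; x has shape (tokens, channels).

    tok_w1: (tokens, hidden_t), tok_w2: (hidden_t, tokens)
    ch_w1:  (channels, hidden_c), ch_w2: (hidden_c, channels)
    """
    # Token mixing: transpose so that each row is one channel across all tokens,
    # letting the MLP combine information from every spatial position.
    y = transpose(layer_norm(x))
    x = add(x, transpose(mlp(y, tok_w1, tok_w2)))
    # Channel mixing: the MLP acts on each token's channel vector independently.
    x = add(x, mlp(layer_norm(x), ch_w1, ch_w2))
    return x


def constant_matrix(rows, cols, value):
    return [[value] * cols for _ in range(rows)]


# Usage: 4 tokens with 3 channels each, hidden widths 2 and 5 (hypothetical sizes).
tokens = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0], [3.0, 1.0, 2.0], [0.0, 4.0, -2.0]]
mixed = mixer_block(
    tokens,
    constant_matrix(4, 2, 0.1), constant_matrix(2, 4, 0.1),
    constant_matrix(3, 5, 0.1), constant_matrix(5, 3, 0.1),
)
```

Because the token-mixing weights span all tokens, every output position can depend on every input position, which is how an MLP-Mixer retains long-range dependencies without attention. The token-mixing weights are shared across channels and the channel-mixing weights across tokens, which keeps the parameter count small.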

Related Material


[bibtex]
@InProceedings{Xie_2024_ACCV,
    author    = {Xie, Bin and Tang, Hao and Cai, Dawen and Yan, Yan},
    title     = {MS-UMLP: Medical Image Segmentation via Multi-Scale U-shape MLP-Mixer},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {1793-1808}
}