@InProceedings{Rosi_2024_CVPR,
    author    = {Rosi, Gabriele and Cuttano, Claudia and Cavagnero, Niccol\`o and Averta, Giuseppe and Cermelli, Fabio},
    title     = {The Revenge of BiSeNet: Efficient Multi-Task Image Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {8066-8074}
}
The Revenge of BiSeNet: Efficient Multi-Task Image Segmentation
Abstract
Recent advancements in image segmentation have focused on enhancing model efficiency to meet the demands of real-time applications, especially on edge devices. However, existing research has primarily concentrated on single-task settings, especially semantic segmentation, leading to redundant efforts and specialized architectures for different tasks. To address this limitation, we propose a novel architecture for efficient multi-task image segmentation, capable of handling various segmentation tasks without sacrificing efficiency or accuracy. We introduce BiSeNetFormer, which leverages the efficiency of two-stream semantic segmentation architectures and extends them into a mask classification framework. Our approach maintains the efficient spatial and context paths to capture detailed and semantic information, respectively, while leveraging an efficient transformer-based segmentation head that computes the binary masks and class probabilities. By seamlessly supporting multiple tasks, namely semantic and panoptic segmentation, BiSeNetFormer offers a versatile solution for multi-task segmentation. We evaluate our approach on the popular datasets Cityscapes and ADE20K, demonstrating impressive inference speeds while maintaining competitive accuracy compared to state-of-the-art architectures. Our results indicate that BiSeNetFormer represents a significant advancement towards fast, efficient, and multi-task segmentation networks, bridging the gap between model efficiency and task adaptability.
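To illustrate what a mask classification head's outputs look like in practice, here is a minimal NumPy sketch of one common way to turn per-query binary masks and class probabilities into a semantic label map. This is the generic mask-classification inference rule, not necessarily the exact procedure used in BiSeNetFormer. The function name, the tensor shapes, and the trailing "no object" class column are all assumptions made for illustration.

```python
import numpy as np

def semantic_from_mask_classification(class_logits, mask_logits):
    """Combine per-query class scores and binary masks into a label map.

    class_logits: (N, K + 1) array; the last column is an assumed
                  "no object" class (hypothetical convention).
    mask_logits:  (N, H, W) array of per-query binary mask logits.
    Returns an (H, W) integer array of class indices in [0, K).
    """
    # Numerically stable softmax over classes, then drop "no object".
    shifted = class_logits - class_logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    class_probs = (exp / exp.sum(axis=-1, keepdims=True))[:, :-1]  # (N, K)

    # Sigmoid turns mask logits into per-pixel membership probabilities.
    mask_probs = 1.0 / (1.0 + np.exp(-mask_logits))  # (N, H, W)

    # Per-pixel class score: sum over queries of class prob * mask prob.
    scores = np.einsum("nk,nhw->khw", class_probs, mask_probs)  # (K, H, W)
    return scores.argmax(axis=0)

# Toy input: two queries, two real classes, a 2x4 image.
class_logits = np.array([[5.0, 0.0, 0.0],   # query 0 favors class 0
                         [0.0, 5.0, 0.0]])  # query 1 favors class 1
mask_logits = np.full((2, 2, 4), -5.0)
mask_logits[0, :, :2] = 5.0  # query 0 covers the left half
mask_logits[1, :, 2:] = 5.0  # query 1 covers the right half

labels = semantic_from_mask_classification(class_logits, mask_logits)
print(labels)
```

In this toy case the left half of the image is assigned class 0 and the right half class 1. Panoptic inference would instead keep each query as a separate segment, which is what lets one set of outputs serve both tasks.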