DataFormer: Differential Additive Transformer for Lightweight Semantic Segmentation

Mian Muhammad Naeem Abid, Nancy Mehta, Zongwei Wu, Radu Timofte; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2025, pp. 820-831

Abstract


Self-attention has demonstrated remarkable effectiveness in modeling long-range dependencies for semantic segmentation. However, its quadratic complexity severely limits real-time performance, particularly on resource-constrained devices. Additionally, conventional self-attention mechanisms tend to overemphasize irrelevant features, leading to inefficient computation and increased noise. To address these limitations, we propose DataFormer, an efficient CNN-Transformer-based framework for semantic segmentation. DataFormer replaces the quadratic matrix multiplications of traditional self-attention with linear element-wise operations, significantly reducing computational overhead while maintaining robust contextual modeling. Moreover, the proposed Differential Additive Linear Attention mechanism enhances attention computation by leveraging the difference between two distinct attention score sets, effectively filtering out irrelevant noise and improving segmentation accuracy. To further improve efficiency, DataFormer incorporates MobileNetV2 blocks for lightweight feature extraction and a CNN-based Unified Feature Aggregation (UFA) block for adaptive local-global fusion. Extensive experiments on challenging datasets demonstrate that DataFormer achieves significant improvements in both segmentation accuracy and inference speed, making it an ideal solution for real-time semantic segmentation in practical, resource-limited environments.
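The abstract does not give the exact formulation, but the core idea — linear-complexity additive attention whose two score sets are differenced to suppress noisy activations — can be illustrated. The following NumPy sketch is an assumption-laden approximation, not the authors' implementation: all function and parameter names (including the mixing weight `lam`) are hypothetical, and the real DataFormer operates on multi-head feature maps inside a CNN-Transformer backbone.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def differential_additive_attention(x, Wq, Wk, Wv, w1, w2, lam=0.5):
    """Hypothetical sketch of differential additive linear attention.

    Each token gets two scalar attention scores from learned vectors
    w1 and w2; their difference forms the final score, which weights
    the keys into a single global context vector. All steps are
    matrix-vector or element-wise, so the cost is O(N * d) rather
    than the O(N^2 * d) of standard self-attention.
    """
    Q, K, V = x @ Wq, x @ Wk, x @ Wv          # (N, d) each
    s1 = softmax(Q @ w1)                      # first score set, (N,)
    s2 = softmax(Q @ w2)                      # second score set, (N,)
    s = s1 - lam * s2                         # differential scores
    g = (s[:, None] * K).sum(axis=0)          # global context vector, (d,)
    return V * g[None, :]                     # element-wise modulation, (N, d)

# Toy usage: 6 tokens with 4-dimensional features.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
Wq, Wk, Wv = (rng.standard_normal((4, 4)) for _ in range(3))
w1, w2 = rng.standard_normal(4), rng.standard_normal(4)
out = differential_additive_attention(x, Wq, Wk, Wv, w1, w2)
print(out.shape)  # (6, 4): output matches input shape
```

Subtracting a second softmax score set acts like common-mode rejection: tokens that both score heads attend to equally contribute less, which matches the paper's stated goal of filtering out irrelevant noise.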

Related Material


@InProceedings{Abid_2025_CVPR,
  author    = {Abid, Mian Muhammad Naeem and Mehta, Nancy and Wu, Zongwei and Timofte, Radu},
  title     = {DataFormer: Differential Additive Transformer for Lightweight Semantic Segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2025},
  pages     = {820-831}
}