@InProceedings{Qin_2026_CVPR,
    author    = {Qin, Shiyu and Zhang, Xinjie and Liu, Zhening and Wang, Jinpeng and Chen, Bin and Li, Jiawei and Ren, Yifan and Xia, Shu-Tao and Zhang, Jun},
    title     = {MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {5306-5315}
}
MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model
Abstract
Stereo image compression (SIC) has become increasingly important as its applications grow in fields such as 3D reconstruction and autonomous navigation. Previous methods use cross-attention to model inter-view redundancy and autoregressive entropy models to predict probability distributions, achieving impressive rate-distortion performance. However, they suffer from slow coding speed because of the quadratic complexity of cross-attention and the spatially autoregressive iterations of the entropy models. To address these limitations, we propose MambaSIC, which introduces two key innovations. First, we propose a Mamba-based stereo visual state space block (stereo VSSB) whose linear complexity and long-range modeling capability allow it to capture redundancy between the two views more quickly and efficiently. Second, to accelerate compression and improve the accuracy of probability estimation, we introduce a bi-directional multi-reference entropy model that combines a checkerboard partitioning strategy with the stereo VSSB to obtain rich inter-view priors. Experimental results demonstrate that MambaSIC outperforms state-of-the-art methods in both compression performance and efficiency.
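To illustrate why a checkerboard partition reduces the number of sequential coding steps, the following is a minimal sketch of a generic two-pass checkerboard split over a latent grid. It is not the paper's entropy model: the function names `checkerboard_masks` and `coding_passes`, and the choice of even-parity positions as anchors, are assumptions made for illustration, and the inter-view conditioning via the stereo VSSB is not reproduced here.

```python
def checkerboard_masks(height, width):
    """Return (anchor, non_anchor) boolean grids for a height x width latent.

    Assumption: positions where (row + col) is even are treated as anchors.
    """
    anchor = [[(r + c) % 2 == 0 for c in range(width)] for r in range(height)]
    non_anchor = [[not is_anchor for is_anchor in row] for row in anchor]
    return anchor, non_anchor


def coding_passes(height, width):
    """Group latent positions into two passes.

    All positions within a pass can be processed in parallel, so the grid
    needs 2 sequential steps instead of height * width raster-scan steps.
    """
    anchor, _ = checkerboard_masks(height, width)
    positions = [(r, c) for r in range(height) for c in range(width)]
    first_pass = [(r, c) for (r, c) in positions if anchor[r][c]]
    second_pass = [(r, c) for (r, c) in positions if not anchor[r][c]]
    return first_pass, second_pass


if __name__ == "__main__":
    first, second = coding_passes(4, 4)
    print(len(first), len(second))
```

In a scheme of this kind, the second pass can condition on the already-decoded anchors, since every non-anchor position has only anchor positions as its horizontal and vertical neighbors.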