Joint Spectral Image Reconstruction and Semantic Segmentation with Cooperative Unfolding

He, Zijun; Wang, Ping; Wang, Xiaodong; Chen, Chang; Yuan, Xin

Zijun He, Ping Wang, Xiaodong Wang, Chang Chen, Xin Yuan; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 6910-6919

Abstract

Coded Aperture Snapshot Spectral Imaging (CASSI) is an emerging hyperspectral image (HSI) acquisition technique for downstream semantic segmentation. Due to the ill-posed nature of CASSI systems, typical solutions are compelled to adopt a two-stage reconstruction-then-segmentation pipeline, treating reconstruction and segmentation as two separate tasks. However, we observe that these two tasks are interrelated and mutually reinforcing for representation learning, and thus separating them limits overall accuracy and efficiency. To this end, we propose the first Cooperative Reconstruction-Segmentation Deep Unfolding Network (CRSDUN) to solve the reconstruction and segmentation tasks in parallel. To make the two tasks mutually reinforcing, we introduce the Cross-Aggregated Super-Token Attention (CASTA) mechanism to enhance the representation interactions between HSI reconstruction and semantic segmentation. Extensive experiments on both synthetic and real-world HSI reconstruction-segmentation datasets demonstrate that our method achieves state-of-the-art performance in both spectral reconstruction and semantic segmentation. The code is available at https://github.com/zjhe02/CRSDUN.

Related Material

@InProceedings{He_2026_CVPR,
    author    = {He, Zijun and Wang, Ping and Wang, Xiaodong and Chen, Chang and Yuan, Xin},
    title     = {Joint Spectral Image Reconstruction and Semantic Segmentation with Cooperative Unfolding},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {6910-6919}
}