CAMixerSR: Only Details Need More "Attention"

Yan Wang, Yi Liu, Shijie Zhao, Junlin Li, Li Zhang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 25837-25846

Abstract


To satisfy the rapidly increasing demand for large-image (2K-8K) super-resolution (SR), prevailing methods follow two independent tracks: 1) accelerating existing networks by content-aware routing, and 2) designing better super-resolution networks via token mixer refining. Despite their directness, both encounter unavoidable defects (e.g., inflexible routes or non-discriminative processing) that limit further improvement of the quality-complexity trade-off. To address these drawbacks, we integrate the two schemes by proposing a content-aware mixer (CAMixer), which assigns convolution to simple contexts and additional deformable window-attention to sparse textures. Specifically, the CAMixer uses a learnable predictor to generate multiple bootstraps, including offsets for window warping, a mask for classifying windows, and convolutional attentions for endowing convolution with a dynamic property; these modulate attention to include more useful textures self-adaptively and improve the representation capability of convolution. We further introduce a global classification loss to improve the accuracy of the predictors. By simply stacking CAMixers, we obtain CAMixerSR, which achieves superior performance on large-image SR, lightweight SR, and omnidirectional-image SR.
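The routing idea in the abstract (a per-window mask that sends simple windows to convolution and textured windows to attention) can be illustrated with a toy sketch. This is not the authors' implementation: the learnable predictor is replaced here by plain per-window pixel variance as a stand-in score, and the names `split_windows`, `route_windows`, `win`, and `ratio` are hypothetical. The sketch only shows how a score per window turns into a binary routing mask.

```python
from statistics import pvariance


def split_windows(image, win):
    """Split a 2D list (H x W) into non-overlapping win x win windows.

    Windows are returned in row-major order, each flattened to a list.
    """
    h, w = len(image), len(image[0])
    assert h % win == 0 and w % win == 0, "image must tile evenly"
    windows = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            windows.append(
                [image[r][c]
                 for r in range(top, top + win)
                 for c in range(left, left + win)]
            )
    return windows


def route_windows(image, win=2, ratio=0.5):
    """Return (mask, scores) for the windows of `image`.

    mask[i] == 1 marks window i as "hard" (would go to the attention
    branch); mask[i] == 0 marks it as "simple" (convolution branch).
    ASSUMPTION: variance stands in for the paper's learnable predictor.
    """
    windows = split_windows(image, win)
    scores = [pvariance(wd) for wd in windows]
    # Keep at least one window on the attention branch.
    k = max(1, int(ratio * len(windows)))
    ranked = sorted(range(len(windows)), key=lambda i: scores[i], reverse=True)
    hard = set(ranked[:k])
    mask = [1 if i in hard else 0 for i in range(len(windows))]
    return mask, scores


if __name__ == "__main__":
    # Left half is flat, right half is textured.
    image = [
        [1, 1, 0, 9],
        [1, 1, 9, 0],
        [2, 2, 5, 1],
        [2, 2, 1, 5],
    ]
    mask, scores = route_windows(image, win=2, ratio=0.5)
    print(mask)
```

In this toy setup, `ratio` plays the role of a budget on how many windows receive the more expensive branch; lowering it trades quality for compute, which is the quality-complexity trade-off the abstract refers to.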

Related Material


@InProceedings{Wang_2024_CVPR,
    author    = {Wang, Yan and Liu, Yi and Zhao, Shijie and Li, Junlin and Zhang, Li},
    title     = {CAMixerSR: Only Details Need More ''Attention''},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {25837-25846}
}