SCCA-Net: A Novel Network for Image Manipulation Localization Using Split-Channel Contextual Attention

Yan Xiang, Kaiqi Zhao, Haichang Yin; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 4473-4487

Abstract


This paper introduces SCCA-Net, an end-to-end network designed specifically for Image Manipulation Localization (IML). SCCA-Net comprises four modules: Split-Channel Contextual Attention (SCCA), Extractor, Encoder, and Decoder. The SCCA module fuses dynamic frequency contextual features with similarity features extracted from RGB feature pyramids of images, addressing a common shortcoming of existing attention-based IML methods, which typically overlook the frequency adaptation of contextual information. The SCCA module's pivotal component, the Parallel Dynamic Frequency Aggregator (PDFA), integrates Parallel Low-pass (PL) and Similarity Attention (SA) blocks to merge contextual and similarity vectors. The Extractor produces an RGB feature pyramid and channels features of varied frequency into the SCCA. The Encoder uses a Transformer to establish a robust global feature representation. To reconstruct the predicted mask, the Decoder employs specially designed cascaded upsampling-convolution (Up-Conv) blocks. Experiments demonstrate that SCCA-Net surpasses conventional models, achieving F1 score improvements of +14.3% on Coverage and +11.8% on CASIA while matching top performance on NIST2016. SCCA-Net pushes the field's boundaries and redefines the benchmarks for assessing IML.
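
For readers unfamiliar with the metric behind the reported gains, the following is a minimal sketch of a pixel-level F1 score between a predicted manipulation map and a binary ground-truth mask. It is not taken from the paper: the function name `pixel_f1`, the 0.5 threshold, and the convention of returning 0.0 when there are no true positives are all assumptions made for illustration, and the authors' evaluation protocol may differ.

```python
def pixel_f1(pred, truth, threshold=0.5):
    """Pixel-level F1 between a score map and a binary mask.

    pred:  2D list of floats in [0, 1] (hypothetical model output)
    truth: 2D list of 0/1 values (1 = manipulated pixel)
    threshold: assumed binarization cutoff, not specified by the source
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            predicted = p >= threshold
            if predicted and t:
                tp += 1          # manipulated pixel correctly flagged
            elif predicted and not t:
                fp += 1          # authentic pixel wrongly flagged
            elif not predicted and t:
                fn += 1          # manipulated pixel missed
    if tp == 0:
        return 0.0               # assumed convention for the degenerate case
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pred = [[0.9, 0.2], [0.7, 0.1]]
    truth = [[1, 0], [0, 1]]
    print(pixel_f1(pred, truth))
```

In this toy example there is one true positive, one false positive, and one false negative, so precision and recall are both 0.5.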

Related Material


[bibtex]
@InProceedings{Xiang_2024_ACCV,
    author    = {Xiang, Yan and Zhao, Kaiqi and Yin, Haichang},
    title     = {SCCA-Net: A Novel Network for Image Manipulation Localization Using Split-Channel Contextual Attention},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {4473-4487}
}