DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization

Zeqin Yu, Jiangqun Ni, Yuzhen Lin, Haoyi Deng, Bin Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 12765-12774

Abstract


Because manipulated images can lead to misinterpretation of visual content, the image forgery detection and localization (IFDL) problem has drawn serious public concern. In this work, we start from a simple assumption: an effective forensic method should focus on the mesoscopic properties of images. Based on this assumption, we propose DiffForensics, a novel two-stage self-supervised framework that leverages the diffusion model for the IFDL task. DiffForensics begins with a self-supervised denoising diffusion paradigm built on an encoder-decoder module: the pre-trained encoder (e.g., on ADE-20K) is frozen to inherit macroscopic features for general image characteristics, while the decoder is encouraged to learn microscopic feature representations of images, which pushes the whole model to focus on mesoscopic representations. The pre-trained model then serves as a prior and is fine-tuned for the IFDL task with a customized Edge Cue Enhancement Module (ECEM), which progressively highlights the boundary features within the manipulated regions, thereby refining tampered-area localization with better precision. Extensive experiments on several challenging public datasets demonstrate the effectiveness of the proposed method compared with other state-of-the-art methods. The proposed DiffForensics can significantly improve the model's capabilities for both accurate tamper detection and precise tamper localization, while also improving its generalization and robustness.
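The abstract does not state the noise schedule or training loss used in the self-supervised denoising stage. As a minimal sketch, assuming the standard DDPM forward process with a linear schedule (an assumption, not the authors' confirmed setup), the snippet below shows how a clean image yields a noisy input and a noise target without any forgery labels. All function names and schedule values here are illustrative.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.

    Returns the noisy pixels and the noise eps. The noise is the
    self-supervised regression target, so no forgery labels are needed.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal * x + sigma * e for x, e in zip(pixels, noise)]
    return noisy, noise


def noise_prediction_loss(predicted, target):
    """Mean squared error between predicted and true noise."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
image = [0.1, 0.5, -0.3, 0.9]  # toy flattened image scaled to [-1, 1]
noisy, eps = add_noise(image, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a setup like the one the abstract describes, a network with a frozen pre-trained encoder and a trainable decoder would take `noisy` (and the step `t`) as input, and only the decoder weights would be updated to reduce `noise_prediction_loss` against `eps`.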

Related Material


[bibtex]
@InProceedings{Yu_2024_CVPR,
    author    = {Yu, Zeqin and Ni, Jiangqun and Lin, Yuzhen and Deng, Haoyi and Li, Bin},
    title     = {DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {12765-12774}
}