Texture, Shape and Order Matter: A New Transformer Design for Sequential DeepFake Detection

Yunfei Li, Yuezun Li, Xin Wang, Baoyuan Wu, Jiaran Zhou, Junyu Dong; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 202-211

Abstract


Sequential DeepFake detection is an emerging task that predicts the manipulation sequence in order. Existing methods typically formulate it as an image-to-sequence problem, employing conventional Transformer architectures. However, these methods lack a design dedicated to the task and consequently achieve limited performance. As such, this paper describes a new Transformer design, called TSOM, by exploring three perspectives: Texture, Shape, and Order of Manipulations. Our method features four major improvements. (1) We describe a new texture-aware branch that effectively captures subtle manipulation traces with a Diversiform Pixel Difference Attention module. (2) We introduce a Multi-source Cross-attention module to seek deep correlations among spatial and sequential features, enabling effective modeling of complex manipulation traces. (3) To further enhance the cross-attention, we describe a Shape-guided Gaussian mapping strategy that provides initial priors of the manipulation shape. (4) Finally, observing that a subsequent manipulation in a sequence may influence the traces left by the preceding one, we invert the prediction order from forward to backward, leading to notable gains, as expected. Extensive experimental results demonstrate that our method outperforms others by a large margin.
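The inverted prediction order can be illustrated with a minimal sketch. It only shows how a forward manipulation sequence might be turned into a backward decoding target and how a backward prediction might be mapped back to the applied order. The token names (`<sos>`, `<eos>`), the helper functions, and the example labels are assumptions made for illustration and are not taken from the paper.

```python
from typing import List

# Hypothetical start/end tokens; the paper's actual vocabulary is not given here.
SOS, EOS = "<sos>", "<eos>"


def to_backward_target(forward_steps: List[str]) -> List[str]:
    """Build a decoder target that lists the last applied manipulation first."""
    return [SOS] + list(reversed(forward_steps)) + [EOS]


def to_forward_order(backward_prediction: List[str]) -> List[str]:
    """Drop special tokens, stop at the first EOS, and restore the applied order."""
    steps = []
    for token in backward_prediction:
        if token == EOS:
            break
        if token != SOS:
            steps.append(token)
    return list(reversed(steps))


# Illustrative labels only: manipulations in the order they were applied.
applied = ["eyebrow", "lip", "hair"]
target = to_backward_target(applied)
restored = to_forward_order(target)
```

Under this assumption, the decoder first predicts the most recent manipulation, whose traces are the least disturbed by later edits, and the forward sequence is recovered by reversing the output.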

Related Material


[bibtex]
@InProceedings{Li_2025_WACV,
    author    = {Li, Yunfei and Li, Yuezun and Wang, Xin and Wu, Baoyuan and Zhou, Jiaran and Dong, Junyu},
    title     = {Texture, Shape and Order Matter: A New Transformer Design for Sequential DeepFake Detection},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {202-211}
}