Revisiting Generalizability in Deepfake Detection: Improving Metrics and Stabilizing Transfer

Sarthak Kamat, Shruti Agarwal, Trevor Darrell, Anna Rohrbach; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2023, pp. 426-435

Abstract


"Generalizability" is seen as the hallmark quality of a good deepfake detection model. However, standard out-of-domain evaluation datasets closely resemble the training data in form and lag behind advances in modern synthesis methods, making them inadequate measures of robustness. We extend the study of transfer performance to three state-of-the-art methods (which use spatial, temporal, and lip-reading features, respectively) on four newer fake types released within the last year. Detection methods fail in different scenarios depending on the artifact modes they were trained on. On diffusion fakes, these methods achieve 96%, 75%, and 51% AUC, respectively, whereas on talking-head fakes the same methods achieve 80%, 99%, and 92% AUC. We compare various ways of combining spatial and temporal modalities through joint training and feature fusion in order to stabilize generalization performance. We also propose a new randomized algorithm for synthesizing videos that emulate diverse, visually apparent artifacts with implausibilities in human facial structure. By testing deepfake detectors on highly randomized artifacts, we can measure the extent to which detection networks have learned a strong model of "reality", as opposed to memorizing subtle artifact patterns.
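
To make the AUC figures above concrete, the following is a minimal sketch of how a per-detector AUC can be computed and how combining two detectors can change it. It is not the authors' code: the function names `roc_auc` and `fuse_scores`, the toy labels, and the toy scores are all hypothetical, and the combination shown is simple score-level averaging, a stand-in that is simpler than the joint training and feature fusion the abstract describes.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (fake, real) pairs where the fake scores higher.

    labels: 1 for fake, 0 for real. Ties between a fake and a real count as 0.5.
    """
    fake = [s for y, s in zip(labels, scores) if y == 1]
    real = [s for y, s in zip(labels, scores) if y == 0]
    if not fake or not real:
        raise ValueError("need at least one real and one fake sample")
    wins = sum((f > r) + 0.5 * (f == r) for f in fake for r in real)
    return wins / (len(fake) * len(real))


def fuse_scores(spatial, temporal, weight=0.5):
    """Score-level fusion: weighted average of two detectors' fake scores.

    Assumption: this stands in for the feature fusion studied in the paper,
    which operates on learned features rather than on final scores.
    """
    return [weight * s + (1.0 - weight) * t for s, t in zip(spatial, temporal)]


# Hypothetical toy data: two fakes followed by two reals.
labels = [1, 1, 0, 0]
spatial_scores = [0.9, 0.4, 0.3, 0.5]   # made-up spatial detector outputs
temporal_scores = [0.6, 0.8, 0.7, 0.1]  # made-up temporal detector outputs

fused_scores = fuse_scores(spatial_scores, temporal_scores)
print("spatial AUC: ", roc_auc(labels, spatial_scores))
print("temporal AUC:", roc_auc(labels, temporal_scores))
print("fused AUC:   ", roc_auc(labels, fused_scores))
```

In this toy case each detector misranks a different real/fake pair, so averaging their scores ranks every fake above every real. Real detectors will not generally combine this cleanly; the example only illustrates why detectors that fail in different scenarios are candidates for fusion.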

Related Material


[bibtex]
@InProceedings{Kamat_2023_ICCV,
    author    = {Kamat, Sarthak and Agarwal, Shruti and Darrell, Trevor and Rohrbach, Anna},
    title     = {Revisiting Generalizability in Deepfake Detection: Improving Metrics and Stabilizing Transfer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2023},
    pages     = {426-435}
}