@InProceedings{Ciamarra_2025_WACV,
    author    = {Ciamarra, Andrea and Caldelli, Roberto and Del Bimbo, Alberto},
    title     = {On the generalisation capability of local surface frames in detecting diffusion-based facial images},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {February},
    year      = {2025},
    pages     = {1402-1411}
}
On the generalisation capability of local surface frames in detecting diffusion-based facial images
Abstract
Extraordinary, unreal images can be created with powerful AI techniques. Widely available tools can produce high-quality content, in particular fully synthetic images. Among existing architectures, diffusion-based models can easily produce any kind of image, including human faces, from a prompt such as a piece of text. Such fake content is often used to spread disinformation, which raises concerns about people's security. It is currently becoming hard to develop reliable tools to distinguish between real people and generated (even non-existent) ones. Moreover, the large number of diffusion-based implementations makes it difficult for such detectors to generalise to novel generative techniques. To address these issues, we investigate the capacity of a distinctive feature, based on the image acquisition environment, to distinguish diffusion-based face images from pristine ones. Generated images should not contain the characteristics proper to the acquisition phase performed with a real camera. Such inconsistencies can be highlighted by means of recently introduced local surface frames. This feature takes into account the objects and surfaces in the scene, all of which affect the camera acquisition process, along with further intrinsic information tied to the device, as well as the lighting and reflections affecting the entire scene. The paper explores the ability of this feature to generalise to different datasets and to new generative methods unseen during training. Experimental results show that the feature still provides significant levels of detection accuracy in these cases.