Zero-Shot Warning Generation for Misinformative Multimodal Content

Giovanni Pio Delvecchio, Huy Hong Nguyen, Isao Echizen; Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops, 2025, pp. 785-794

Abstract


The widespread prevalence of misinformation poses significant societal concerns. Out-of-context misinformation, where authentic images are paired with false text, is particularly deceptive and easily misleads audiences. Most existing detection methods evaluate image-text consistency but often lack sufficient explanations, which are essential for effectively debunking misinformation. We present a model that detects multimodal misinformation through cross-modality consistency checks and requires minimal training time. Additionally, we propose a lightweight model that achieves competitive performance using only one-third of the parameters. We also introduce a dual-purpose zero-shot learning task for generating contextualized warnings, enabling automated debunking and enhancing user comprehension. Qualitative and human evaluations of the generated warnings highlight both the potential and limitations of our approach.

Related Material


@InProceedings{Delvecchio_2025_WACV,
    author    = {Delvecchio, Giovanni Pio and Nguyen, Huy Hong and Echizen, Isao},
    title     = {Zero-Shot Warning Generation for Misinformative Multimodal Content},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {February},
    year      = {2025},
    pages     = {785-794}
}