Is Synthetic Voice Detection Research Going Into the Right Direction?

Stefano Borzì, Oliver Giudice, Filippo Stanco, Dario Allegra; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 71-80

Abstract


Machine Learning, and in general Artificial Intelligence approaches, brought a great advance in each and every field of Computer Science increasing accuracy levels of predictors in any known problem. Indeed, this evolution enabled the construction of effective frameworks and solutions able to be used in investigative and forensics scenarios for detection of fakes and, in general, manipulations in multimedia contents. On the other hand, can we trust these systems? Is research activity going in the right direction? Are we just taking the low-hanging fruit without taking into account many real-case-in-the-wild situations? The purpose of this paper is to raise an alert to the research community in the specific context of synthetic voice detection, where data available for training is not big enough to give sufficient trust in the techniques available in the literature. To this aim, an exploratory investigation of the most common voice spoofing dataset was carried out and it was surprisingly easy to build simple classifiers without any Deep Learning techniques. Simple considerations on bitrate were sufficient to achieve an effective detection performance.

Related Material


[pdf]
[bibtex]
@InProceedings{Borzi_2022_CVPR, author = {Borz{\`\i}, Stefano and Giudice, Oliver and Stanco, Filippo and Allegra, Dario}, title = {Is Synthetic Voice Detection Research Going Into the Right Direction?}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}, month = {June}, year = {2022}, pages = {71-80} }