Authors
Chengzhe Sun, Ehab AlBadawy, Timothy F Davison, Sarah R Robinson, Ming-Ching Chang, Siwei Lyu
Publication date
2023/11/15
Book
Adversarial Multimedia Forensics
Pages
263-282
Publisher
Springer Nature Switzerland
Description
Recent advancements in AI-synthesized human voices have increased the threat of impersonation and disinformation. Detecting synthetic human voices has become crucial to combat these challenges. In this work, we propose a novel approach for detecting synthetic human voices by leveraging the identification of artifacts generated by neural vocoders in audio signals. Neural vocoders are specialized neural networks synthesizing waveforms using temporal-frequency representations such as Mel-spectrograms. These vocoders form a critical component in most deepfake audio synthesis models. Therefore, identifying the presence of neural vocoder processing suggests that an audio sample may have been artificially generated. To harness the potential of vocoder artifacts for synthetic human voice detection, we introduce a binary-class RawNet2 model. This model shares the same front-end feature extractor with …
Scholar articles
C Sun, E AlBadawy, TF Davison, SR Robinson… - Adversarial Multimedia Forensics, 2023