MLAAD: The Multi-Language Audio Anti-Spoofing Dataset

NM Müller, P Kawa, WH Choong, E Casanova… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-Speech (TTS) technology brings significant advantages, such as giving a voice to
those with speech impairments, but also enables audio deepfakes and spoofs. The former …

A robust audio deepfake detection system via multi-view feature

Y Yang, H Qin, H Zhou, C Wang, T Guo… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
With the advancement of generative modeling techniques, synthetic human speech
becomes increasingly indistinguishable from real, and tricky challenges are elicited for the …

Whisper in Focus: Enhancing Stuttered Speech Classification with Encoder Layer Optimization

H Ameer, S Latif, R Latif, S Mukhtar - arXiv preprint arXiv:2311.05203, 2023 - arxiv.org
In recent years, advancements in the field of speech processing have led to cutting-edge
deep learning algorithms with immense potential for real-world applications. The automated …

DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

S Hou, Y Ju, C Sun, S Jia, L Ke, R Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Deepfakes, as AI-generated media, have increasingly threatened media integrity and
personal privacy with realistic yet fake digital content. In this work, we introduce an open …

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

T Liu, L Zhang, RK Das, Y Ma, R Tao, H Li - arXiv preprint arXiv …, 2024 - arxiv.org
Partially manipulating a sentence can greatly change its meaning. Recent work shows that
countermeasures (CMs) trained on partially spoofed audio can effectively detect such …

Retrieval-Augmented Audio Deepfake Detection

Z Kang, Y He, B Zhao, X Qu, J Peng, J Xiao… - Proceedings of the 2024 …, 2024 - dl.acm.org
With recent advances in speech synthesis including text-to-speech (TTS) and voice
conversion (VC) systems enabling the generation of ultra-realistic audio deepfakes, there is …

Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets

Y Yang, Y Peng, X Zhong, H Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper details the experimental results of adapting the OpenAI's Whisper model for
Code-Switch Mandarin-English Speech Recognition (ASR) on the SEAME and ASRU2019 …

Fake Artificial Intelligence Generated Contents (FAIGC): A Survey of Theories, Detection Methods, and Opportunities

X Yu, Y Wang, Y Chen, Z Tao, D Xi, S Song… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, generative artificial intelligence models, represented by Large Language
Models (LLMs) and Diffusion Models (DMs), have revolutionized content production …

A New Approach to Voice Authenticity

NM Müller, P Kawa, S Hu, M Neu, J Williams… - arXiv preprint arXiv …, 2024 - arxiv.org
Voice faking, driven primarily by recent advances in text-to-speech (TTS) synthesis
technology, poses significant societal challenges. Currently, the prevailing assumption is …

Harder or Different? Understanding Generalization of Audio Deepfake Detection

NM Müller, N Evans, H Tak, P Sperl, K Böttinger - arXiv e-prints, 2024 - arxiv.org
Recent research has highlighted a key issue in speech deepfake detection: models trained
on one set of deepfakes perform poorly on others. The question arises: is this due to the …