Speaker identity is one of the important characteristics of human speech. In voice conversion, we change the speaker identity from one to another, while keeping the linguistic …
Easy access to audio-visual content on social media, combined with the availability of modern tools such as TensorFlow or Keras, and open-source trained models, along with …
C Yan, X Ji, K Wang, Q Jiang, Z Jin, W Xu - ACM Computing Surveys, 2022 - dl.acm.org
Voice assistants (VA) have become prevalent on a wide range of personal devices such as smartphones and smart speakers. As companies build voice assistants with extra …
The social media revolution has produced a plethora of web services to which users can easily upload and share multimedia documents. Despite the popularity and convenience of …
Modern text-to-speech (TTS) and voice conversion (VC) systems produce natural-sounding speech that calls into question the security of automatic speaker verification (ASV). This makes …
In recent years, automatic speaker verification (ASV) has been used extensively for voice biometrics. This has led to increased interest in securing these voice biometric systems for real-world …
H Farid - Journal of Online Trust and Safety, 2022 - tsjournal.org
Synthetic media—so-called deep fakes—have captured the imagination of some and struck fear in others. Although they vary in their form and creation, deep fakes refer to text, image …
T Tu, YJ Chen, C Yeh, HY Lee - arXiv preprint arXiv:1904.06508, 2019 - arxiv.org
End-to-end text-to-speech (TTS) has shown great success on large quantities of paired text plus speech data. However, laborious data collection remains difficult for at least 95% of the …
Generative machine learning models have made convincing voice synthesis a reality. While such tools can be extremely useful in applications where people consent to their voices …