T Hayashi, R Yamamoto, K Inoue… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet- TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit …
This textbook provides both profound technological knowledge and a comprehensive treatment of essential topics in music processing and music information retrieval. Including …
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for researchers. Inspired by the most successful neural source separation systems, it provides …
Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are extensively being harnessed across a diverse range of domains, eg, forensic science …
SÖ Arık, H Jun, G Diamos - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org
We propose the multi-head convolutional neural network (MCNN) for waveform synthesis from spectrograms. Nonlinear interpolation in MCNN is employed with transposed …
Fake audio detection is a growing concern and some relevant datasets have been designed for research. However, there is no standard public Chinese dataset under complex …
Generative deep learning techniques have invaded the public discourse recently. Despite the advantages, the applications to disinformation are concerning as the counter-measures …
In this article, we study the ability of deep neural networks (DNNs) to restore missing audio content based on its context, ie, inpaint audio gaps. We focus on a condition which has not …
X Wu, S Ma, C Shen, C Lin, Q Wang, Q Li… - 32nd USENIX Security …, 2023 - usenix.org
Prior researchers show that existing automatic speech recognition (ASR) systems are vulnerable to adversarial examples. Most existing adversarial attacks against ASR systems …