A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Generative pre-training for speech with autoregressive predictive coding

YA Chung, J Glass - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org
Learning meaningful and general representations from unannotated speech that are
applicable to a wide range of tasks remains challenging. In this paper we propose to use …

LRSpeech: Extremely low-resource speech synthesis and recognition

J Xu, X Tan, Y Ren, T Qin, J Li, S Zhao… - Proceedings of the 26th …, 2020 - dl.acm.org
Speech synthesis (text to speech, TTS) and recognition (automatic speech recognition, ASR)
are important speech tasks, and require a large amount of text and speech pairs for model …

Low-resource expressive text-to-speech using data augmentation

G Huybrechts, T Merritt, G Comini… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
While recent neural text-to-speech (TTS) systems perform remarkably well, they typically
require a substantial amount of recordings from the target speaker reading in the desired …

One model, many languages: Meta-learning for multilingual text-to-speech

T Nekvinda, O Dušek - arXiv preprint arXiv:2008.00768, 2020 - arxiv.org
We introduce an approach to multilingual speech synthesis which uses the meta-learning
concept of contextual parameter generation and produces natural-sounding multilingual …

A systematic review and analysis of multilingual data strategies in text-to-speech for low-resource languages

P Do, M Coler, J Dijkstra, E Klabbers - Interspeech 2021, 2021 - research.rug.nl
We provide a systematic review of past studies that use multilingual data for text-to-speech
(TTS) of low-resource languages (LRLs). We focus on the strategies used by these studies …

Open-source multi-speaker speech corpora for building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu speech synthesis systems

F He, SHC Chu, O Kjartansson, C Rivera… - Proceedings of the …, 2020 - aclanthology.org
We present free high quality multi-speaker speech corpora for Gujarati, Kannada,
Malayalam, Marathi, Tamil and Telugu, which are six of the twenty-two official languages of …

Requirements and motivations of low-resource speech synthesis for language revitalization

A Pine, D Wells, N Brinklow, P Littell… - Proceedings of the …, 2022 - aclanthology.org
This paper describes the motivation and development of speech synthesis systems for the
purposes of language revitalization. By building speech synthesis systems for three …

Language-agnostic meta-learning for low-resource text-to-speech with articulatory features

F Lux, NT Vu - arXiv preprint arXiv:2203.03191, 2022 - arxiv.org
While neural text-to-speech systems perform remarkably well in high-resource scenarios,
they cannot be applied to the majority of the over 6,000 spoken languages in the world due …