GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models. However, most of their architectures require dozens of …
In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the …
Neural vocoders have recently become popular in text-tospeech synthesis and voice conversion, increasing the demand for efficient neural vocoders. One successful approach is …
J Byun, S Shin, J Sung, S Beack, Y Park - Interspeech, 2022 - researchgate.net
In this paper, we propose a method of perceptually optimizing the deep neural network (DNN)-based speech coder using multi-time-scale perceptual loss functions. We utilize a …
We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the …
T Okamoto, T Toda, H Kawai - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
Although a HiFi-GAN vocoder can synthesize high-fidelity speech waveforms in real time on CPUs, there is a tradeoff between synthesis quality and inference speed. To increase …
Z Liu, Y Qian - arXiv preprint arXiv:2106.13419, 2021 - arxiv.org
Recent studies have shown that neural vocoders based on generative adversarial network (GAN) can generate audios with high quality. While GAN based neural vocoders have …
Good speech quality has been achieved using waveform matching and parametric reconstruction coders. Recently developed very low bit rate generative codecs can …
Q Tian, Y Chen, Z Zhang, H Lu, L Chen, L Xie… - arXiv preprint arXiv …, 2020 - arxiv.org
Recently, GAN based speech synthesis methods, such as MelGAN, have become very popular. Compared to conventional autoregressive based methods, parallel structures …