Noisy training for deep neural networks in speech recognition

S Yin, C Liu, Z Zhang, Y Lin, D Wang, J Tejedor… - EURASIP Journal on …, 2015 - Springer
Deep neural networks (DNNs) have gained remarkable success in speech recognition,
partially attributed to the flexibility of DNN models in learning complex patterns of speech …

Source-Filter HiFi-GAN: Fast and pitch controllable high-fidelity neural vocoder

R Yoneyama, YC Wu, T Toda - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Our previous work, the unified source-filter GAN (uSFGAN) vocoder, introduced a novel
architecture based on the source-filter theory into the parallel waveform generative …

Transformer-transducer: End-to-end speech recognition with self-attention

CF Yeh, J Mahadeokar, K Kalgaonkar, Y Wang… - arXiv preprint arXiv …, 2019 - arxiv.org
We explore options to use Transformer networks in neural transducer for end-to-end speech
recognition. Transformer networks use self-attention for sequence modeling and come with …

Convolutional neural networks to enhance coded speech

Z Zhao, H Liu, T Fingscheidt - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org
Enhancing coded speech suffering from far-end acoustic background noise, quantization
noise, and potentially transmission errors is a challenging task. In this paper, we propose …

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding

Y Ai, XH Jiang, YX Lu, HP Du, ZH Ling - arXiv preprint arXiv:2402.10533, 2024 - arxiv.org
This paper introduces a novel neural audio codec targeting high waveform sampling rates
and low bitrates named APCodec, which seamlessly integrates the strengths of parametric …

Small-footprint highway deep neural networks for speech recognition

L Lu, S Renals - IEEE/ACM Transactions on Audio, Speech …, 2017 - ieeexplore.ieee.org
State-of-the-art speech recognition systems typically employ neural network acoustic
models. However, compared to Gaussian mixture models, deep neural network (DNN) …

Very deep convolutional neural networks for robust speech recognition

Y Qian, PC Woodland - 2016 IEEE spoken language …, 2016 - ieeexplore.ieee.org
This paper describes the extension and optimisation of our previous work on very deep
convolutional neural networks (CNNs) for effective recognition of noisy speech in the Aurora …

Noisy training for deep neural networks

X Meng, C Liu, Z Zhang, D Wang - 2014 IEEE China Summit & …, 2014 - ieeexplore.ieee.org
Deep neural networks (DNNs) have gained remarkable success in speech recognition,
partially attributed to their flexibility in learning complex patterns of speech signals. This …

EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding

Y Miao, M Gowayyed, F Metze - 2015 IEEE workshop on …, 2015 - ieeexplore.ieee.org
The performance of automatic speech recognition (ASR) has improved tremendously due to
the application of deep neural networks (DNNs). Despite this progress, building a new ASR …

Self-attention transducers for end-to-end speech recognition

Z Tian, J Yi, J Tao, Y Bai, Z Wen - arXiv preprint arXiv:1909.13037, 2019 - arxiv.org
Recurrent neural network transducers (RNN-T) have been successfully applied in end-to-end
speech recognition. However, the recurrent structure makes it difficult for parallelization …