Related articles - Academic Resource Search

Noisy training for deep neural networks in speech recognition

S Yin, C Liu, Z Zhang, Y Lin, D Wang, J Tejedor… - EURASIP Journal on …, 2015 - Springer

Deep neural networks (DNNs) have gained remarkable success in speech recognition,
partially attributed to the flexibility of DNN models in learning complex patterns of speech …

Cited by: 150 | Related articles | All 15 versions

[PDF] arxiv.org

Source-Filter HiFi-GAN: Fast and pitch controllable high-fidelity neural vocoder

R Yoneyama, YC Wu, T Toda - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

Our previous work, the unified source-filter GAN (uSFGAN) vocoder, introduced a novel
architecture based on the source-filter theory into the parallel waveform generative …

Cited by: 20 | Related articles | All 4 versions

[PDF] arxiv.org

Transformer-transducer: End-to-end speech recognition with self-attention

CF Yeh, J Mahadeokar, K Kalgaonkar, Y Wang… - arXiv preprint arXiv …, 2019 - arxiv.org

We explore options to use Transformer networks in neural transducer for end-to-end speech
recognition. Transformer networks use self-attention for sequence modeling and comes with …

Cited by: 165 | Related articles | All 2 versions

[PDF] ieee.org

Convolutional neural networks to enhance coded speech

Z Zhao, H Liu, T Fingscheidt - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org

Enhancing coded speech suffering from far-end acoustic background noise, quantization
noise, and potentially transmission errors is a challenging task. In this paper, we propose …

Cited by: 74 | Related articles | All 5 versions

[PDF] arxiv.org

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding

Y Ai, XH Jiang, YX Lu, HP Du, ZH Ling - arXiv preprint arXiv:2402.10533, 2024 - arxiv.org

This paper introduces a novel neural audio codec targeting high waveform sampling rates
and low bitrates named APCodec, which seamlessly integrates the strengths of parametric …

Cited by: 3 | Related articles | All 2 versions

[PDF] arxiv.org

Small-footprint highway deep neural networks for speech recognition

L Lu, S Renals - IEEE/ACM Transactions on Audio, Speech …, 2017 - ieeexplore.ieee.org

State-of-the-art speech recognition systems typically employ neural network acoustic
models. However, compared to Gaussian mixture models, deep neural network (DNN) …

Cited by: 18 | Related articles | All 5 versions

[PDF] arxiv.org

Very deep convolutional neural networks for robust speech recognition

Y Qian, PC Woodland - 2016 IEEE spoken language …, 2016 - ieeexplore.ieee.org

This paper describes the extension and optimisation of our previous work on very deep
convolutional neural networks (CNNs) for effective recognition of noisy speech in the Aurora …

Cited by: 85 | Related articles | All 3 versions

[PDF] cslt.org

Noisy training for deep neural networks

X Meng, C Liu, Z Zhang, D Wang - 2014 IEEE China Summit & …, 2014 - ieeexplore.ieee.org

Deep neural networks (DNN) have gained remarkable success in speech recognition,
partially attributed to its flexibility in learning complex patterns of speech signals. This …

Cited by: 14 | Related articles | All 4 versions

[PDF] arxiv.org

EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding

Y Miao, M Gowayyed, F Metze - 2015 IEEE workshop on …, 2015 - ieeexplore.ieee.org

The performance of automatic speech recognition (ASR) has improved tremendously due to
the application of deep neural networks (DNNs). Despite this progress, building a new ASR …

Cited by: 947 | Related articles | All 10 versions

[PDF] arxiv.org

Self-attention transducers for end-to-end speech recognition

Z Tian, J Yi, J Tao, Y Bai, Z Wen - arXiv preprint arXiv:1909.13037, 2019 - arxiv.org

Recurrent neural network transducers (RNN-T) have been successfully applied in end-to-
end speech recognition. However, the recurrent structure makes it difficult for parallelization …

Cited by: 77 | Related articles | All 6 versions
