Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder...

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 370 · Related articles · All 7 versions

[PDF] mdpi.com

An overview of end-to-end automatic speech recognition

D Wang, X Wang, S Lv - Symmetry, 2019 - mdpi.com

Automatic speech recognition, especially large vocabulary continuous speech recognition,
is an important issue in the field of machine learning. For a long time, the hidden Markov …

Cited by 276 · Related articles · All 9 versions

[PDF] ieee.org

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 113 · Related articles · All 6 versions

[PDF] arxiv.org

A comparative study on Transformer vs RNN in speech applications

S Karita, N Chen, T Hayashi, T Hori… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org

Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …

Cited by 836 · Related articles · All 10 versions

[PDF] researchgate.net

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer

In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Cited by 192 · Related articles · All 8 versions

[PDF] sciendo.com

[PDF] Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU

A Shewalkar, D Nyavanandi, SA Ludwig - Journal of Artificial …, 2019 - sciendo.com

Deep Neural Networks (DNN) are nothing but neural networks with many hidden
layers. DNNs are becoming popular in automatic speech recognition tasks which combines …

Cited by 495 · Related articles · All 9 versions

[PDF] ieee.org

Neuro-symbolic speech understanding in aircraft maintenance metaverse

A Siyaev, GS Jo - IEEE Access, 2021 - ieeexplore.ieee.org

In the emerging world of metaverses, it is essential for speech communication systems to be
aware of context to interact with virtual assets in the 3D world. This paper proposes the …

Cited by 131 · Related articles · All 3 versions

[PDF] merl.com

Hybrid CTC/attention architecture for end-to-end speech recognition

S Watanabe, T Hori, S Kim, JR Hershey… - IEEE Journal of …, 2017 - ieeexplore.ieee.org

Conventional automatic speech recognition (ASR) based on a hidden Markov model
(HMM)/deep neural network (DNN) is a very complicated system consisting of various …

Cited by 913 · Related articles · All 8 versions

[PDF] isca-archive.org

[PDF] Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration

T Nakatani - Proc. INTERSPEECH, 2019 - isca-archive.org

The state-of-the-art neural network architecture named Transformer has been used
successfully for many sequence-to-sequence transformation tasks. The advantage of this …

Cited by 265 · Related articles · All 8 versions

[PDF] rwth-aachen.de

A comparison of Transformer and LSTM encoder decoder models for ASR

A Zeyer, P Bahar, K Irie, R Schlüter… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

We present competitive results using a Transformer encoder-decoder-attention model for
end-to-end speech recognition needing less training time compared to a similarly …

Cited by 254 · Related articles · All 5 versions
