Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

An overview of end-to-end automatic speech recognition

D Wang, X Wang, S Lv - Symmetry, 2019 - mdpi.com
Automatic speech recognition, especially large vocabulary continuous speech recognition,
is an important issue in the field of machine learning. For a long time, the hidden Markov …

Speech-transformer: a no-recurrence sequence-to-sequence model for speech recognition

L Dong, S Xu, B Xu - 2018 IEEE international conference on …, 2018 - ieeexplore.ieee.org
Recurrent sequence-to-sequence models using encoder-decoder architecture have made
great progress in speech recognition task. However, they suffer from the drawback of slow …

Deep learning scaling is predictable, empirically

J Hestness, S Narang, N Ardalani, G Diamos… - arXiv preprint arXiv …, 2017 - arxiv.org
Deep learning (DL) creates impactful advances following a virtuous recipe: model
architecture search, creating large training data sets, and scaling computation. It is widely …

Deep learning enabled semantic communications with speech recognition and synthesis

Z Weng, Z Qin, X Tao, C Pan, G Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In this paper, we develop a deep learning based semantic communication system for
speech transmission, named DeepSC-ST. We take the speech recognition and speech …

Developing real-time streaming transformer transducer for speech recognition on large-scale dataset

X Chen, Y Wu, Z Wang, S Liu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Recently, Transformer based end-to-end models have achieved great success in many
areas including speech recognition. However, compared to LSTM models, the heavy …

The architectural implications of Facebook's DNN-based personalized recommendation

U Gupta, CJ Wu, X Wang, M Naumov… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The widespread application of deep learning has changed the landscape of computation in
data centers. In particular, personalized recommendation for content ranking is now largely …

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention--w/o Data Augmentation

C Lüscher, E Beck, K Irie, M Kitza, W Michel… - arXiv preprint arXiv …, 2019 - arxiv.org
We present state-of-the-art automatic speech recognition (ASR) systems employing a
standard hybrid DNN/HMM architecture compared to an attention-based encoder-decoder …

Deep speech: Scaling up end-to-end speech recognition

A Hannun, C Case, J Casper, B Catanzaro… - arXiv preprint arXiv …, 2014 - arxiv.org
We present a state-of-the-art speech recognition system developed using end-to-end deep
learning. Our architecture is significantly simpler than traditional speech systems, which rely …

A comparison of Transformer and LSTM encoder decoder models for ASR

A Zeyer, P Bahar, K Irie, R Schlüter… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
We present competitive results using a Transformer encoder-decoder-attention model for
end-to-end speech recognition needing less training time compared to a similarly …