Exploring neural transducers for end-to-end speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 320

An overview of end-to-end automatic speech recognition

D Wang, X Wang, S Lv - Symmetry, 2019 - mdpi.com

Automatic speech recognition, especially large vocabulary continuous speech recognition,
is an important issue in the field of machine learning. For a long time, the hidden Markov …

Cited by 250

Speech-transformer: a no-recurrence sequence-to-sequence model for speech recognition

L Dong, S Xu, B Xu - 2018 IEEE international conference on …, 2018 - ieeexplore.ieee.org

Recurrent sequence-to-sequence models using encoder-decoder architecture have made
great progress in speech recognition task. However, they suffer from the drawback of slow …

Cited by 1136

Deep learning scaling is predictable, empirically

J Hestness, S Narang, N Ardalani, G Diamos… - arXiv preprint arXiv …, 2017 - arxiv.org

Deep learning (DL) creates impactful advances following a virtuous recipe: model
architecture search, creating large training data sets, and scaling computation. It is widely …

Cited by 662

Deep learning enabled semantic communications with speech recognition and synthesis

Z Weng, Z Qin, X Tao, C Pan, G Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

In this paper, we develop a deep learning based semantic communication system for
speech transmission, named DeepSC-ST. We take the speech recognition and speech …

Cited by 78

Developing real-time streaming transformer transducer for speech recognition on large-scale dataset

X Chen, Y Wu, Z Wang, S Liu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Recently, Transformer based end-to-end models have achieved great success in many
areas including speech recognition. However, compared to LSTM models, the heavy …

Cited by 181

The architectural implications of facebook's dnn-based personalized recommendation

U Gupta, CJ Wu, X Wang, M Naumov… - … Symposium on High …, 2020 - ieeexplore.ieee.org

The widespread application of deep learning has changed the landscape of computation in
data centers. In particular, personalized recommendation for content ranking is now largely …

Cited by 287

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention--w/o Data Augmentation

C Lüscher, E Beck, K Irie, M Kitza, W Michel… - arXiv preprint arXiv …, 2019 - arxiv.org

We present state-of-the-art automatic speech recognition (ASR) systems employing a
standard hybrid DNN/HMM architecture compared to an attention-based encoder-decoder …

Cited by 293

Deep speech: Scaling up end-to-end speech recognition

A Hannun, C Case, J Casper, B Catanzaro… - arXiv preprint arXiv …, 2014 - arxiv.org

We present a state-of-the-art speech recognition system developed using end-to-end deep
learning. Our architecture is significantly simpler than traditional speech systems, which rely …

Cited by 2606

A comparison of transformer and lstm encoder decoder models for asr

A Zeyer, P Bahar, K Irie, R Schlüter… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

We present competitive results using a Transformer encoder-decoder-attention model for
end-to-end speech recognition needing less training time compared to a similarly …

Cited by 235
