Advances in subword-based HMM-DNN speech recognition across languages

P Smit, S Virpioja, M Kurimo - Computer Speech & Language, 2021 - Elsevier
We describe a novel way to implement subword language models in speech recognition
systems based on weighted finite state transducers, hidden Markov models, and deep …

Open source automatic speech recognition for German

B Milde, A Köhn - Speech communication; 13th ITG …, 2018 - ieeexplore.ieee.org
High quality Automatic Speech Recognition (ASR) is a prerequisite for speech-based
applications and research. While state-of-the-art ASR software is freely available, the …

Automatic speech recognition systems: A survey of discriminative techniques

AP Kaur, A Singh, R Sachdeva… - Multimedia Tools and …, 2023 - search.proquest.com
In the subject of pattern recognition, speech recognition is an important study topic. The
authors give a detailed assessment of voice recognition strategies for several majority …

Improving end-to-end speech recognition with pronunciation-assisted sub-word modeling

H Xu, S Ding, S Watanabe - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Most end-to-end speech recognition systems model text directly as a sequence of characters
or sub-words. Current approaches to sub-word extraction only consider character sequence …

Computational intelligence in processing of speech acoustics: a survey

A Singh, N Kaur, V Kukreja, V Kadyan… - Complex & Intelligent …, 2022 - Springer
Speech recognition of a language is a key area in the field of pattern recognition. This paper
presents a comprehensive survey on the speech recognition techniques for non-Indian and …

Automatic speech recognition with very large conversational Finnish and Estonian vocabularies

S Enarvi, P Smit, S Virpioja… - IEEE/ACM Transactions …, 2017 - ieeexplore.ieee.org
Today, the vocabulary size for language models in large vocabulary speech recognition is
typically several hundreds of thousands of words. While this is already sufficient in some …

Low resource comparison of attention-based and hybrid ASR exploiting wav2vec 2.0

A Rouhe, A Virkkunen, J Leinonen, M Kurimo - Interspeech, 2022 - research.aalto.fi
Low resource speech recognition can potentially benefit a lot from exploiting a pretrained
model such as wav2vec 2.0. These pretrained models have learned useful representations …

Dynamic acoustic unit augmentation with BPE-dropout for low-resource end-to-end speech recognition

A Laptev, A Andrusenko, I Podluzhny, A Mitrofanov… - Sensors, 2021 - mdpi.com
With the rapid development of speech assistants, adapting server-intended automatic
speech recognition (ASR) solutions to a direct device has become crucial. For on-device …

Automatic transcription challenges for Inuktitut, a low-resource polysynthetic language

V Gupta, G Boulianne - … of the Twelfth Language Resources and …, 2020 - aclanthology.org
We introduce the first attempt at automatic speech recognition (ASR) in Inuktitut, as a
representative for polysynthetic, low-resource languages, like many of the 900 Indigenous …

Morfessor EM+ Prune: Improved subword segmentation with expectation maximization and pruning

SA Grönroos, S Virpioja, M Kurimo - arXiv preprint arXiv:2003.03131, 2020 - arxiv.org
Data-driven segmentation of words into subword units has been used in various natural
language processing applications such as automatic speech recognition and statistical …