A comprehensive survey on automatic speech recognition using neural networks

AS Dhanjal, W Singh - Multimedia Tools and Applications, 2024 - Springer
The continuous development in Automatic Speech Recognition has grown and
demonstrated its enormous potential in Human Interaction Communication systems. It is …

Voice-based interaction for an aging population: a systematic review

S Pednekar, P Dhirawani, R Shah… - 2023 3rd …, 2023 - ieeexplore.ieee.org
In the past twenty years, voice-based systems have emerged as a technology with high
usability and accessibility. Research in the field of Human-Computer Interaction (HCI) has …

Speech technology for everyone: Automatic speech recognition for non-native English with transfer learning

T Shibano, X Zhang, MT Li, H Cho, P Sullivan… - arXiv preprint arXiv …, 2021 - arxiv.org
To address the performance gap of English ASR models on L2 English speakers, we
evaluate fine-tuning of pretrained wav2vec 2.0 models (Baevski et al., 2020; Xu et al., 2021) …

CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

J Zuluaga-Gomez, S Ahmed, D Visockas… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the recent advancements in Automatic Speech Recognition (ASR), the recognition
of accented speech still remains a dominant problem. In order to create more inclusive ASR …

Improving automatic speech recognition for non-native English with transfer learning and language model decoding

P Sullivan, T Shibano, M Abdul-Mageed - Analysis and Application of …, 2022 - Springer
ASR systems designed for native English (L1) usually underperform on non-native English
(L2). To address this performance gap, (1) we extend our previous work to investigate fine …

Using Character-Level Sequence-to-Sequence Model for Word Level Text Generation to Enhance Arabic Speech Recognition

MA Azim, W Hussein, NL Badr - IEEE Access, 2023 - ieeexplore.ieee.org
Owing to the linguistic richness of the Arabic language, which contains more than 6000
roots, building a reliable Arabic language model for Arabic speech recognition systems …

CAPTuring accents: An approach to personalize pronunciation training for learners with different L1 backgrounds

V Khaustova, E Pyshkin, V Khaustov, J Blake… - … Conference on Speech …, 2023 - Springer
This paper presents a novel approach to addressing the often-overlooked issue of
pronunciation instruction in language learning through a Computer-Assisted Pronunciation …

Data-driven personalisation of television content: a survey

L Nixon, J Foss, K Apostolidis, V Mezaris - Multimedia Systems, 2022 - Springer
This survey considers the vision of TV broadcasting where content is personalised and
personalisation is data-driven, looks at the AI and data technologies making this possible …

Evaluation of the effectiveness of preschool English learning applications based on touch and voice multimodal interaction technique

TSM Tengku Wook, SF Mat Noor… - Universal Access in the …, 2023 - Springer
The rising development in information and communication technology affected the rapid
presence of new interface designs to meet the needs of user interaction. This includes the …

Enhancing Communication Equity: Evaluation of an Automated Speech Recognition Application in Ghana

G Ayoka, G Barbareschi, R Cave… - Proceedings of the CHI …, 2024 - dl.acm.org
In Ghana people who struggle to articulate speech as a result of different conditions
experience barriers in interacting with others due to difficulties in being understood …