A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

[BOOK][B] Designing and evaluating language corpora: A practical framework for corpus representativeness

J Egbert, D Biber, B Gray - 2022 - books.google.com
Corpora are ubiquitous in linguistic research, yet to date, there has been no consensus on
how to conceptualize corpus representativeness and collect corpus samples. This …

speechocean762: An open-source non-native English speech corpus for pronunciation assessment

J Zhang, Z Zhang, Y Wang, Z Yan, Q Song… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper introduces a new open-source speech corpus named "speechocean762"
designed for pronunciation assessment use, consisting of 5000 English utterances from 250 …

What all do audio transformer models hear? Probing acoustic representations for language delivery and its structure

J Shah, YK Singla, C Chen, RR Shah - arXiv preprint arXiv:2101.00387, 2021 - arxiv.org
In recent times, BERT based transformer models have become an inseparable part of
the 'tech stack' of text processing models. Similar progress is being observed in the speech …

Personalizing ASR for dysarthric and accented speech with limited data

J Shor, D Emanuel, O Lang, O Tuval, M Brenner… - arXiv preprint arXiv …, 2019 - arxiv.org
Automatic speech recognition (ASR) systems have dramatically improved over the last few
years. ASR systems are most often trained from 'typical' speech, which means that …

[PDF][PDF] Explore wav2vec 2.0 for Mispronunciation Detection.

X Xu, Y Kang, S Cao, B Lin, L Ma - Interspeech, 2021 - isca-archive.org
This paper presents an initial attempt to use self-supervised learning for Mispronunciation
Detection. Unlike existing methods that use speech recognition corpus to train models, we …

[PDF][PDF] A Study on Fine-Tuning wav2vec2.0 Model for the Task of Mispronunciation Detection and Diagnosis.

L Peng, K Fu, B Lin, D Ke, J Zhang - Interspeech, 2021 - isca-archive.org
Mispronunciation detection and diagnosis (MDD) technology is a key component of
computer-assisted pronunciation training system (CAPT). The mainstream method is based …

Automatic Pronunciation Assessment--A Review

YE Kheir, A Ali, SA Chowdhury - arXiv preprint arXiv:2310.13974, 2023 - arxiv.org
Pronunciation assessment and its application in computer-aided pronunciation training
(CAPT) have seen impressive progress in recent years. With the rapid growth in language …

SED-MDD: Towards sentence dependent end-to-end mispronunciation detection and diagnosis

Y Feng, G Fu, Q Chen, K Chen - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
A mispronunciation detection and diagnosis (MD&D) system typically consists of multiple
stages, such as an acoustic model, a language model and a Viterbi decoder. In order to …

A full text-dependent end to end mispronunciation detection and diagnosis with easy data augmentation techniques

K Fu, J Lin, D Ke, Y Xie, J Zhang, B Lin - arXiv preprint arXiv:2104.08428, 2021 - arxiv.org
Recently, end-to-end mispronunciation detection and diagnosis (MD&D) systems have
become a popular alternative to greatly simplify the model-building process of conventional …