Mutual-learning sequence-level knowledge distillation for automatic speech recognition

Z Li, Y Ming, L Yang, JH Xue - Neurocomputing, 2021 - Elsevier
Automatic speech recognition (ASR) is a crucial technology for human-machine interaction.
Recently, end-to-end deep learning models have been widely studied for ASR. However, these …

TutorNet: Towards flexible knowledge distillation for end-to-end speech recognition

JW Yoon, H Lee, HY Kim, WI Cho… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
In recent years, there has been a great deal of research on developing end-to-end speech
recognition models, which simplify the traditional pipeline while achieving …

Knowledge distillation from multiple foundation models for end-to-end speech recognition

X Yang, Q Li, C Zhang, PC Woodland - arXiv preprint arXiv:2303.10917, 2023 - arxiv.org
Although large foundation models pre-trained by self-supervised learning have achieved
state-of-the-art performance in many tasks including automatic speech recognition (ASR) …

End-to-end automatic speech recognition with deep mutual learning

R Masumura, M Ihori, A Takashima… - 2020 Asia-Pacific …, 2020 - ieeexplore.ieee.org
This paper is the first study to apply deep mutual learning (DML) to end-to-end ASR models.
In DML, multiple models are trained simultaneously and collaboratively by mimicking each …
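
As a rough illustration of the deep mutual learning idea described above, the sketch below jointly trains two peer classifiers, each combining a cross-entropy loss on the reference labels with a KL term toward the other model's predictions. The tiny linear models, dummy data, and names such as model_a and model_b are placeholders for illustration only, not the paper's actual ASR architectures or loss weighting.

```python
# Minimal deep mutual learning (DML) sketch: two peer models trained jointly,
# each matching the ground truth (cross-entropy) and the other's output (KL).
import torch
import torch.nn as nn
import torch.nn.functional as F

model_a = nn.Linear(40, 10)   # peer model A (stand-in for an ASR model)
model_b = nn.Linear(40, 10)   # peer model B
opt = torch.optim.Adam(list(model_a.parameters()) + list(model_b.parameters()), lr=1e-3)

features = torch.randn(8, 40)          # dummy acoustic features
labels = torch.randint(0, 10, (8,))    # dummy frame-level targets

for step in range(100):
    logits_a = model_a(features)
    logits_b = model_b(features)

    # Supervised losses against the reference labels.
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)

    # Mutual (mimicry) losses: each model matches the other's distribution.
    # Peer outputs are detached so each KL term only updates the imitating side.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=-1),
                    F.softmax(logits_b, dim=-1).detach(), reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=-1),
                    F.softmax(logits_a, dim=-1).detach(), reduction="batchmean")

    loss = (ce_a + kl_a) + (ce_b + kl_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Unlike classic teacher-student distillation, no model is frozen here: both peers update at every step, which is the core distinction DML draws from one-way knowledge transfer.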

Knowledge distillation using output errors for self-attention end-to-end models

HG Kim, H Na, H Lee, J Lee, TG Kang… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Most automatic speech recognition (ASR) neural network models are not suitable for mobile
devices due to their large model sizes. Therefore, the model size must be reduced to …

Knowledge transfer from pre-trained language models to CIF-based speech recognizers via hierarchical distillation

M Han, F Chen, J Shi, S Xu, B Xu - arXiv preprint arXiv:2301.13003, 2023 - arxiv.org
Large-scale pre-trained language models (PLMs) have shown great potential in natural
language processing tasks. Leveraging the capabilities of PLMs to enhance automatic …

Distilling knowledge from ensembles of acoustic models for joint CTC-attention end-to-end speech recognition

Y Gao, T Parcollet, ND Lane - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
Knowledge distillation has been widely used to compress existing deep learning models
while preserving performance across a wide range of applications. In the specific context of …

Inter-KD: Intermediate knowledge distillation for CTC-based automatic speech recognition

JW Yoon, BJ Woo, S Ahn, H Lee… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Recently, advances in deep learning have brought considerable improvements to end-to-end
speech recognition, simplifying the traditional pipeline while producing …

Comparison of soft and hard target RNN-T distillation for large-scale ASR

D Hwang, KC Sim, Y Zhang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Knowledge distillation is an effective machine learning technique to transfer knowledge from
a teacher model to a smaller student model, especially with unlabeled data. In this paper, we …
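
To make the soft- versus hard-target contrast in this entry concrete, the sketch below distills a frozen teacher into a smaller student on unlabeled inputs, once via the teacher's full temperature-softened distribution and once via its argmax pseudo-labels. This is a simplified frame-level classifier illustration under assumed names (teacher, student, T); the paper's actual RNN-T formulation distills over transducer lattices and is not reproduced here.

```python
# Simplified contrast of soft- vs hard-target distillation on unlabeled data.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(40, 10)   # stand-in for a large pre-trained teacher
student = nn.Linear(40, 10)   # stand-in for a smaller student
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                       # softmax temperature for soft targets

unlabeled = torch.randn(8, 40)   # dummy unlabeled acoustic features

with torch.no_grad():
    teacher_logits = teacher(unlabeled)   # teacher is frozen during distillation

student_logits = student(unlabeled)

# Soft targets: match the teacher's full (temperature-softened) distribution.
soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                     F.softmax(teacher_logits / T, dim=-1),
                     reduction="batchmean") * (T * T)

# Hard targets: treat the teacher's top prediction as a pseudo-label.
pseudo_labels = teacher_logits.argmax(dim=-1)
hard_loss = F.cross_entropy(student_logits, pseudo_labels)

loss = soft_loss  # or hard_loss, or a weighted mix of the two
opt.zero_grad()
loss.backward()
opt.step()
```

Soft targets carry the teacher's uncertainty over all classes, while hard targets keep only its single best hypothesis; the paper's comparison weighs exactly this trade-off at scale.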

Incremental learning for end-to-end automatic speech recognition

L Fu, X Li, L Zi, Z Zhang, Y Wu, X He… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
In this paper, we propose an incremental learning method for end-to-end Automatic Speech
Recognition (ASR) which enables an ASR system to perform well on new tasks while …