DPHuBERT: Joint distillation and pruning of self-supervised speech models

Y Peng, Y Sudo, S Muhammad, S Watanabe - arXiv preprint arXiv …, 2023 - arxiv.org
Self-supervised learning (SSL) has achieved notable success in many speech processing
tasks, but the large model size and heavy computational cost hinder the deployment …
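
The title pairs layer-wise distillation from a frozen teacher with structured pruning learned jointly during training. A minimal PyTorch sketch of such a joint objective, using simple sigmoid gates in place of the hard-concrete gates typically used for this, with illustrative loss weights and layer sizes (a sketch of the idea, not the authors' exact recipe):

import torch
import torch.nn.functional as F

class GatedLinear(torch.nn.Module):
    """Linear layer whose output channels can be pruned via learnable gates."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)
        self.gate_logits = torch.nn.Parameter(torch.zeros(d_out))  # one gate per channel
    def forward(self, x):
        g = torch.sigmoid(self.gate_logits)            # soft gate in (0, 1)
        return self.linear(x) * g
    def expected_density(self):
        return torch.sigmoid(self.gate_logits).mean()  # fraction of channels kept

# toy student layer distilled against a frozen teacher projection
teacher = torch.nn.Linear(64, 64).eval()
student = GatedLinear(64, 64)
x = torch.randn(8, 100, 64)                            # (batch, frames, dim)

with torch.no_grad():
    t_out = teacher(x)
s_out = student(x)

# distillation term: L1 plus (1 - cosine) between student and teacher features
distill = F.l1_loss(s_out, t_out) + 1 - F.cosine_similarity(s_out, t_out, dim=-1).mean()
target_density = 0.5                                   # illustrative: keep ~50% of channels
sparsity = (student.expected_density() - target_density).abs()
loss = distill + 10.0 * sparsity                       # joint distillation + pruning objective
loss.backward()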

DinoSR: Self-distillation and online clustering for self-supervised speech representation learning

AH Liu, HJ Chang, M Auli, WN Hsu… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this paper, we introduce self-distillation and online clustering for self-supervised speech
representation learning (DinoSR), which combines masked language modeling, self …
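
The snippet describes combining an EMA teacher (self-distillation) with online clustering to produce discrete targets for masked prediction. A toy PyTorch sketch of that combination, with linear layers standing in for the transformer encoders and input masking omitted for brevity (illustrative only, not the paper's implementation):

import torch
import torch.nn.functional as F

dim, n_codes, tau = 64, 16, 0.999
student = torch.nn.Linear(dim, dim)
teacher = torch.nn.Linear(dim, dim)
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)
codebook = F.normalize(torch.randn(n_codes, dim), dim=-1)
head = torch.nn.Linear(dim, n_codes)                    # student prediction head

x = torch.randn(8, 100, dim)
mask = torch.rand(8, 100) < 0.5                         # masked positions

with torch.no_grad():
    t = F.normalize(teacher(x), dim=-1)
    targets = (t @ codebook.t()).argmax(-1)             # online cluster assignment
    # EMA codebook update from the teacher features assigned to each code
    for k in range(n_codes):
        sel = t[targets == k]
        if len(sel):
            codebook[k] = F.normalize(tau * codebook[k] + (1 - tau) * sel.mean(0), dim=-1)

logits = head(student(x))
loss = F.cross_entropy(logits[mask], targets[mask])     # predict cluster ids at masked frames
loss.backward()

# EMA update of the teacher after each optimizer step (optimizer omitted here)
with torch.no_grad():
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(tau).add_((1 - tau) * ps)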

Reducing barriers to self-supervised learning: HuBERT pre-training with academic compute

W Chen, X Chang, Y Peng, Z Ni, S Maiti… - arXiv preprint arXiv …, 2023 - arxiv.org
Self-supervised learning (SSL) has led to great strides in speech processing. However, the
resources needed to train these models have become prohibitively large as they continue to …

Structured pruning of self-supervised pre-trained models for speech recognition and understanding

Y Peng, K Kim, F Wu, P Sridhar… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Self-supervised speech representation learning (SSL) has been shown to be effective in various
downstream tasks, but SSL models are usually large and slow. Model compression …
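
Structured pruning of this kind is commonly implemented with L0 regularization via hard-concrete gates (Louizos et al.) over units such as attention heads. A minimal PyTorch sketch with illustrative hyperparameters, not necessarily this paper's configuration:

import math
import torch

def hard_concrete(logits, beta=0.5, l=-0.1, r=1.1):
    # sample a stretched, clipped binary-concrete gate per unit
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
    s = torch.sigmoid((u.log() - (1 - u).log() + logits) / beta)
    return (s * (r - l) + l).clamp(0.0, 1.0)

def expected_l0(logits, beta=0.5, l=-0.1, r=1.1):
    # differentiable expected number of nonzero gates
    return torch.sigmoid(logits - beta * math.log(-l / r)).sum()

head_logits = torch.nn.Parameter(torch.zeros(12))       # one gate per attention head
gates = hard_concrete(head_logits)                      # sampled per-head masks in [0, 1]
attn_out = torch.randn(8, 100, 12, 64) * gates.view(1, 1, 12, 1)
task_loss = attn_out.pow(2).mean()                      # stand-in for the downstream loss
loss = task_loss + 0.01 * expected_l0(head_logits)      # L0 penalty drives heads to zero
loss.backward()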

On compressing sequences for self-supervised speech models

Y Meng, HJ Chen, J Shi, S Watanabe… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Compressing self-supervised models has become increasingly necessary as these models grow
larger. While previous approaches have primarily focused on …
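
One straightforward way to compress the sequence is to subsample frames between encoder layers so the upper layers run on fewer timesteps. A toy PyTorch sketch of fixed-rate average pooling, one of several possible schemes, with linear layers standing in for transformer blocks:

import torch

pool = torch.nn.AvgPool1d(kernel_size=2, stride=2)
lower = torch.nn.Linear(64, 64)   # stand-in for the lower transformer layers
upper = torch.nn.Linear(64, 64)   # stand-in for the upper transformer layers

x = torch.randn(8, 100, 64)                        # (batch, frames, dim)
h = lower(x)
h = pool(h.transpose(1, 2)).transpose(1, 2)        # (8, 50, 64): half the frames
out = upper(h)                                     # upper layers see a 2x shorter sequence
print(out.shape)                                   # torch.Size([8, 50, 64])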

Ensemble knowledge distillation of self-supervised speech models

KP Huang, TH Feng, YK Fu, TY Hsu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Distilled self-supervised models have shown competitive performance and efficiency in
recent years. However, there is a lack of experience in jointly distilling multiple self …
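
A common way to distill from several teachers at once is a shared student encoder with one projection head per teacher, averaging the per-teacher distillation losses. A minimal PyTorch sketch with illustrative losses and uniform weighting, not necessarily the paper's scheme:

import torch
import torch.nn.functional as F

dim = 64
teachers = [torch.nn.Linear(dim, dim).eval() for _ in range(3)]  # frozen SSL teachers
student = torch.nn.Linear(dim, dim)
heads = torch.nn.ModuleList(torch.nn.Linear(dim, dim) for _ in teachers)

x = torch.randn(8, 100, dim)
h = student(x)
loss = 0.0
for head, teacher in zip(heads, teachers):
    with torch.no_grad():
        t = teacher(x)
    s = head(h)                                    # teacher-specific projection
    loss = loss + F.l1_loss(s, t) + (1 - F.cosine_similarity(s, t, dim=-1).mean())
loss = loss / len(teachers)
loss.backward()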

Fine-tuning strategies for faster inference using speech self-supervised models: a comparative study

S Zaiem, R Algayres, T Parcollet… - … , Speech, and Signal …, 2023 - ieeexplore.ieee.org
Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech
Recognition (ASR) performance in low-resource settings. In this context, it has been …

USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models

S Ding, D Qiu, D Rim, Y He, O Rybakov… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
End-to-end automatic speech recognition (ASR) models have seen revolutionary quality
gains with the recent development of large-scale universal speech models (USM). However …
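
Quantization- and sparsity-aware fine-tuning typically keeps dense float weights and applies fake quantization plus a pruning mask in the forward pass, with a straight-through estimator for gradients. A toy PyTorch sketch in which int8 and 50% magnitude pruning are illustrative choices, not USM-Lite's recipe:

import torch

def fake_quant(w, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    q = (w / scale).round().clamp(-qmax, qmax) * scale
    return w + (q - w).detach()                    # forward: quantized; backward: identity

layer = torch.nn.Linear(64, 64)
k = int(0.5 * layer.weight.numel())
threshold = layer.weight.abs().flatten().kthvalue(k).values
mask = (layer.weight.abs() > threshold).float()    # keep the largest ~50% of weights

x = torch.randn(8, 64)
w = fake_quant(layer.weight * mask)                # sparsity + quantization in the forward pass
out = torch.nn.functional.linear(x, w, layer.bias)
out.pow(2).mean().backward()                       # stand-in loss; gradients reach dense weights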

Match to win: Analysing sequences lengths for efficient self-supervised learning in speech and audio

Y Gao, J Fernandez-Marques… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Self-supervised learning (SSL) has proven vital in speech and audio-related applications.
The paradigm trains a general model on unlabeled data that can later be used to solve …

EFFUSE: Efficient self-supervised feature fusion for E2E ASR in multilingual and low resource scenarios

T Srivastava, J Shi, W Chen, S Watanabe - arXiv preprint arXiv …, 2023 - arxiv.org
Self-Supervised Learning (SSL) models have demonstrated exceptional performance in
various speech tasks, particularly in low-resource and multilingual domains. Recent works …
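
Fusion baselines of this kind combine frame-level features from several SSL models with learned weights before the downstream head; EFFUSE's aim is to make that cheaper by not running every model at inference. A minimal PyTorch sketch of only the plain fusion step, with all sizes and names illustrative:

import torch

dim, n_models = 64, 3
feats = [torch.randn(8, 100, dim) for _ in range(n_models)]   # pre-extracted SSL features
fusion_weights = torch.nn.Parameter(torch.zeros(n_models))
asr_head = torch.nn.Linear(dim, 30)                           # e.g. character vocabulary

w = torch.softmax(fusion_weights, dim=0)                      # learned per-model weights
fused = sum(wi * f for wi, f in zip(w, feats))                # (8, 100, 64)
logits = asr_head(fused)
logits.pow(2).mean().backward()                               # stand-in for the CTC loss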