While textless Spoken Language Models (SLMs) have shown potential in end-to-end speech-to-speech modeling, they still lag behind text-based Large Language Models …
R Komatsu, T Shinozaki - 2024 IEEE Spoken Language …, 2024 - ieeexplore.ieee.org
Self-supervised speech representation learning has become essential for extracting meaningful features from untranscribed audio. Recent advances highlight the potential of …
T Maekaku, J Shi, X Chang, Y Fujita… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Recently, the usefulness of self-supervised representation learning (SSRL) methods has been confirmed in various downstream tasks. Many of these models, as exemplified by …
HC Fang, NX Ye, YJ Shih, P Peng, HF Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in self-supervised speech models have shown significant improvement in many downstream tasks. However, these models have predominantly centered on frame-level …