A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings

L Van Staden, H Kamper - 2021 IEEE Spoken Language …, 2021 - ieeexplore.ieee.org
Many speech processing tasks involve measuring the acoustic similarity between speech
segments. Acoustic word embeddings (AWEs) allow for efficient comparisons by mapping …

Improving Acoustic Word Embeddings through Correspondence Training of Self-supervised Speech Representations

A Meghanani, T Hain - arXiv preprint arXiv:2403.08738, 2024 - arxiv.org
Acoustic word embeddings (AWEs) are vector representations of spoken words. An effective
method for obtaining AWEs is the Correspondence Auto-Encoder (CAE). In the past, the …

Analyzing acoustic word embeddings from pre-trained self-supervised speech models

R Sanabria, H Tang, S Goldwater - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Given the strong results of self-supervised models on various tasks, there have been
surprisingly few studies exploring self-supervised representations for acoustic word …

Self-supervised acoustic word embedding learning via correspondence transformer encoder

J Lin, X Yue, J Ao, H Li - arXiv preprint arXiv:2307.09871, 2023 - arxiv.org
Acoustic word embeddings (AWEs) aim to map a variable-length speech segment into a
fixed-dimensional representation. High-quality AWEs should be invariant to variations, such …

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition

A Saliba, Y Li, R Sanabria, C Lai - arXiv preprint arXiv:2402.02617, 2024 - arxiv.org
The efficacy of self-supervised speech models has been validated, yet the optimal utilization
of their representations remains challenging across diverse tasks. In this study, we delve into …

Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

C Jacobs, Y Matusevych… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
Acoustic word embeddings (AWEs) are fixed-dimensional representations of variable-length
speech segments. For zero-resource languages where labelled data is not available, one …

Supervised acoustic embeddings and their transferability across languages

S Ram, H Aldarmaki - arXiv preprint arXiv:2301.01020, 2023 - arxiv.org
In speech recognition, it is essential to model the phonetic content of the input signal while
discarding irrelevant factors such as speaker variations and noise, which is challenging in …

Integrating form and meaning: A multi-task learning model for acoustic word embeddings

BM Abdullah, B Möbius, D Klakow - arXiv preprint arXiv:2209.06633, 2022 - arxiv.org
Models of acoustic word embeddings (AWEs) learn to map variable-length spoken word
segments onto fixed-dimensionality vector representations such that different acoustic …

Analyzing autoencoder-based acoustic word embeddings

Y Matusevych, H Kamper, S Goldwater - arXiv preprint arXiv:2004.01647, 2020 - arxiv.org
Recent studies have introduced methods for learning acoustic word embeddings (AWEs)---
fixed-size vector representations of words which encode their acoustic features. Despite the …

Multilingual jointly trained acoustic and written word embeddings

Y Hu, S Settle, K Livescu - arXiv preprint arXiv:2006.14007, 2020 - arxiv.org
Acoustic word embeddings (AWEs) are vector representations of spoken word segments.
AWEs can be learned jointly with embeddings of character sequences, to generate …