Deriving translational acoustic sub-word embeddings

A Meghanani, T Hain - 2023 IEEE Automatic Speech Recognition and Understanding Workshop …, 2023 - ieeexplore.ieee.org
There is a growing interest in understanding the representational geometry of acoustic word embeddings (AWEs), which are fixed-dimensional representations of spoken words. However, not much research has been conducted on acoustic sub-word embeddings (ASWEs), which can provide a better understanding of the AWE space. This work focuses on decomposing AWEs to obtain ASWEs while retaining the ability to reconstruct AWEs by translating ASWEs in the embedding space, under constrained settings. Initially, high-quality AWEs are obtained with an Average Precision (AP) score of 0.97 on the word discrimination task. Subsequently, ASWEs are derived through the decomposition of AWEs. Three adapted versions of the AP metric, utilized for evaluating the quality of the derived ASWEs and their translational properties, are proposed. The results demonstrate that the derived ASWEs exhibit high quality, and the reconstruction of AWEs from the ASWEs is achievable by translating them in the embedding space.
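
The abstract reports Average Precision (AP) on the word discrimination task without describing how it is computed. The following is a minimal sketch of one common way such a score can be calculated: every pair of embeddings is ranked by similarity, and AP measures how well same-word pairs are ranked above different-word pairs. The function name `word_discrimination_ap`, the choice of cosine similarity, and the toy data are assumptions made for illustration only; they are not taken from the paper, and the three adapted AP metrics proposed for ASWEs are not reproduced here.

```python
import numpy as np


def word_discrimination_ap(embeddings, labels):
    """Average precision over all embedding pairs (hypothetical helper).

    A pair counts as a positive when both embeddings carry the same word label.
    Pairs are ranked by cosine similarity, highest first.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)

    # Unit-normalise rows so the dot product equals cosine similarity.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = X @ X.T

    # Visit each unordered pair exactly once (upper triangle, no diagonal).
    i, j = np.triu_indices(len(y), k=1)
    scores = sim[i, j]
    same = y[i] == y[j]
    if not same.any():
        raise ValueError("need at least one same-word pair")

    # Sort pairs by decreasing similarity, then average the precision
    # measured at the rank of each same-word pair.
    order = np.argsort(-scores, kind="stable")
    same = same[order]
    precision_at_rank = np.cumsum(same) / np.arange(1, len(same) + 1)
    return float(precision_at_rank[same].mean())


# Toy data (assumed): two well-separated word clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
ap_clustered = word_discrimination_ap(toy_embeddings, ["a", "a", "b", "b"])
ap_scrambled = word_discrimination_ap(toy_embeddings, ["a", "b", "a", "b"])
print(ap_clustered, ap_scrambled)
```

With labels that match the clusters, every same-word pair outranks every different-word pair, so the score reaches its maximum of 1; scrambling the labels lowers it.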