Generative pre-training for speech with autoregressive predictive coding

YA Chung, J Glass - ICASSP 2020 (IEEE International Conference on Acoustics, Speech and Signal Processing), 2020 - ieeexplore.ieee.org
Learning meaningful and general representations from unannotated speech that are
applicable to a wide range of tasks remains challenging. In this paper we propose to use …
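
The autoregressive predictive coding (APC) objective named in this entry is simple enough to sketch. Below is a minimal, illustrative PyTorch version, assuming log-mel input features and an L1 future-frame prediction loss as described in the APC papers; the layer sizes and time shift are placeholders, not values from the paper.

```python
# Illustrative sketch of autoregressive predictive coding (APC):
# a causal RNN predicts the frame `shift` steps ahead under an L1 loss.
import torch
import torch.nn as nn

class APC(nn.Module):
    def __init__(self, n_mels=80, hidden=512, layers=3, shift=3):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, num_layers=layers, batch_first=True)
        self.proj = nn.Linear(hidden, n_mels)  # regress a future log-mel frame
        self.shift = shift                     # how far ahead to predict

    def forward(self, x):                      # x: (batch, time, n_mels)
        h, _ = self.rnn(x)                     # left-to-right (causal) encoding
        pred = self.proj(h[:, :-self.shift])   # predictions for frame t + shift
        target = x[:, self.shift:]             # ground-truth future frames
        loss = torch.abs(pred - target).mean() # L1 regression loss
        return loss, h                         # h doubles as the representation

feats = torch.randn(4, 100, 80)  # a fake batch of log-mel features
loss, reps = APC()(feats)
loss.backward()
```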

Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models

H Kamper - ICASSP 2019 (IEEE International Conference on Acoustics, Speech and Signal Processing), 2019 - ieeexplore.ieee.org
We investigate unsupervised models that can map a variable-duration speech segment to a
fixed-dimensional representation. In settings where unlabelled speech is the only available …
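
The core recipe behind such encoder-decoder acoustic word embeddings can be sketched as follows. This is an illustrative reconstruction, not Kamper's exact model: an RNN encoder collapses a variable-length segment into a fixed vector, and a decoder conditioned on that vector reconstructs a target segment. In the correspondence variant, the target is a different discovered instance of the putatively same word (the weak top-down constraint); with target equal to the input, it reduces to a plain autoencoder, as in the audio word2vec entry below. Names and sizes here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class EncDecAWE(nn.Module):
    """Encoder-decoder acoustic word embedding (illustrative sketch)."""
    def __init__(self, n_mels=13, embed=130, hidden=400):
        super().__init__()
        self.enc = nn.GRU(n_mels, hidden, batch_first=True)
        self.to_embed = nn.Linear(hidden, embed)  # fixed-dimensional AWE
        self.dec = nn.GRU(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_mels)

    def embed_segment(self, x):           # x: (batch, time, n_mels)
        _, h = self.enc(x)                # final hidden state summarizes segment
        return self.to_embed(h[-1])       # (batch, embed)

    def forward(self, x, target):
        z = self.embed_segment(x)
        # Feed the embedding at every decoder step (one common conditioning
        # choice); for simplicity this assumes same-length target segments.
        z_rep = z.unsqueeze(1).expand(-1, target.size(1), -1)
        d, _ = self.dec(z_rep)
        recon = self.out(d)
        return ((recon - target) ** 2).mean()  # reconstruction loss

# Correspondence training: `target` is another spoken instance of the same
# word, found by an unsupervised term-discovery system; target == x gives a
# plain segment autoencoder.
```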

Improved speech representations with multi-target autoregressive predictive coding

YA Chung, J Glass - arXiv preprint arXiv:2004.05274, 2020 - arxiv.org
Training objectives based on predictive coding have recently been shown to be very
effective at learning meaningful representations from unlabeled speech. One example is …

Semantic association computation: a comprehensive survey

S Jabeen, X Gao, P Andreae - Artificial Intelligence Review, 2020 - Springer
Semantic association computation is the process of quantifying the strength of a semantic
connection between two textual units, based on different types of semantic relations …

Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

C Jacobs, Y Matusevych… - 2021 IEEE Spoken Language Technology Workshop (SLT), 2021 - ieeexplore.ieee.org
Acoustic word embeddings (AWEs) are fixed-dimensional representations of variable-length
speech segments. For zero-resource languages where labelled data is not available, one …
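
One common form of such a self-supervised contrastive objective (an assumption here; the paper's exact loss may differ) treats two segments of the same discovered word type as a positive pair and the other in-batch embeddings as negatives:

```python
import torch
import torch.nn.functional as F

def contrastive_awe_loss(anchor, positive, temperature=0.1):
    """NT-Xent-style contrastive loss over acoustic word embeddings.

    anchor, positive: (batch, dim) embeddings of two segments assumed to be
    instances of the same word type; other items in the batch act as negatives.
    Illustrative only; hyperparameters are placeholders.
    """
    a = F.normalize(anchor, dim=1)
    p = F.normalize(positive, dim=1)
    logits = a @ p.t() / temperature   # (batch, batch) cosine similarities
    labels = torch.arange(a.size(0))   # matching pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

loss = contrastive_awe_loss(torch.randn(32, 128), torch.randn(32, 128))
```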

Completely unsupervised speech recognition by a generative adversarial network harmonized with iteratively refined hidden Markov models

KY Chen, CP Tsai, DR Liu, HY Lee, L Lee - arXiv preprint arXiv …, 2019 - arxiv.org
Producing a large annotated speech corpus for training ASR systems remains difficult for more than 95% of the world's languages, which are low-resourced, but collecting a …

Audio word2vec: Sequence-to-sequence autoencoding for unsupervised learning of audio segmentation and representation

YC Chen, SF Huang, H Lee, YH Wang… - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019 - ieeexplore.ieee.org
In text processing, word2vec transforms each word into a fixed-size vector used as a basic component in natural language processing applications. Given a large collection of unannotated …

AIPNet: Generative adversarial pre-training of accent-invariant networks for end-to-end speech recognition

YC Chen, Z Yang, CF Yeh, M Jain… - ICASSP 2020 (IEEE International Conference on Acoustics, Speech and Signal Processing), 2020 - ieeexplore.ieee.org
As one of the major sources of speech variability, accents have posed a grand challenge to the robustness of speech recognition systems. In this paper, our goal is to build a unified end …

Acoustically grounded word embeddings for improved acoustics-to-word speech recognition

S Settle, K Audhkhasi, K Livescu… - ICASSP 2019 (IEEE International Conference on Acoustics, Speech and Signal Processing), 2019 - ieeexplore.ieee.org
Direct acoustics-to-word (A2W) systems for end-to-end automatic speech recognition are
simpler to train, and more efficient to decode with, than sub-word systems. However, A2W …

Multilingual acoustic word embedding models for processing zero-resource languages

H Kamper, Y Matusevych… - ICASSP 2020 (IEEE International Conference on Acoustics, Speech and Signal Processing), 2020 - ieeexplore.ieee.org
Acoustic word embeddings are fixed-dimensional representations of variable-length speech
segments. In settings where unlabelled speech is the only available resource, such …