Authors
Haizhou Li, Bin Ma, Chin-Hui Lee
Publication date
2006/12/19
Journal
IEEE Transactions on Audio, Speech, and Language Processing
Volume
15
Issue
1
Pages
271-284
Publisher
IEEE
Description
We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). It is assumed that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by the acoustic segment models (ASMs). A spoken utterance is then decoded into a sequence of ASM units. The ASM framework furthers the idea of language-independent phone models for LID by introducing an unsupervised learning procedure to circumvent the need for phonetic transcription. Analogous to representing a text document as a term vector, we convert a spoken utterance into a feature vector with its attributes representing the co-occurrence statistics of the acoustic units. As such, we can build a vector space classifier for LID. The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated …
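The text-document analogy in the description can be illustrated with a minimal sketch. It assumes an utterance has already been decoded into a sequence of acoustic unit labels; the unit names (`a1`, `a2`, `a3`), the helper functions, and the choice of bigram counts as the co-occurrence statistic are all hypothetical, and cosine similarity stands in here for the discriminative classifier backend the description mentions, which is not reproduced.

```python
from collections import Counter
from itertools import product
import math


def bigram_vector(units, inventory):
    """Map a decoded unit sequence to a vector of normalized bigram counts.

    The vector has one attribute per ordered pair of units in the inventory,
    mirroring a term vector built over a text document's vocabulary.
    """
    counts = Counter(zip(units, units[1:]))
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[pair] / total for pair in product(inventory, repeat=2)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical unit inventory and decoded utterances (illustrative only).
inventory = ["a1", "a2", "a3"]
utt_x = ["a1", "a2", "a1", "a2", "a3"]
utt_y = ["a1", "a2", "a1", "a2", "a1"]
utt_z = ["a3", "a3", "a2", "a3", "a3"]

vx = bigram_vector(utt_x, inventory)
vy = bigram_vector(utt_y, inventory)
vz = bigram_vector(utt_z, inventory)

print(cosine(vx, vy), cosine(vx, vz))
```

In this toy setup, utterances that share unit-pair patterns end up closer in the vector space, which is the property a vector space classifier would exploit.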
Total citations
[Bar chart: citations per year, 2006 to 2024]
Scholar articles
H Li, B Ma, CH Lee - IEEE Transactions on Audio, Speech, and Language …, 2006