Authors
N Radha, A Shahina, P Prabha, Preethi Sri BT, Nayeemulla Khan
Publication date
2018/11/1
Journal
Pattern Recognition Letters
Volume
115
Pages
39-49
Publisher
North-Holland
Description
This paper studies the effect of combining evidence from multiple modes of speech on the recognition of different categories of sounds. Multimodal speech recognition systems are built by combining the acoustic and visual cues from (lip-radiated) normal microphone speech, throat microphone speech, and lip reading for the recognition of the 145 highly confusable consonant-vowel units of the Hindi language. The performance of the multimodal systems is compared with that of the unimodal systems for the recognition of sounds based on their place of articulation (POA) and manner of articulation (MOA), as well as their associated vowels. This comparison shows that although multimodal ASR systems rely on complementary speech-related acoustic and visual cues present in the different modes, not all evidence is complementary. Bimodal systems that combine visual cues from lip reading are shown to …
Total citations
[Chart: citations per year, 2018 to 2024]