An Improved Visual Speech Recognition of Isolated Words using Combined Pixel and Geometric Features

Authors

N. Radha, A. Shahina, A. Nayeemulla Khan

Publication date

2016/12

Journal

Indian Journal of Science and Technology

Volume

Issue

Pages

1-7

Publisher

indjst

Description

Objectives: This paper proposes a method to improve the performance of a Visual Speech Recognition (VSR) system by combining pixel-based and geometry-based features, so as to augment the performance of audio-based Automatic Speech Recognition (ASR) systems in adverse conditions.

Methods/Statistical Analysis: A video database comprising 11,000 utterances of isolated words, collected from 20 speakers, is used in this study. Pixel-based features (DCT and DWT) and geometric features (Active Shape Model, or ASM) are fused at two levels: the feature level and the decision level. A simple Gaussian mixture HMM word model is built for feature-level fusion, while a two-stream HMM is built for decision-level fusion.

Findings: The VSR system built using the combined features shows a significant improvement in performance over the individual VSR systems built using pixel-based or geometric features alone. The individual systems reach an accuracy of 76% for geometric features, 64% for DCT, and 72% for DWT pixel-based features. Feature-level fusion improves the accuracy to 80% for ASM+DCT and 84.7% for ASM+DWT. Weighted decision-level fusion yields a further improvement, with an accuracy of 84% for ASM+DCT and 92% for ASM+DWT.

Application/Improvements: The combined VSR system could be preferred over individual pixel-based or geometric feature-based systems to augment the performance of audio-based ASR systems in adverse conditions. Further studies on improving the VSR system, which could be used in lieu of audio-based ASR …
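The two fusion levels named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the example log-likelihoods, and the single stream weight are all hypothetical, and a real system would obtain the per-word scores from trained HMMs. Feature-level fusion is shown as plain concatenation of the pixel and geometric vectors for one frame; decision-level fusion is shown as a weighted sum of per-word log-likelihoods from the two streams.

```python
from typing import Dict, List


def fuse_features(pixel: List[float], geometric: List[float]) -> List[float]:
    """Feature-level fusion (assumed form): concatenate the two vectors of one frame."""
    return list(pixel) + list(geometric)


def fuse_decisions(pixel_ll: Dict[str, float],
                   geom_ll: Dict[str, float],
                   weight: float) -> str:
    """Decision-level fusion (assumed form): weighted sum of per-word log-likelihoods.

    `weight` is the hypothetical pixel-stream weight in [0, 1];
    the geometric stream receives 1 - weight.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if pixel_ll.keys() != geom_ll.keys():
        raise ValueError("both streams must score the same vocabulary")
    combined = {
        word: weight * pixel_ll[word] + (1.0 - weight) * geom_ll[word]
        for word in pixel_ll
    }
    # Pick the word with the highest combined score.
    return max(combined, key=combined.get)


# Made-up scores for a two-word vocabulary (illustration only).
pixel_scores = {"yes": -120.0, "no": -118.0}
geom_scores = {"yes": -95.0, "no": -101.0}

fused_frame = fuse_features([0.2, 0.7], [1.5])
equal_weight_choice = fuse_decisions(pixel_scores, geom_scores, weight=0.5)
pixel_only_choice = fuse_decisions(pixel_scores, geom_scores, weight=1.0)
```

With these made-up scores the two streams disagree, so the chosen word depends on the stream weight. In practice such a weight would typically be tuned on held-out data.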

Total citations

Cited by 8

2018: 1, 2019: 2, 2020: 2, 2021: 1, 2022: 2
