Visual Speech Recognition for Kannada Language Using VGG16 Convolutional Neural Network

Authors

Shashidhar Rudregowda, Sudarshan Patil Kulkarni, Gururaj HL, Vinayakumar Ravi, Moez Krichen

Publication date

2023/3/16

Journal

Acoustics

Volume

Issue

Pages

343-353

Publisher

MDPI

Description

Visual speech recognition (VSR) is a method of reading speech by observing the lip movements of speakers. It depends heavily on the visual features derived from image sequences. VSR is a demanding problem that poses challenging tasks for both human and machine-based procedures, and VSR methods address these tasks using machine learning. Visual speech helps people who are hearing impaired, laryngeal patients, and people in noisy environments. In this research, the authors developed their own dataset for the Kannada language. The dataset contains five randomly chosen words: Avanu, Bagge, Bari, Guruthu, and Helida. The average duration of each video is 1 s to 1.2 s. A machine learning method is used for feature extraction and classification. The authors applied a VGG16 convolutional neural network with the ReLU activation function to the custom dataset and obtained an accuracy of 91.90%, which confirms the effectiveness of the recommended system. The proposed system is compared with HCNN, ResNet-LSTM, Bi-LSTM, and GLCM-ANN, and the comparison further supports its effectiveness.
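As a rough illustration of the classification step described above, the sketch below passes a feature vector through a ReLU hidden layer and a five-way softmax output, one class per word in the dataset. It is not the authors' implementation: the feature size, hidden size, random weights, and helper names (`dense`, `classify`, `random_matrix`) are assumptions made for illustration, and the VGG16 convolutional feature extractor is replaced by a placeholder vector.

```python
import math
import random

# The five Kannada words listed in the abstract.
WORDS = ["Avanu", "Bagge", "Bari", "Guruthu", "Helida"]


def relu(values):
    """ReLU activation: negative inputs become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax that returns class probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def classify(features, hidden_w, hidden_b, out_w, out_b):
    """Hypothetical head: dense + ReLU, then dense + softmax over WORDS."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    probs = softmax(dense(hidden, out_w, out_b))
    best = max(range(len(WORDS)), key=lambda i: probs[i])
    return WORDS[best], probs


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed sizes: 8 input features (standing in for CNN features), 6 hidden units.
rng = random.Random(0)
features = [rng.uniform(0.0, 1.0) for _ in range(8)]
hidden_w = random_matrix(6, 8, rng)
out_w = random_matrix(len(WORDS), 6, rng)
word, probs = classify(features, hidden_w, [0.0] * 6, out_w, [0.0] * len(WORDS))
print(word, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted word is meaningless here; the point is only the shape of the computation, with ReLU in the hidden layer and one output probability per word.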

Total citations

Cited by 13

Citations per year: 2023: 5; 2024: 8

Scholar articles

Visual Speech Recognition for Kannada Language Using VGG16 Convolutional Neural Network

S Rudregowda, S Patil Kulkarni, G HL, V Ravi… - Acoustics, 2023

Cited by 13 · Related articles · All 10 versions