Performing predefined tasks using the human–robot interaction on speech recognition for an industrial robot

MC Bingol, O Aydogmus - Engineering Applications of Artificial Intelligence, 2020 - Elsevier
People who are not experts in robotics can easily implement complex robotic applications by
using human–robot interaction (HRI). HRI systems require many complex operations such …

Advances and challenges in deep lip reading

M Oghbaie, A Sabaghi, K Hashemifard… - arXiv preprint arXiv …, 2021 - arxiv.org
Driven by deep learning techniques and large-scale datasets, recent years have witnessed
a paradigm shift in automatic lip reading. While the main thrust of Visual Speech …

End-to-end video-to-speech synthesis using generative adversarial networks

R Mira, K Vougioukas, P Ma, S Petridis… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Video-to-speech is the process of reconstructing the audio speech from a video of a spoken
utterance. Previous approaches to this task have relied on a two-step process where an …

Lip-reading with densely connected temporal convolutional networks

P Ma, Y Wang, J Shen, S Petridis… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this work, we present the Densely Connected Temporal Convolutional Network (DC-TCN)
for lip-reading of isolated words. Although Temporal Convolutional Networks (TCN) have …

[HTML][HTML] Turkish lip-reading using Bi-LSTM and deep learning models

Ü Atila, F Sabaz - Engineering Science and Technology, an International …, 2022 - Elsevier
In recent years, lip-reading has been one of the studies whose importance has increased
considerably, especially with the spread of deep learning applications. In this topic …

Visual speech recognition for small scale dataset using VGG16 convolution neural network

S Patilkulkarni - Multimedia Tools and Applications, 2021 - Springer
Visual speech recognition is a method that comprehends speech from speakers lip
movements and the speech is validated only by the shape and lip movement …

Visual speech recognition for kannada language using vgg16 convolutional neural network

S Rudregowda, S Patil Kulkarni, G HL, V Ravi… - Acoustics, 2023 - mdpi.com
Visual speech recognition (VSR) is a method of reading speech by noticing the lip actions of
the narrators. Visual speech significantly depends on the visual features derived from the …

E2E-V2SResNet: Deep residual convolutional neural networks for end-to-end video driven speech synthesis

N Saleem, J Gao, M Irfan, E Verdu, JP Fuente - Image and Vision …, 2022 - Elsevier
Speechreading which infers spoken message from a visually detected articulated facial
trend is a challenging task. In this paper, we propose an end-to-end ResNet (E2E-ResNet) …

A Comprehensive Review of Recent Advances in Deep Neural Networks for Lipreading with Sign Language Recognition

N Rathipriya, N Maheswari - IEEE Access, 2024 - ieeexplore.ieee.org
Lip reading is a form of “listening” to people that happens visually. It's also referred to as
“Speech reading.” This is done by observing the speaker's face and listening to the spoken …

Visual speech enhancement without a real visual stream

SB Hegde, KR Prajwal… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this work, we re-think the task of speech enhancement in unconstrained real-world
environments. Current state-of-the-art methods use only the audio stream and are limited in …