End-to-end visual speech recognition for small-scale datasets

Performing predefined tasks using the human–robot interaction on speech recognition for an industrial robot

MC Bingol, O Aydogmus - Engineering Applications of Artificial Intelligence, 2020 - Elsevier

People who are not experts in robotics can easily implement complex robotic applications by
using human–robot interaction (HRI). HRI systems require many complex operations such …

Cited by 89 · Related articles

Advances and challenges in deep lip reading

M Oghbaie, A Sabaghi, K Hashemifard… - arXiv preprint arXiv …, 2021 - arxiv.org

Driven by deep learning techniques and large-scale datasets, recent years have witnessed
a paradigm shift in automatic lip reading. While the main thrust of Visual Speech …

Cited by 14 · Related articles · All 3 versions

End-to-end video-to-speech synthesis using generative adversarial networks

R Mira, K Vougioukas, P Ma, S Petridis… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Video-to-speech is the process of reconstructing the audio speech from a video of a spoken
utterance. Previous approaches to this task have relied on a two-step process where an …

Cited by 55 · Related articles · All 6 versions

Lip-reading with densely connected temporal convolutional networks

P Ma, Y Wang, J Shen, S Petridis… - Proceedings of the …, 2021 - openaccess.thecvf.com

In this work, we present the Densely Connected Temporal Convolutional Network (DC-TCN)
for lip-reading of isolated words. Although Temporal Convolutional Networks (TCN) have …

Cited by 69 · Related articles · All 10 versions

Turkish lip-reading using Bi-LSTM and deep learning models

Ü Atila, F Sabaz - Engineering Science and Technology, an International …, 2022 - Elsevier

In recent years, lip-reading has been one of the studies whose importance has increased
considerably, especially with the spread of deep learning applications. In this topic …

Cited by 24 · Related articles · All 2 versions

Visual speech recognition for small scale dataset using VGG16 convolution neural network

S Patilkulkarni - Multimedia Tools and Applications, 2021 - Springer

Visual speech recognition is a method that comprehends speech from speakers lip
movements and the speech is validated only by the shape and lip movement …

Cited by 36 · Related articles · All 4 versions

Visual speech recognition for Kannada language using VGG16 convolutional neural network

S Rudregowda, S Patil Kulkarni, G HL, V Ravi… - Acoustics, 2023 - mdpi.com

Visual speech recognition (VSR) is a method of reading speech by noticing the lip actions of
the narrators. Visual speech significantly depends on the visual features derived from the …

Cited by 16 · Related articles · All 10 versions

E2E-V2SResNet: Deep residual convolutional neural networks for end-to-end video driven speech synthesis

N Saleem, J Gao, M Irfan, E Verdu, JP Fuente - Image and Vision …, 2022 - Elsevier

Speechreading which infers spoken message from a visually detected articulated facial
trend is a challenging task. In this paper, we propose an end-to-end ResNet (E2E-ResNet) …

Cited by 13 · Related articles · All 3 versions

A Comprehensive Review of Recent Advances in Deep Neural Networks for Lipreading with Sign Language Recognition

N Rathipriya, N Maheswari - IEEE Access, 2024 - ieeexplore.ieee.org

Lip reading is a form of “listening” to people that happens visually. It's also referred to as
“Speech reading.” This is done by observing the speaker's face and listening to the spoken …

Visual speech enhancement without a real visual stream

SB Hegde, KR Prajwal… - Proceedings of the …, 2021 - openaccess.thecvf.com

In this work, we re-think the task of speech enhancement in unconstrained real-world
environments. Current state-of-the-art methods use only the audio stream and are limited in …

Cited by 20 · Related articles · All 7 versions
