Insights into machine lip reading

A Fernandez-Lopez, FM Sukno - Image and Vision Computing, 2018 - Elsevier

In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …

Cited by 142 Related articles All 3 versions

Review on research progress of machine lip reading

G Pu, H Wang - The Visual Computer, 2023 - Springer

Abstract Machine lip reading recognizes text content through the speaker's lip motion
information. Lip reading has significant research and application value. With the continuous …

Cited by 11 Related articles All 2 versions

[PDF] uea.ac.uk

Improved speaker independent lip reading using speaker adaptive training and deep neural networks

I Almajai, S Cox, R Harvey, Y Lan - 2016 IEEE International …, 2016 - ieeexplore.ieee.org

Recent improvements in tracking and feature extraction mean that speaker-dependent
lip-reading of continuous speech using a medium size vocabulary (around 1000 words) is …

Cited by 96 Related articles All 11 versions

CroMM-VSR: Cross-modal memory augmented visual speech recognition

M Kim, J Hong, SJ Park, YM Ro - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Visual Speech Recognition (VSR) is a task that recognizes speech from external
appearances of the face (i.e., lips) into text. Since the information from the visual lip movements …

Cited by 30 Related articles All 3 versions

[PDF] arxiv.org

Towards estimating the upper bound of visual-speech recognition: The visual lip-reading feasibility database

A Fernandez-Lopez, O Martinez… - 2017 12th IEEE …, 2017 - ieeexplore.ieee.org

Speech is the most used communication method between humans and it involves the
perception of auditory and visual channels. Automatic speech recognition focuses on …

Cited by 53 Related articles All 8 versions

[PDF] uea.ac.uk

Generating intelligible audio speech from visual speech

T Le Cornu, B Milner - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org

This paper is concerned with generating intelligible audio speech from a video of a person
talking. Regression and classification methods are proposed first to estimate static spectral …

Cited by 49 Related articles All 9 versions

[PDF] arxiv.org

Decoding visemes: Improving machine lip-reading

HL Bear, R Harvey - 2016 IEEE International Conference on …, 2016 - ieeexplore.ieee.org

To undertake machine lip-reading, we try to recognise speech from a visual signal. Current
work often uses viseme classification supported by language models with varying degrees …

Cited by 55 Related articles All 19 versions

[PDF] forth.gr

A survey on mouth modeling and analysis for sign language recognition

E Antonakos, A Roussos… - 2015 11th IEEE …, 2015 - ieeexplore.ieee.org

Around 70 million Deaf worldwide use Sign Languages (SLs) as their native languages. At
the same time, they have limited reading/writing skills in the spoken language. This puts …

Cited by 44 Related articles All 15 versions

[PDF] mdpi.com

An effective conversion of visemes to words for high-performance automatic lipreading

S Fenghour, D Chen, K Guo, B Li, P Xiao - Sensors, 2021 - mdpi.com

As an alternative approach, viseme-based lipreading systems have demonstrated promising
performance results in decoding videos of people uttering entire sentences. However, the …

Cited by 11 Related articles All 11 versions

[PDF] ieee.org

Large-scale unsupervised audio pre-training for video-to-speech synthesis

T Kefalas, Y Panagakis, M Pantic - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

Video-to-speech synthesis is the task of reconstructing the speech signal from a silent video
of a speaker. Previous approaches train on data from almost exclusively audio-visual …

Cited by 1 Related articles All 5 versions
