An image transform approach for HMM based automatic lipreading

G Potamianos, C Neti, G Gravier, A Garg… - Proceedings of the …, 2003 - ieeexplore.ieee.org

Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …

Cited by 962 · Related articles · All 15 versions

A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and vision computing, 2014 - Elsevier

Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Cited by 238 · Related articles · All 4 versions

Large-scale visual speech recognition

B Shillingford, Y Assael, MW Hoffman, T Paine… - arXiv preprint arXiv …, 2018 - arxiv.org

This work presents a scalable solution to open-vocabulary visual speech recognition. To
achieve this, we constructed the largest existing visual speech recognition dataset …

Cited by 209 · Related articles · All 7 versions

LRW-1000: A naturally-distributed large-scale benchmark for lip reading in the wild

S Yang, Y Zhang, D Feng, M Yang… - 2019 14th IEEE …, 2019 - ieeexplore.ieee.org

Large-scale datasets have successively proven their fundamental importance in several
research fields, especially for early progress in some emerging topics. In this paper, we …

Cited by 199 · Related articles · All 10 versions

Audio-visual speech modeling for continuous speech recognition

S Dupont, J Luettin - IEEE transactions on multimedia, 2000 - ieeexplore.ieee.org

This paper describes a speech recognition system that uses both acoustic and visual
speech information to improve recognition performance in noisy environments. The system …

Cited by 811 · Related articles · All 11 versions

Lipreading with local spatiotemporal descriptors

G Zhao, M Barnard… - IEEE Transactions on …, 2009 - ieeexplore.ieee.org

Visual speech information plays an important role in lipreading under noisy conditions or for
listeners with a hearing impairment. In this paper, we present local spatiotemporal …

Cited by 385 · Related articles · All 16 versions

[PDF] Audio-visual automatic speech recognition: An overview

G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu

We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Cited by 505 · Related articles · All 5 versions

CUAVE: A new audio-visual database for multimodal human-computer interface research

EK Patterson, S Gurbuz, Z Tufekci… - 2002 IEEE International …, 2002 - ieeexplore.ieee.org

Multimodal signal processing has become an important topic of research for overcoming
certain problems of audio-only speech processing. Audio-visual speech recognition is one …

Cited by 410 · Related articles · All 10 versions

LCANet: End-to-end lipreading with cascaded attention-CTC

K Xu, D Li, N Cassimatis, X Wang - 2018 13th IEEE …, 2018 - ieeexplore.ieee.org

Machine lipreading is a special type of automatic speech recognition (ASR) which
transcribes human speech by visually interpreting the movement of related face regions …

Cited by 145 · Related articles · All 6 versions

Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

Cited by 42 · Related articles · All 9 versions
