LRWR: large-scale benchmark for lip reading in Russian language

D Ivanko, D Ryumin, A Karpov - Mathematics, 2023 - mdpi.com

This article provides a detailed review of recent advances in audio-visual speech
recognition (AVSR) methods that have been developed over the last decade (2013–2023) …

Cited by: 18

Review on research progress of machine lip reading

G Pu, H Wang - The Visual Computer, 2023 - Springer

Machine lip reading recognizes text content through the speaker's lip motion
information. Lip reading has significant research and application value. With the continuous …

Cited by: 13

Audio–visual speech recognition based on regulated transformer and spatio–temporal fusion strategy for driver assistive systems

D Ryumin, A Axyonov, E Ryumina, D Ivanko… - Expert Systems with …, 2024 - Elsevier

This article presents a research methodology for audio–visual speech recognition (AVSR) in
driver assistive systems. These systems necessitate ongoing interaction with drivers while …

Cited by: 11

A multi-purpose audio-visual corpus for multi-modal Persian speech recognition: The Arman-AV dataset

J Peymanfard, S Heydarian, A Lashini, H Zeinali… - Expert Systems with …, 2024 - Elsevier

Automatic lip reading has advanced significantly in recent years. However, these methods
need large-scale datasets that are scarce for many low-resource languages. In this paper …

Cited by: 10

Cross-Attention Fusion of Visual and Geometric Features for Large-Vocabulary Arabic Lipreading

S Daou, A Ben-Hamadou, A Rekik, A Kallel - Technologies, 2025 - mdpi.com

Lipreading involves recognizing spoken words by analyzing the movements of the lips and
surrounding area using visual data. It is an emerging research topic with many potential …

Cited by: 1

Visual Lip Reading Dataset in Turkish

A Berkol, T Tümer-Sivri, N Pervan-Akman, M Çolak… - Data, 2023 - mdpi.com

The promised dataset was obtained from daily Turkish words and phrases pronounced by
various people in videos posted on YouTube. The purpose of compiling the dataset was to …

Cited by: 8

A Comprehensive Review of Recent Advances in Deep Neural Networks for Lipreading with Sign Language Recognition

N Rathipriya, N Maheswari - IEEE Access, 2024 - ieeexplore.ieee.org

Lip reading is a form of “listening” to people that happens visually. It's also referred to as
“Speech reading.” This is done by observing the speaker's face and listening to the spoken …

AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies

JM Acosta-Triana, D Gimeno-Gómez… - arXiv preprint arXiv …, 2024 - arxiv.org

More than 7,000 known languages are spoken around the world. However, due to the lack
of annotated resources, only a small fraction of them are currently covered by speech …

Cited by: 1

Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning

NF Aljohani, ES Jaha - Computer Systems Science & …, 2023 - cdn.techscience.cn

The continuing advances in deep learning have paved the way for several challenging
ideas. One such idea is visual lip-reading, which has recently drawn many research …

Cited by: 6

An Enhancement in K-means Algorithm for Automatic Ultrasound Image Segmentation

L Panigrahi, RR Panigrahi - International Conference on Biomedical …, 2023 - Springer

Breast malignancy is a relatively frequent disease that affects people all over the world.
When interpreting the lesion component of medical images, inter-and intra-observer errors …

Cited by: 1
