Two-stage visual speech recognition for intensive care patients

Privacy-Preserving Speaker Recognition Using Radars for Context Estimation in Future Multi-Modal Hearing Assistive Technologies

M Farooq, Y Ge, A Qayyum, C Tang… - 2023 IEEE …, 2023 - ieeexplore.ieee.org

Speaker recognition (SR) from speech can help determine the environmental context in
multi-talker conversational scenarios to enable the design of context-aware multi-modal …

Cited by 1

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

YJ Ahn, J Park, S Park, J Choi, KE Kim - arXiv preprint arXiv:2406.12233, 2024 - arxiv.org

Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech
recognition, aiming to interpret spoken content from visual cues. A prominent challenge in …

Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA)

U Nazir, W Islam, M Taj - arXiv preprint arXiv:2303.14322, 2023 - arxiv.org

Despite the recent advances in deep neural networks, standard convolutional kernels limit
the applications of these networks to the Euclidean domain only. Considering the geodesic …

Cited by 2

LITEVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data

H Laux, E Mededovic, A Hallawa… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

This paper proposes a novel, resource-efficient approach to Visual Speech Recognition
(VSR) leveraging speech representations produced by any trained Automatic Speech …

[PDF] Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA) for Remote Land-use Change Detection

U Nazir, W Islam, M Taj, S Khalid - 2023 - cvlab.lums.edu.pk

Despite the recent advances in deep neural networks, standard convolutional kernels limit
the applications of these networks to the Euclidean domain only. Considering the geodesic …

HNet: A deep learning based hybrid network for speaker dependent visual speech recognition

V Chandrabanshi, S Domnic - International Journal of Hybrid … - content.iospress.com

Visual Speech Recognition (VSR) is a popular area in computer vision research,
attracting interest for its ability to precisely analyze lip motion and seamlessly convert them …

[PDF] Facial Expression Database of Autism Spectrum Disorder Children

FM Alamgir, SMH Saif, SM Hossain, A Al Hadi… - researchgate.net

The processing of face information relies on the quality of data resources and therefore the
dataset is crucial for image processing. While considering behavioural investigations of …
