Multimodal interaction: A review

M Turk - Pattern recognition letters, 2014 - Elsevier
People naturally interact with the world multimodally, through both parallel and sequential
use of multiple perceptual modalities. Multimodal human–computer interaction has sought …

Toward an affect-sensitive multimodal human-computer interaction

M Pantic, LJM Rothkrantz - Proceedings of the IEEE, 2003 - ieeexplore.ieee.org
The ability to recognize affective states of a person we are communicating with is the core of
emotional intelligence. Emotional intelligence is a facet of human intelligence that has been …

TCD-TIMIT: An audio-visual corpus of continuous speech

N Harte, E Gillen - IEEE Transactions on Multimedia, 2015 - ieeexplore.ieee.org
Automatic audio-visual speech recognition currently lags behind its audio-only counterpart
in terms of major progress. One of the reasons commonly cited by researchers is the scarcity …

Recent advances in the automatic recognition of audiovisual speech

G Potamianos, C Neti, G Gravier, A Garg… - Proceedings of the …, 2003 - ieeexplore.ieee.org
Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …

Learning in audio-visual context: A review, analysis, and new perspective

Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org
Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …

Audio-visual automatic speech recognition: An overview

G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Dynamic Bayesian networks for audio-visual speech recognition

AV Nefian, L Liang, X Pi, X Liu, K Murphy - EURASIP Journal on …, 2002 - Springer
The use of visual features in audio-visual speech recognition (AVSR) is justified by both the
speech generation mechanism, which is essentially bimodal in audio and visual …

Advanced applications of neural networks and artificial intelligence: A review

K Kumar, GSM Thakur - International journal of information …, 2012 - academia.edu
(AI). It also considers the integration of neural networks with other computing methods such
as fuzzy logic to enhance the interpretation ability of data. Artificial Neural Networks is …

Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

A coupled HMM for audio-visual speech recognition

AV Nefian, L Liang, X Pi, L Xiaoxiang… - … , Speech, and Signal …, 2002 - ieeexplore.ieee.org
In recent years several speech recognition systems that use visual together with audio
information showed significant increase in performance over the standard speech …