Early word segmentation behind the mask

S Frota, J Pejovic, M Cruz, C Severino… - Frontiers in …, 2022 - frontiersin.org
Infants have been shown to rely both on auditory and visual cues when processing speech.
We investigated the impact of COVID-related changes, in particular of face masks, in early …

Taris: An online speech recognition framework with sequence to sequence neural networks for both audio-only and audio-visual speech

G Sterpu, N Harte - Computer Speech & Language, 2022 - Elsevier
It is widely accepted that the visual modality of speech provides complementary information
to the speech recognition task, and many models have been introduced in order to make …

Auditory and visual cues in face-masked infant-directed speech

M Cruz, J Pejovic, C Severino, M Vigário… - Proceedings of the …, 2022 - researchgate.net
Language includes auditory and visual cues relevant to language learning, and
infants have been shown to take advantage of those cues while processing speech. With …

AV Taris: Online Audio-Visual Speech Recognition

G Sterpu, N Harte - arXiv preprint arXiv:2012.07467, 2020 - arxiv.org
In recent years, Automatic Speech Recognition (ASR) technology has approached human-level
performance on conversational speech under relatively clean listening conditions. In …