An overview of deep-learning-based audio-visual speech enhancement and separation

D Michelsanti, ZH Tan, SX Zhang, Y Xu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …

Hearing speech sounds: top-down influences on the interface between audition and speech perception

MH Davis, IS Johnsrude - Hearing research, 2007 - Elsevier
This paper focuses on the cognitive and neural mechanisms of speech perception: the rapid,
and highly automatic processes by which complex time-varying speech signals are …

Biosignal-based spoken communication: A survey

T Schultz, M Wand, T Hueber… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
Speech is a complex process involving a wide range of biosignals, including but not limited
to acoustics. These biosignals, stemming from the articulators, the articulator muscle …

Speech perception at the interface of neurobiology and linguistics

D Poeppel, WJ Idsardi… - … Transactions of the …, 2008 - royalsocietypublishing.org
Speech perception consists of a set of computations that take continuously varying acoustic
waveforms as input and generate discrete representations that make contact with the lexical …

Learning individual speaking styles for accurate lip to speech synthesis

KR Prajwal, R Mukhopadhyay… - Proceedings of the …, 2020 - openaccess.thecvf.com
Humans involuntarily tend to infer parts of the conversation from lip movements when the
speech is absent or corrupted by external noise. In this work, we explore the task of lip to …

Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition

J Bybee, JL McClelland - 2005 - degruyter.com
It is argued that the principles needed to explain linguistic behavior are domain-general and
based on the impact that specific experiences have on the mental organization and …

[BOOK][B] Anomia: Theoretical and clinical aspects

M Laine, N Martin - 2023 - taylorfrancis.com
This important book provides a broad, integrated overview of current research on word-
finding deficit, anomia, the most common symptom of language dysfunction occurring after …

Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model

T Toda, AW Black, K Tokuda - Speech communication, 2008 - Elsevier
In this paper, we describe a statistical approach to both an articulatory-to-acoustic mapping
and an acoustic-to-articulatory inversion mapping without using phonetic information. The …

Vid2speech: speech reconstruction from silent video

A Ephrat, S Peleg - 2017 IEEE International Conference on …, 2017 - ieeexplore.ieee.org
Speechreading is a notoriously difficult task for humans to perform. In this paper we present
an end-to-end model based on a convolutional neural network (CNN) for generating an …

EMG-to-speech: Direct generation of speech from facial electromyographic signals

M Janke, L Diener - IEEE/ACM Transactions on Audio, Speech …, 2017 - ieeexplore.ieee.org
Silent speech interfaces are systems that enable speech communication even when an
acoustic signal is unavailable. Over the last years, public interest in such interfaces has …