DNN-based ultrasound-to-speech conversion for a silent speech interface

TG Csapó, T Grósz, G Gosztolya, L Tóth, A Markó - 2017 - real.mtak.hu
In this paper we present our initial results in articulatory-to-acoustic conversion based on
tongue movement recordings using Deep Neural Networks (DNNs). Despite the fact that …

Updating the silent speech challenge benchmark with deep learning

Y Ji, L Liu, H Wang, Z Liu, Z Niu, B Denby - Speech Communication, 2018 - Elsevier
The term “Silent Speech Interface” was introduced almost a decade ago to describe
speech communication systems using only non-acoustic sensors, such as …

DNN-based acoustic-to-articulatory inversion using ultrasound tongue imaging

D Porras, A Sepúlveda-Sepúlveda… - 2019 International Joint …, 2019 - ieeexplore.ieee.org
Speech sounds are produced as the coordinated movement of the speaking organs. There
are several available methods to model the relation of articulatory movements and the …

Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces

L Tóth, G Gosztolya, T Grósz, A Markó, TG Csapó - INTERSPEECH, 2018 - inf.u-szeged.hu
Silent Speech Interface systems apply two different strategies to solve the
articulatory-to-acoustic conversion task. The recognition-and-synthesis approach applies …

Optimizing the ultrasound tongue image representation for residual network-based articulatory-to-acoustic mapping

TG Csapó, G Gosztolya, L Tóth, AH Shandiz, A Markó - Sensors, 2022 - mdpi.com
Within speech processing, articulatory-to-acoustic mapping (AAM) methods can apply
ultrasound tongue imaging (UTI) as an input. (Micro)convex transducers are mostly used …

F0 estimation for DNN-based ultrasound silent speech interfaces

T Grósz, G Gosztolya, L Tóth… - … on Acoustics, Speech …, 2018 - ieeexplore.ieee.org
State-of-the-art silent speech interface systems apply vocoders to generate the speech
signal directly from articulatory data. Most of these approaches concentrate on estimating …

Emerging ExG-based NUI inputs in extended realities: A bottom-up survey

KA Shatilov, D Chatzopoulos, LH Lee… - ACM Transactions on …, 2021 - dl.acm.org
Incremental and quantitative improvements of two-way interactions with extended realities
(XR) are contributing toward a qualitative leap into a state of XR ecosystems being efficient …

Audio and visual modality combination in speech processing applications

G Potamianos, E Marcheret, Y Mroueh, V Goel… - The Handbook of …, 2017 - dl.acm.org
Chances are that most of us have experienced difficulty in listening to our interlocutor during
face-to-face conversation while in highly noisy environments, such as next to heavy traffic or …

Autoencoder-based articulatory-to-acoustic mapping for ultrasound silent speech interfaces

G Gosztolya, Á Pintér, L Tóth, T Grósz… - … Joint Conference on …, 2019 - ieeexplore.ieee.org
When using ultrasound video as input, Deep Neural Network-based Silent Speech
Interfaces usually rely on the whole image to estimate the spectral parameters required for …

Human-inspired computational models for European Portuguese: a review

A Teixeira, S Silva - Language Resources and Evaluation, 2024 - Springer
This paper surveys human-inspired speech technologies developed for European
Portuguese and the computational models that they integrate and that made them possible. In this …