M Kim, J Hong, YM Ro - Advances in Neural Information …, 2021 - proceedings.neurips.cc
In this paper, we propose a novel lip-to-speech generative adversarial network, Visual Context Attentional GAN (VCA-GAN), which can jointly model local and global lip …
Video-to-speech is the process of reconstructing the audio speech from a video of a spoken utterance. Previous approaches to this task have relied on a two-step process where an …
Lip reading has gained popularity due to the proliferation of emerging real-world applications. This article provides a comprehensive review of benchmark datasets available …
Video-to-speech synthesis (also known as lip-to-speech) refers to the translation of silent lip movements into the corresponding audio. This task has received an increasing amount of …
We introduce a novel speech synthesis system, called NAUTILUS, that can generate speech with a target voice either from a text input or a reference utterance of an arbitrary source …
R Zhang, M Chen, B Steeper, Y Li, Z Yan… - Proceedings of the …, 2021 - dl.acm.org
This paper presents SpeeChin, a smart necklace that can recognize 54 English and 44 Chinese silent speech commands. A customized infrared (IR) imaging system is mounted on …
Automatic Speaker Verification (ASV) systems are vulnerable to a variety of voice spoofing attacks, e.g., replays, speech synthesis, etc. The imposters/fraudsters often use …
M Kim, J Hong, YM Ro - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Recent studies have shown impressive performance in lip-to-speech synthesis that aims to reconstruct speech from visual information alone. However, they have been suffering from …
In this paper, we present a deep-learning-based framework for audio-visual speech inpainting, i.e., the task of restoring the missing parts of an acoustic speech signal from …