Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

EAMM: One-shot emotional talking face via audio-based emotion-aware motion model

X Ji, H Zhou, K Wang, Q Wu, W Wu, F Xu… - ACM SIGGRAPH 2022 …, 2022 - dl.acm.org
Although significant progress has been made to audio-driven talking face generation,
existing methods either neglect facial emotion or cannot be applied to arbitrary subjects. In …

Live speech portraits: real-time photorealistic talking-head animation

Y Lu, J Chai, X Cao - ACM Transactions on Graphics (ToG), 2021 - dl.acm.org
To the best of our knowledge, we first present a live system that generates personalized
photorealistic talking-head animation only driven by audio signals at over 30 fps. Our system …

FSGAN: Subject agnostic face swapping and reenactment

Y Nirkin, Y Keller, T Hassner - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
We present Face Swapping GAN (FSGAN) for face swapping and reenactment.
Unlike previous work, FSGAN is subject agnostic and can be applied to pairs of faces …

Neural Emotion Director: Speech-preserving semantic control of facial expressions in "in-the-wild" videos

FP Papantoniou, PP Filntisis… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we introduce a novel deep learning method for photo-realistic manipulation of
the emotional state of actors in "in-the-wild" videos. The proposed method is based on a …

Context-aware talking-head video editing

S Yang, W Wang, J Ling, B Peng, X Tan… - Proceedings of the 31st …, 2023 - dl.acm.org
Talking-head video editing aims to efficiently insert, delete, and substitute words in a pre-
recorded video through a text transcript editor. The key challenge for this task is obtaining an …

StableFace: Analyzing and improving motion stability for talking face generation

J Ling, X Tan, L Chen, R Li, Y Zhang… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
While previous methods for speech-driven talking face generation have shown significant
advances in improving the visual and lip-sync quality of the synthesized videos, they have …

PersonaTalk: Bring attention to your persona in visual dubbing

L Zhang, S Liang, Z Ge, T Hu - SIGGRAPH Asia 2024 Conference …, 2024 - dl.acm.org
For audio-driven visual dubbing, it remains a considerable challenge to uphold and
highlight the speaker's persona while synthesizing accurate lip synchronization. Existing …

MusicFace: Music-driven expressive singing face synthesis

P Liu, W Deng, H Li, J Wang, Y Zheng, Y Ding… - Computational Visual …, 2024 - Springer
It remains an interesting and challenging problem to synthesize a vivid and realistic singing
face driven by music. In this paper, we present a method for this task with natural motions for …

CA-Wav2Lip: Coordinate attention-based speech to lip synthesis in the wild

KC Wang, J Zhang, J Huang, Q Li… - … on Smart Computing …, 2023 - ieeexplore.ieee.org
With the growing consumption of online visual content, there is an urgent need for video
translation in order to reach a wider audience from around the world. However, the materials …