State of the art on monocular 3D face reconstruction, tracking, and applications

M Zollhöfer, J Thies, P Garrido, D Bradley… - Computer graphics …, 2018 - Wiley Online Library
The computer graphics and vision communities have dedicated long standing efforts in
building computerized tools for reconstructing, tracking, and analyzing human faces based …

Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

CodeTalker: Speech-driven 3D facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

AD-NeRF: Audio driven neural radiance fields for talking head synthesis

Y Guo, K Chen, S Liang, YJ Liu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Generating high-fidelity talking head video by fitting with the input audio sequence is a
challenging problem that receives considerable attentions recently. In this paper, we …

Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset

Z Zhang, L Li, Y Ding, C Fan - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …

StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN

F Yin, Y Zhang, X Cun, M Cao, Y Fan, X Wang… - European conference on …, 2022 - Springer
One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …

Diffused Heads: Diffusion models beat GANs on talking-face generation

M Stypułkowski, K Vougioukas, S He… - Proceedings of the …, 2024 - openaccess.thecvf.com
Talking face generation has historically struggled to produce head movements and natural
facial expressions without guidance from additional reference videos. Recent developments …

Live Speech Portraits: real-time photorealistic talking-head animation

Y Lu, J Chai, X Cao - ACM Transactions on Graphics (ToG), 2021 - dl.acm.org
To the best of our knowledge, we first present a live system that generates personalized
photorealistic talking-head animation only driven by audio signals at over 30 fps. Our system …

MakeItTalk: speaker-aware talking-head animation

Y Zhou, X Han, E Shechtman, J Echevarria… - ACM Transactions On …, 2020 - dl.acm.org
We present a method that generates expressive talking-head videos from a single facial
image with audio as the only input. In contrast to previous attempts to learn direct mappings …

MeshTalk: 3D face animation from speech using cross-modality disentanglement

A Richard, M Zollhöfer, Y Wen… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper presents a generic method for generating full facial 3D animation from speech.
Existing approaches to audio-driven facial animation exhibit uncanny or static upper face …