Diffusion models have shown remarkable success in a variety of downstream generative tasks, yet remain under-explored in the important and challenging expressive talking head …
We propose DiffSHEG, a Diffusion-based approach for Speech-driven Holistic 3D Expression and Gesture generation. While previous works focused on co-speech gesture or …
C Xu, Y Liu, J Xing, W Wang, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and …
Abstract" Generalizability" is seen as the hallmark quality of a good deepfake detection model. However, standard out-of-domain evaluation datasets are very similar in form to the …
Y Liu, L Lin, F Yu, C Zhou, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Audio-driven portrait animation aims to synthesize portrait videos conditioned on the given audio. Animating high-fidelity and multimodal video portraits has a variety of …
A Melnik, M Miasayedzenkau… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Our goal with this survey is to provide an overview of state-of-the-art deep learning methods for face generation and editing using StyleGAN. The survey covers the evolution of …
Multimodal-driven talking face generation refers to animating a portrait with the given pose, expression, and gaze transferred from the driving image or video, or estimated from the …
We aim to edit the lip movements in a talking video according to the given speech while preserving the personal identity and visual details. The task can be decomposed into two …
FT Hong, L Shen, D Xu - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Predominant techniques for talking head generation largely depend on 2D information, including facial appearances and motions from input face images. Nevertheless, dense 3D …