Semantic-aware implicit neural audio-driven video portrait generation

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

Cited by 202 Related articles All 11 versions

[PDF] mdpi.com

Human-computer interaction system: A survey of talking-head generation

R Zhen, W Song, Q He, J Cao, L Shi, J Luo - Electronics, 2023 - mdpi.com

Virtual human is widely employed in various industries, including personal assistance,
intelligent customer service, and online education, thanks to the rapid development of …

Cited by 42 Related articles All 3 versions

[PDF] thecvf.com

Humangaussian: Text-driven 3d human generation with gaussian splatting

X Liu, X Zhan, J Tang, Y Shan, G Zeng… - Proceedings of the …, 2024 - openaccess.thecvf.com

Realistic 3D human generation from text prompts is a desirable yet challenging task.
Existing methods optimize 3D representations like mesh or neural fields via score distillation …

Cited by 44 Related articles All 3 versions

[PDF] thecvf.com

Gaussian head avatar: Ultra high-fidelity head avatar via dynamic gaussians

Y Xu, B Chen, Z Li, H Zhang, L Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Creating high-fidelity 3D head avatars has always been a research hotspot but there
remains a great challenge under lightweight sparse view setups. In this paper we propose …

Cited by 44 Related articles All 3 versions

[PDF] thecvf.com

Codetalker: Speech-driven 3d facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

Cited by 113 Related articles All 8 versions

[PDF] thecvf.com

Expressive talking head generation with granular audio-visual control

B Liang, Y Pan, Z Guo, H Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …

Cited by 123 Related articles All 4 versions

[PDF] arxiv.org

Reconstructing personalized semantic facial nerf models from monocular video

X Gao, C Zhong, J Xiang, Y Hong, Y Guo… - ACM Transactions on …, 2022 - dl.acm.org

We present a novel semantic model for human head defined with neural radiance field. The
3D-consistent head model consists of a set of disentangled and interpretable bases, and can …

Cited by 91 Related articles All 3 versions

[PDF] thecvf.com

Difftalk: Crafting diffusion models for generalized audio-driven portraits animation

S Shen, W Zhao, Z Meng, W Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Talking head synthesis is a promising approach for the video production industry. Recently,
a lot of effort has been devoted in this research area to improve the generation quality or …

Cited by 73 Related articles All 6 versions

[PDF] thecvf.com

Learning hierarchical cross-modal association for co-speech gesture generation

X Liu, Q Wu, H Zhou, Y Xu, R Qian… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generating speech-consistent body and gesture movements is a long-standing problem in
virtual avatar creation. Previous studies often synthesize pose movement in a holistic …

Cited by 96 Related articles All 5 versions

[PDF] thecvf.com

Identity-preserving talking face generation with landmark and appearance priors

W Zhong, C Fang, Y Cai, P Wei… - Proceedings of the …, 2023 - openaccess.thecvf.com

Generating talking face videos from audio attracts lots of research interest. A few person-
specific methods can generate vivid videos but require the target speaker's videos for …

Cited by 48 Related articles All 5 versions
