VBench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

FreeU: Free lunch in diffusion U-Net

C Si, Z Huang, Y Jiang, Z Liu - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
In this paper, we uncover the untapped potential of diffusion U-Net, which serves as a "free
lunch" that substantially improves the generation quality on the fly. We initially investigate …

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

ReVersion: Diffusion-based relation inversion from images

Z Huang, T Wu, Y Jiang, KCK Chan, Z Liu - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models gain increasing popularity for their generative capabilities. Recently, there
have been surging needs to generate customized images by inverting diffusion models from …

FaceComposer: A unified model for versatile facial content creation

J Wang, K Zhao, Y Ma, S Zhang… - Advances in …, 2024 - proceedings.neurips.cc
This work presents FaceComposer, a unified generative model that accomplishes a variety
of facial content creation tasks, including text-conditioned face synthesis, text-guided face …

Unsupervised compositional concepts discovery with text-to-image generative models

N Liu, Y Du, S Li, JB Tenenbaum… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-to-image generative models have enabled high-resolution image synthesis across
different domains, but require users to specify the content they wish to generate. In this …

DiffusionAvatars: Deferred diffusion for high-fidelity 3D head avatars

T Kirschstein, S Giebenhain… - Proceedings of the …, 2024 - openaccess.thecvf.com
DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive
control over both pose and expression. We propose a diffusion-based neural renderer that …

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

S Koley, AK Bhunia, D Sekhri, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper unravels the potential of sketches for diffusion models, addressing the deceptive
promise of direct sketch control in generative AI. We importantly democratise the process …

Towards a simultaneous and granular identity-expression control in personalized face generation

R Liu, B Ma, W Zhang, Z Hu, C Fan… - Proceedings of the …, 2024 - openaccess.thecvf.com
In human-centric content generation, the pre-trained text-to-image models struggle to
produce user-wanted portrait images which retain the identity of individuals while exhibiting …