A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 78 Related articles All 3 versions

[PDF] arxiv.org

Show-1: Marrying pixel and latent diffusion models for text-to-video generation

DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu… - International Journal of …, 2024 - Springer

Significant advancements have been achieved in the realm of large-scale pre-trained text-to-
video Diffusion Models (VDMs). However, previous methods either rely solely on pixel …

Cited by 156 Related articles All 2 versions

[PDF] arxiv.org

Motionctrl: A unified and flexible motion controller for video generation

Z Wang, Z Yuan, X Wang, Y Li, T Chen, M Xia… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org

Motions in a video primarily consist of camera motion, induced by camera movement, and
object motion, resulting from object movement. Accurate control of both camera and object …

Cited by 107 Related articles All 2 versions

[PDF] neurips.cc

Autodecoding latent 3d diffusion models

E Ntavelis, A Siarohin, K Olszewski… - Advances in …, 2023 - proceedings.neurips.cc

Diffusion-based methods have shown impressive visual results in the text-to-image domain.
They first learn a latent space using an autoencoder, then run a denoising process on the …

Cited by 36 Related articles All 5 versions

[PDF] thecvf.com

Make pixels dance: High-dynamic video generation

Y Zeng, G Wei, J Zheng, J Zou, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com

Creating high-dynamic videos such as motion-rich actions and sophisticated visual effects
poses a significant challenge in the field of artificial intelligence. Unfortunately, current state …

Cited by 73 Related articles All 4 versions

[PDF] openreview.net

Seine: Short-to-long video diffusion model for generative transition and prediction

X Chen, Y Wang, L Zhang, S Zhuang, X Ma… - The Twelfth …, 2023 - openreview.net

Recently, video generation has achieved substantial progress with realistic results.
Nevertheless, existing AI-generated videos are usually very short clips ("shot-level") …

Cited by 89 Related articles All 3 versions

[PDF] arxiv.org

Diffusion model-based image editing: A survey

Y Huang, J Huang, Y Liu, M Yan, J Lv, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Denoising diffusion models have emerged as a powerful tool for various image generation
and editing tasks, facilitating the synthesis of visual content in an unconditional or input …

Cited by 63 Related articles All 2 versions

[PDF] arxiv.org

Livephoto: Real image animation with text-guided motion control

X Chen, Z Liu, M Chen, Y Feng, Y Liu, Y Shen… - … on Computer Vision, 2025 - Springer

Despite the recent progress in text-to-video generation, existing studies usually overlook the
issue that only spatial contents but not temporal motions in synthesized videos are under the …

Cited by 25 Related articles All 2 versions

[PDF] thecvf.com

Snap video: Scaled spatiotemporal transformers for text-to-video synthesis

W Menapace, A Siarohin… - Proceedings of the …, 2024 - openaccess.thecvf.com

Contemporary models for generating images show remarkable quality and versatility.
Swayed by these advantages, the research community repurposes them to generate videos …

Cited by 38 Related articles All 3 versions

[PDF] thecvf.com

Vlogger: Make your dream a vlog

S Zhuang, K Li, X Chen, Y Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we present Vlogger, a generic AI system for generating a minute-level video blog
(i.e., vlog) of user descriptions. Different from short videos with a few seconds, vlog often …

Cited by 25 Related articles All 3 versions
