A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Dynamicrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2024 - Springer
Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …

Vbench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Tooncrafter: Generative cartoon interpolation

J Xing, H Liu, M Xia, Y Zhang, X Wang, Y Shan… - ACM Transactions on …, 2024 - dl.acm.org
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-
based cartoon video interpolation, paving the way for generative interpolation. Traditional …

Make pixels dance: High-dynamic video generation

Y Zeng, G Wei, J Zheng, J Zou, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com
Creating high-dynamic videos, such as motion-rich actions and sophisticated visual effects,
poses a significant challenge in the field of artificial intelligence. Unfortunately, current state …

Make-your-video: Customized video generation using textual and structural guidance

J Xing, M Xia, Y Liu, Y Zhang, Y Zhang… - … on Visualization and …, 2024 - ieeexplore.ieee.org
Creating a vivid video from the event or scenario in our imagination is a truly fascinating
experience. Recent advancements in text-to-video synthesis have unveiled the potential to …

Generative semantic communication: Diffusion models beyond bit recovery

E Grassucci, S Barbarossa, D Comminiello - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …

AniClipart: Clipart animation with text-to-video priors

R Wu, W Su, K Ma, J Liao - International Journal of Computer Vision, 2024 - Springer
Clipart, a pre-made graphic art form, offers a convenient and efficient way of illustrating
visual content. Traditional workflows to convert static clipart images into motion sequences …

Foundation reinforcement learning: towards embodied generalist agents with foundation prior assistance

W Ye, Y Zhang, M Wang, S Wang, X Gu, P Abbeel… - 2023 - openreview.net
Recent work has shown that large-scale pre-training on diverse internet-scale data is
the key to building a generalist model, as witnessed in the natural language processing …