A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Dynamicrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2024 - Springer
Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …

Vbench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Tooncrafter: Generative cartoon interpolation

J Xing, H Liu, M Xia, Y Zhang, X Wang, Y Shan… - ACM Transactions on …, 2024 - dl.acm.org
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-
based cartoon video interpolation, paving the way for generative interpolation. Traditional …

Make pixels dance: High-dynamic video generation

Y Zeng, G Wei, J Zheng, J Zou, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com
Creating high-dynamic videos, such as motion-rich actions and sophisticated visual effects,
poses a significant challenge in the field of artificial intelligence. Unfortunately, current state …

Make-your-video: Customized video generation using textual and structural guidance

J Xing, M Xia, Y Liu, Y Zhang, Y Zhang… - … on Visualization and …, 2024 - ieeexplore.ieee.org
Creating a vivid video from the event or scenario in our imagination is a truly fascinating
experience. Recent advancements in text-to-video synthesis have unveiled the potential to …

Generative semantic communication: Diffusion models beyond bit recovery

E Grassucci, S Barbarossa, D Comminiello - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …

AniClipart: Clipart animation with text-to-video priors

R Wu, W Su, K Ma, J Liao - International Journal of Computer Vision, 2024 - Springer
Clipart, a pre-made graphic art form, offers a convenient and efficient way of illustrating
visual content. Traditional workflows to convert static clipart images into motion sequences …

Foundation reinforcement learning: towards embodied generalist agents with foundation prior assistance

W Ye, Y Zhang, M Wang, S Wang, X Gu, P Abbeel… - 2023 - openreview.net
Recent work has shown that large-scale pre-training on diverse internet-scale data is
the key to building a generalist model, as witnessed in the natural language processing …