Lumiere: A space-time diffusion model for video generation

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Cited by 72 - Related articles - All 2 versions

[PDF] arxiv.org

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library

The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Cited by 54 - Related articles - All 12 versions

[PDF] thecvf.com

Vbench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video generation has witnessed significant advancements yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Cited by 51 - Related articles - All 4 versions

[PDF] arxiv.org

Videopoet: A large language model for zero-shot video generation

D Kondratyuk, L Yu, X Gu, J Lezama, J Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

We present VideoPoet, a language model capable of synthesizing high-quality video, with
matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder …

Cited by 62 - Related articles - All 5 versions

[PDF] ieee.org

When does Sora show: The beginning of TAO to imaginative intelligence and scenarios engineering

FY Wang, Q Miao, L Li, Q Ni, X Li, J Li… - IEEE/CAA Journal of …, 2024 - ieeexplore.ieee.org

During our discussion at workshops for writing “What Does ChatGPT Say: The DAO from
Algorithmic Intelligence to Linguistic Intelligence”[1], we had expected the next milestone for …

Cited by 30 - Related articles - All 4 versions

Sora for scenarios engineering of intelligent vehicles: V&V, C&C, and beyonds

X Li, Q Miao, L Li, Y Hou, Q Ni, L Fan… - IEEE Transactions …, 2024 - ieeexplore.ieee.org

The advent of Scenarios Engineering (SE) paves the way to a new era of intelligent vehicles
(IVs), driven by Artificial Intelligence (AI)-enabled strategies. It aims at shaping the IVs to be …

Cited by 17 - Related articles

[PDF] arxiv.org

V3d: Video diffusion models are effective 3d generators

Z Chen, Y Wang, F Wang, Z Wang, H Liu - arXiv preprint arXiv:2403.06738, 2024 - arxiv.org

Automatic 3D generation has recently attracted widespread attention. Recent methods have
greatly accelerated the generation speed, but usually produce less-detailed objects due to …

Cited by 14 - Related articles - All 2 versions

[PDF] arxiv.org

Anyv2v: A plug-and-play framework for any video-to-video editing tasks

M Ku, C Wei, W Ren, H Yang, W Chen - arXiv preprint arXiv:2403.14468, 2024 - arxiv.org

Video-to-video editing involves editing a source video along with additional control (such as
text prompts, subjects, or styles) to generate a new video that aligns with the source video …

Cited by 7 - Related articles - All 2 versions

[PDF] arxiv.org

Cameractrl: Enabling camera control for text-to-video generation

H He, Y Xu, Y Guo, G Wetzstein, B Dai, H Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Controllability plays a crucial role in video generation since it allows users to create desired
content. However, existing models largely overlooked the precise control of camera pose …

Cited by 12 - Related articles - All 2 versions

[PDF] arxiv.org

Vasa-1: Lifelike audio-driven talking faces generated in real time

S Xu, G Chen, YX Guo, J Yang, C Li, Z Zang… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …

Cited by 11 - Related articles - All 2 versions
