Related articles - Academic Resource Search

SimDA: Simple diffusion adapter for efficient video generation

Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

The recent wave of AI-generated content has witnessed the great development and success
of Text-to-Image (T2I) technologies. By contrast, Text-to-Video (T2V) still falls short of …

Cited by 33 · Related articles · All 3 versions

[PDF] arxiv.org

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

Cited by 87 · Related articles · All 3 versions

[PDF] neurips.cc

Video diffusion models

J Ho, T Salimans, A Gritsenko… - Advances in …, 2022 - proceedings.neurips.cc

Generating temporally coherent high fidelity video is an important milestone in generative
modeling research. We make progress towards this milestone by proposing a diffusion …

Cited by 884 · Related articles · All 8 versions

[PDF] thecvf.com

Efficient video prediction via sparsely conditioned flow matching

A Davtyan, S Sameni, P Favaro - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We introduce a novel generative model for video prediction based on latent flow matching,
an efficient alternative to diffusion-based models. In contrast to prior work, we keep the high …

Cited by 14 · Related articles · All 6 versions

MagicVideo: Efficient video generation with latent diffusion models

D Zhou, W Wang, H Yan, W Lv, Y Zhu… - arXiv preprint arXiv …, 2022 - arxiv.org

We present an efficient text-to-video generation framework based on latent diffusion models,
termed MagicVideo. MagicVideo can generate smooth video clips that are concordant with …

Cited by 212 · Related articles · All 2 versions

[PDF] thecvf.com

MAGVIT: Masked generative video transformer

L Yu, Y Cheng, K Sohn, J Lezama… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various
video synthesis tasks with a single model. We introduce a 3D tokenizer to quantize a video …

Cited by 95 · Related articles · All 8 versions

[PDF] thecvf.com

Video probabilistic diffusion models in projected latent space

S Yu, K Sohn, S Kim, J Shin - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

Despite the remarkable progress in deep generative models, synthesizing high-resolution
and temporally coherent videos still remains a challenge due to their high-dimensionality …

Cited by 103 · Related articles · All 6 versions

[PDF] arxiv.org

Latent-shift: Latent diffusion with temporal shift for efficient text-to-video generation

J An, S Zhang, H Yang, S Gupta, JB Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

We propose Latent-Shift--an efficient text-to-video generation method based on a pretrained
text-to-image generation model that consists of an autoencoder and a U-Net diffusion model …

Cited by 65 · Related articles · All 2 versions

[PDF] arxiv.org

Generating videos with dynamics-aware implicit generative adversarial networks

S Yu, J Tack, S Mo, H Kim, J Kim, JW Ha… - arXiv preprint arXiv …, 2022 - arxiv.org

In the deep learning era, long video generation of high-quality still remains challenging due
to the spatio-temporal complexity and continuity of videos. Existing prior works have …

Cited by 178 · Related articles · All 6 versions

[PDF] thecvf.com

Fairy: Fast parallelized instruction-guided video-to-video synthesis

B Wu, CY Chuang, X Wang, Y Jia… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this paper, we introduce Fairy, a minimalist yet robust adaptation of image-editing diffusion
models, enhancing them for video editing applications. Our approach centers on the concept …

Cited by 10 · Related articles · All 3 versions
