FreeNoise: Tuning-free longer video diffusion via noise rescheduling

T Wu, C Si, Y Jiang, Z Huang, Z Liu - European Conference on Computer …, 2025 - Springer

Though diffusion-based video generation has witnessed rapid progress, the inference
results of existing models still exhibit unsatisfactory temporal consistency and unnatural …

Cited by 37 · Related articles · All 3 versions

[PDF] thecvf.com

Vlogger: Make your dream a vlog

S Zhuang, K Li, X Chen, Y Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we present Vlogger, a generic AI system for generating a minute-level video blog
(i.e., vlog) of user descriptions. Different from short videos with a few seconds, vlog often …

Cited by 25 · Related articles · All 3 versions

[PDF] arxiv.org

Loong: Generating minute-level long videos with autoregressive language models

Y Wang, T Xiong, D Zhou, Z Lin, Y Zhao, B Kang… - arXiv preprint arXiv …, 2024 - arxiv.org

It is desirable but challenging to generate content-rich long videos in the scale of minutes.
Autoregressive large language models (LLMs) have achieved great success in generating …

Cited by 18 · Related articles · All 3 versions

[PDF] arxiv.org

ConsistI2V: Enhancing visual consistency for image-to-video generation

W Ren, H Yang, G Zhang, C Wei, X Du… - arXiv preprint arXiv …, 2024 - arxiv.org

Image-to-video (I2V) generation aims to use the initial frame (alongside a text prompt) to
create a video sequence. A grand challenge in I2V generation is to maintain visual …

Cited by 25 · Related articles · All 2 versions

[PDF] thecvf.com

ART-V: Auto-Regressive Text-to-Video Generation with Diffusion Models

W Weng, R Feng, Y Wang, Q Dai… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present ART-V, an efficient framework for auto-regressive video generation with diffusion
models. Unlike existing methods that generate entire videos in one-shot, ART-V generates a …

Cited by 22 · Related articles · All 4 versions

[PDF] arxiv.org

MovieDreamer: Hierarchical generation for coherent long visual sequence

C Zhao, M Liu, W Wang, W Chen, F Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in video generation have primarily leveraged diffusion models for
short-duration content. However, these approaches often fall short in modeling complex …

Cited by 12 · Related articles · All 4 versions

[PDF] arxiv.org

FreeLong: Training-free long video generation with SpectralBlend temporal attention

Y Lu, Y Liang, L Zhu, Y Yang - arXiv preprint arXiv:2407.19918, 2024 - arxiv.org

Video diffusion models have made substantial progress in various video generation
applications. However, training models for long video generation tasks requires significant …

Cited by 12 · Related articles · All 4 versions

[PDF] arxiv.org

StreamingT2V: Consistent, dynamic, and extendable long video generation from text

R Henschel, L Khachatryan, D Hayrapetyan… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-video diffusion models enable the generation of high-quality videos that follow text
instructions, making it easy to create diverse and individual content. However, existing …

Cited by 50 · Related articles · All 2 versions

[PDF] arxiv.org

Evaluation of text-to-video generation models: A dynamics perspective

M Liao, H Lu, X Zhang, F Wan, T Wang, Y Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org

Comprehensive and constructive evaluation protocols play an important role in the
development of sophisticated text-to-video (T2V) generation models. Existing evaluation …

Cited by 11 · Related articles · All 4 versions

[PDF] arxiv.org

FreeTraj: Tuning-free trajectory control in video diffusion models

H Qiu, Z Chen, Z Wang, Y He, M Xia, Z Liu - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion model has demonstrated remarkable capability in video generation, which further
sparks interest in introducing trajectory control into the generation process. While existing …

Cited by 8 · Related articles · All 3 versions
