Variational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned …
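As a hedged illustration of the low-dimensional latent space this abstract refers to, here is a minimal NumPy sketch of the VAE reparameterization trick and the closed-form KL term; the toy dimensions and the linear encoder are assumptions for illustration, not details from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 8-dimensional data, 2-dimensional latent space.
x_dim, z_dim = 8, 2

# Linear "encoder" mapping x to the mean and log-variance of q(z|x).
W_mu = rng.normal(size=(z_dim, x_dim)) * 0.1
W_lv = rng.normal(size=(z_dim, x_dim)) * 0.1

def encode(x):
    # Returns (mu, log sigma^2) of the approximate posterior q(z|x).
    return W_mu @ x, W_lv @ x

def reparameterize(mu, logvar):
    # z = mu + sigma * eps keeps the sample differentiable w.r.t. mu, logvar.
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

x = rng.normal(size=x_dim)
mu, logvar = encode(x)
z = reparameterize(mu, logvar)
kl = kl_to_standard_normal(mu, logvar)
```

The KL term regularizes the latent code toward a standard normal prior, which is what makes the learned low-dimensional space usable for sampling and generation.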
Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower …
We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition …
We present Stable Video Diffusion, a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …
We present Phenaki, a model capable of realistic video synthesis given a sequence of textual prompts. Generating videos from text is particularly challenging due to the …
Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion …
Video prediction is a challenging task. The quality of video frames from current state-of-the-art (SOTA) generative models tends to be poor and generalization beyond the training data …
We present a framework for video modeling based on denoising diffusion probabilistic models that produces long-duration video completions in a variety of realistic environments …
Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …