Transformation-based adversarial video prediction on large-scale data

S Oprea, P Martinez-Gonzalez… - … on Pattern Analysis …, 2020 - ieeexplore.ieee.org

The ability to predict, anticipate and reason about future outcomes is a key component of
intelligent decision-making systems. In light of the success of deep learning in computer …

Cited by 272 Related articles All 14 versions

Teleoperation methods and enhancement techniques for mobile robots: A comprehensive survey

MD Moniruzzaman, A Rassau, D Chai… - Robotics and Autonomous …, 2022 - Elsevier

In a world with rapidly growing levels of automation, robotics is playing an increasingly
significant role in every aspect of human endeavour. In particular, many types of mobile …

Cited by 77 Related articles All 3 versions

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Cited by 538 Related articles All 6 versions

Phenaki: Variable length video generation from open domain textual descriptions

R Villegas, M Babaeizadeh, PJ Kindermans… - International …, 2022 - openreview.net

We present Phenaki, a model capable of realistic video synthesis given a sequence of
textual prompts. Generating videos from text is particularly challenging due to the …

Cited by 274 Related articles All 5 versions

Video diffusion models

J Ho, T Salimans, A Gritsenko… - Advances in …, 2022 - proceedings.neurips.cc

Generating temporally coherent high fidelity video is an important milestone in generative
modeling research. We make progress towards this milestone by proposing a diffusion …

Cited by 963 Related articles All 8 versions

MCVD - Masked conditional video diffusion for prediction, generation, and interpolation

V Voleti, A Jolicoeur-Martineau… - Advances in neural …, 2022 - proceedings.neurips.cc

Video prediction is a challenging task. The quality of video frames from current
state-of-the-art (SOTA) generative models tends to be poor and generalization beyond the training data …

Cited by 192 Related articles All 9 versions

MAGVIT: Masked generative video transformer

L Yu, Y Cheng, K Sohn, J Lezama… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various
video synthesis tasks with a single model. We introduce a 3D tokenizer to quantize a video …

Cited by 101 Related articles All 8 versions

SimVP: Simpler yet better video prediction

Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com

From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …

Cited by 175 Related articles All 6 versions

NÜWA: Visual synthesis pre-training for neural visual world creation

C Wu, J Liang, L Ji, F Yang, Y Fang, D Jiang… - European conference on …, 2022 - Springer

This paper presents a unified multimodal pre-trained model called NÜWA that can generate
new or manipulate existing visual data (i.e., images and videos) for various visual synthesis …

Cited by 276 Related articles All 6 versions

SimDA: Simple diffusion adapter for efficient video generation

Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

The recent wave of AI-generated content has witnessed the great development and success
of Text-to-Image (T2I) technologies. By contrast Text-to-Video (T2V) still falls short of …

Cited by 35 Related articles All 3 versions
