H Wang, N Liu, Y Zhang, D Feng, F Huang, D Li… - Frontiers of Information …, 2020 - Springer
Deep reinforcement learning (RL) has become one of the most popular topics in artificial intelligence research. It has been widely used in various fields, such as end-to-end control …
Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion …
Video prediction is a challenging task. The quality of video frames from current state-of-the- art (SOTA) generative models tends to be poor and generalization beyond the training data …
Abstract We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model. We introduce a 3D tokenizer to quantize a video …
This paper presents a unified multimodal pre-trained model called NÜWA that can generate new or manipulate existing visual data (ie, images and videos) for various visual synthesis …
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to …
Intelligent agents need to generalize from past experience to achieve goals in complex environments. World models facilitate such generalization and allow learning behaviors …
We present VideoGPT: a conceptually simple architecture for scaling likelihood based generative modeling to natural videos. VideoGPT uses VQ-VAE that learns downsampled …
The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical context, where the visual dynamics are believed to have modular …