In a world with rapidly growing levels of automation, robotics is playing an increasingly significant role in every aspect of human endeavour. In particular, many types of mobile …
Abstract Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower …
We present Stable Video Diffusion-a latent video diffusion model for high-resolution, state-of- the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …
Video prediction is a challenging task. The quality of video frames from current state-of-the- art (SOTA) generative models tends to be poor and generalization beyond the training data …
Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Abstract From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. We tackle …
Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent wave of AI-generated content has witnessed the great development and success of Text-to-Image (T2I) technologies. By contrast Text-to-Video (T2V) still falls short of …
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to …
The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical context, where the visual dynamics are believed to have modular …