Generative adversarial networks (GANs) have recently become a hot research topic; however, they have been studied since 2014, and a large number of algorithms have been …
Abstract Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower …
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …
Video prediction is a challenging task. The quality of video frames from current state-of-the- art (SOTA) generative models tends to be poor and generalization beyond the training data …
Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Abstract From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent wave of AI-generated content has witnessed the great development and success of Text-to-Image (T2I) technologies. By contrast Text-to-Video (T2V) still falls short of …
Building a robot that can understand and learn to interact by watching humans has inspired several vision problems. However, despite some successful results on static datasets, it …
Z Geng, K Sun, B Xiao, Z Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. We study the dense keypoint regression framework that is previously inferior to …
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to …