Self-supervised learning in remote sensing: A review

Y Wang, CM Albrecht, NAA Braham… - IEEE Geoscience and …, 2022 - ieeexplore.ieee.org
In deep learning research, self-supervised learning (SSL) has received great attention,
triggering interest within both the computer vision and remote sensing communities. While …

A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com
Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Ego4D: Around the world in 3,000 hours of egocentric video

K Grauman, A Westbury, E Byrne… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …

MCVD: Masked conditional video diffusion for prediction, generation, and interpolation

V Voleti, A Jolicoeur-Martineau… - Advances in neural …, 2022 - proceedings.neurips.cc
Video prediction is a challenging task. The quality of video frames from current
state-of-the-art (SOTA) generative models tends to be poor and generalization beyond the training data …

SimVP: Simpler yet better video prediction

Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …

SimDA: Simple diffusion adapter for efficient video generation

Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent wave of AI-generated content has witnessed the great development and success
of Text-to-Image (T2I) technologies. By contrast, Text-to-Video (T2V) still falls short of …

Affordances from human videos as a versatile representation for robotics

S Bahl, R Mendonca, L Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Building a robot that can understand and learn to interact by watching humans has inspired
several vision problems. However, despite some successful results on static datasets, it …

Bottom-up human pose estimation via disentangled keypoint regression

Z Geng, K Sun, B Xiao, Z Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we are interested in the bottom-up paradigm of estimating human poses from
an image. We study the dense keypoint regression framework that is previously inferior to …

Diffusion probabilistic modeling for video generation

R Yang, P Srivastava, S Mandt - Entropy, 2023 - mdpi.com
Denoising diffusion probabilistic models are a promising new class of generative models
that mark a milestone in high-quality image generation. This paper showcases their ability to …