Computer vision for autonomous vehicles: Problems, datasets and state of the art

J Janai, F Güney, A Behl, A Geiger - Foundations and Trends® …, 2020 - nowpublishers.com
Recent years have witnessed enormous progress in AI-related fields such as computer
vision, machine learning, and autonomous vehicles. As with any rapidly growing field, it …

Advances in solar forecasting: Computer vision with deep learning

Q Paletta, G Terrén-Serrano, Y Nie, B Li… - Advances in Applied …, 2023 - Elsevier
Renewable energy forecasting is crucial for integrating variable energy sources into the grid.
It allows power systems to address the intermittency of the energy supply at different …

VideoComposer: Compositional video synthesis with motion controllability

X Wang, H Yuan, S Zhang, D Chen… - Advances in …, 2024 - proceedings.neurips.cc
The pursuit of controllability as a higher standard of visual content creation has yielded
remarkable progress in customizable image synthesis. However, achieving controllable …

Learning to estimate hidden motions with global motion aggregation

S Jiang, D Campbell, Y Lu, H Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
Occlusions pose a significant challenge to optical flow algorithms that rely on local
evidence. We consider an occluded point to be one that is imaged in the first frame but not …

Self-supervised co-training for video representation learning

T Han, W Xie, A Zisserman - Advances in neural information …, 2020 - proceedings.neurips.cc
The objective of this paper is visual-only self-supervised video representation learning. We
make the following contributions: (i) we investigate the benefit of adding semantic-class …

RAFT: Recurrent all-pairs field transforms for optical flow

Z Teed, J Deng - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network
architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D …

TEA: Temporal excitation and aggregation for action recognition

Y Li, B Ji, X Shi, J Zhang, B Kang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Temporal modeling is key for action recognition in videos. It normally considers both short-
range motions and long-range aggregations. In this paper, we propose a Temporal …

Contrastive multiview coding

Y Tian, D Krishnan, P Isola - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
Humans view the world through many sensory channels, e.g., the long-wavelength light
channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right …

Rescaling egocentric vision: Collection, pipeline and challenges for EPIC-KITCHENS-100

D Damen, H Doughty, GM Farinella, A Furnari… - International Journal of …, 2022 - Springer
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-
KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …

Skeleton aware multi-modal sign language recognition

S Jiang, B Sun, L Wang, Y Bai… - Proceedings of the …, 2021 - openaccess.thecvf.com
Sign language is commonly used by deaf or speech impaired people to communicate but
requires significant effort to master. Sign Language Recognition (SLR) aims to bridge the …