The Dollar Street dataset: Images representing the geographic and socioeconomic diversity of the world

WAG Rojas, S Diamos, KR Kini, D Kanter… - … -sixth Conference on …, 2022 - openreview.net
It is crucial that image datasets for computer vision are representative and contain accurate
demographic information to ensure their robustness and fairness, especially for smaller …

Webly supervised fine-grained recognition: Benchmark datasets and an approach

Z Sun, Y Yao, XS Wei, Y Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Learning from the web can ease the extreme dependence of deep learning on large-scale
manually labeled datasets. Especially for fine-grained recognition, which targets at …

Fine-grained recognition in the wild: A multi-task domain adaptation approach

T Gebru, J Hoffman, L Fei-Fei - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
While fine-grained object recognition is an important problem in computer vision, current
models are unlikely to accurately classify objects in the wild. These fully supervised models …

Learning expressive prompting with residuals for vision transformers

R Das, Y Dukler, A Ravichandran… - Proceedings of the …, 2023 - openaccess.thecvf.com
Prompt learning is an efficient approach to adapt transformers by inserting a learnable set of
parameters into the input and intermediate representations of a pre-trained model. In this …

Benchmarking representation learning for natural world image collections

G Van Horn, E Cole, S Beery, K Wilber… - Proceedings of the …, 2021 - openaccess.thecvf.com
Recent progress in self-supervised learning has resulted in models that are capable of
extracting rich representations from image collections without requiring any explicit label …

Aerial images processing for car detection using convolutional neural networks: Comparison between Faster R-CNN and YOLOv3

A Ammar, A Koubaa, M Ahmed, A Saad… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper, we address the problem of car detection from aerial images using
Convolutional Neural Networks (CNN). This problem presents additional challenges as …

Improving visual prompt tuning for self-supervised vision transformers

S Yoo, E Kim, D Jung, J Lee… - … Conference on Machine …, 2023 - proceedings.mlr.press
Visual Prompt Tuning (VPT) is an effective tuning method for adapting pretrained
Vision Transformers (ViTs) to downstream tasks. It leverages extra learnable tokens, known …

Detecting natural disasters, damage, and incidents in the wild

E Weber, N Marzo, DP Papadopoulos, A Biswas… - Computer Vision–ECCV …, 2020 - Springer
Responding to natural disasters, such as earthquakes, floods, and wildfires, is a laborious
task performed by on-the-ground emergency responders and analysts. Social media has …

Efficient adaptation of large vision transformer via adapter re-composing

W Dong, D Yan, Z Lin, P Wang - Advances in Neural …, 2024 - proceedings.neurips.cc
The advent of high-capacity pre-trained models has revolutionized problem-solving in
computer vision, shifting the focus from training task-specific models to adapting pre-trained …

Pl@ntNet-300K: a plant image dataset with high label ambiguity and a long-tailed distribution

C Garcin, A Joly, P Bonnet, JC Lombardo… - NeurIPS 2021-35th …, 2021 - inria.hal.science
This paper presents a novel image dataset with high intrinsic ambiguity and a long-tailed
distribution built from the database of Pl@ntNet citizen observatory. It consists of 306,146 …