On the need for a language describing distribution shifts: Illustrations on tabular datasets

J Liu, T Wang, P Cui… - Advances in Neural …, 2024 - proceedings.neurips.cc
Different distribution shifts require different algorithmic and operational interventions.
Methodological research must be grounded by the specific shifts they address. Although …

Active learning on a budget: Opposite strategies suit high and low budgets

G Hacohen, A Dekel, D Weinshall - arXiv preprint arXiv:2202.02794, 2022 - arxiv.org
Investigating active learning, we focus on the relation between the number of labeled
examples (budget size), and suitable querying strategies. Our theoretical analysis shows a …

A survey of deep active learning for foundation models

T Wan, K Xu, T Yu, X Wang, D Feng, B Ding… - Intelligent …, 2023 - spj.science.org
Active learning (AL) is an effective sample selection approach that annotates only a subset
of the training data to address the challenge of data annotation, and deep learning (DL) is …

Active learning through a covering lens

O Yehuda, A Dekel, G Hacohen… - Advances in Neural …, 2022 - proceedings.neurips.cc
Deep active learning aims to reduce the annotation cost for the training of deep models,
which is notoriously data-hungry. Until recently, deep active learning methods were …

Active finetuning: Exploiting annotation budget in the pretraining-finetuning paradigm

Y Xie, H Lu, J Yan, X Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Given the large-scale data and the high annotation cost, pretraining-finetuning becomes a
popular paradigm in multiple computer vision tasks. Previous research has covered both the …

Towards free data selection with general-purpose models

Y Xie, M Ding, M Tomizuka… - Advances in Neural …, 2024 - proceedings.neurips.cc
A desirable data selection algorithm can efficiently choose the most informative samples to
maximize the utility of limited annotation budgets. However, current approaches …

A comprehensive survey on deep active learning in medical image analysis

H Wang, Q Jin, S Li, S Liu, M Wang, Z Song - Medical Image Analysis, 2024 - Elsevier
Deep learning has achieved widespread success in medical image analysis, leading to an
increasing demand for large-scale expert-annotated medical image datasets. Yet, the high …

Optimizing data collection for machine learning

R Mahmood, J Lucas, JM Alvarez… - Advances in Neural …, 2022 - proceedings.neurips.cc
Modern deep learning systems require huge data sets to achieve impressive performance,
but there is little guidance on how much or what kind of data to collect. Over-collecting data …

Knowledge-aware federated active learning with non-IID data

YT Cao, Y Shi, B Yu, J Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Federated learning enables multiple decentralized clients to learn collaboratively without
sharing local data. However, the expensive annotation cost on local clients remains an …

How much more data do I need? Estimating requirements for downstream tasks

R Mahmood, J Lucas, D Acuna, D Li… - Proceedings of the …, 2022 - openaccess.thecvf.com
Given a small training data set and a learning algorithm, how much more data is necessary
to reach a target validation or test performance? This question is of critical importance in …