BREEDS: Benchmarks for subpopulation shift

S Santurkar, D Tsipras, A Madry - arXiv preprint arXiv:2008.04859, 2020 - arxiv.org
We develop a methodology for assessing the robustness of models to subpopulation shift---
specifically, their ability to generalize to novel data subpopulations that were not observed …

Change is hard: A closer look at subpopulation shift

Y Yang, H Zhang, D Katabi, M Ghassemi - arXiv preprint arXiv:2302.12254, 2023 - arxiv.org
Machine learning models often perform poorly on subgroups that are underrepresented in
the training data. Yet little is understood about the variation in mechanisms that cause …

MetaShift: A dataset of datasets for evaluating contextual distribution shifts and training conflicts

W Liang, J Zou - arXiv preprint arXiv:2202.06523, 2022 - arxiv.org
Understanding the performance of machine learning models across diverse data
distributions is critically important for reliable applications. Motivated by this, there is a …

Extending the WILDS benchmark for unsupervised adaptation

S Sagawa, PW Koh, T Lee, I Gao, SM Xie… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning systems deployed in the wild are often trained on a source distribution but
deployed on a different target distribution. Unlabeled data can be a powerful point of …

A fine-grained analysis on distribution shift

O Wiles, S Gowal, F Stimberg, SA Rebuffi… - arXiv preprint arXiv …, 2021 - arxiv.org
Robustness to distribution shifts is critical for deploying machine learning models in the real
world. Despite this necessity, there has been little work in defining the underlying …

Fine-tuning can distort pretrained features and underperform out-of-distribution

A Kumar, A Raghunathan, R Jones, T Ma… - arXiv preprint arXiv …, 2022 - arxiv.org
When transferring a pretrained model to a downstream task, two popular methods are full
fine-tuning (updating all the model parameters) and linear probing (updating only the last …
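The two transfer methods contrasted in this abstract can be sketched in plain numpy. Below, a fixed random projection stands in for a pretrained backbone; linear probing trains only the head on top of the frozen features, whereas full fine-tuning would also update the backbone weights. All names and the toy data are illustrative, not the paper's setup.

```python
import numpy as np

# Sketch of linear probing: the "pretrained backbone" is a fixed random
# projection, and only the linear head on top of it is trained. Full
# fine-tuning would additionally update W_backbone. Toy data; all names
# are hypothetical.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))              # raw inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # toy binary labels

W_backbone = rng.normal(size=(16, 8))       # frozen "pretrained" features
feats = np.tanh(X @ W_backbone)             # extracted features

# Linear probe: a logistic-regression head trained by gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad_w = feats.T @ (p - y) / len(y)     # only the head parameters move
    grad_b = (p - y).mean()
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

acc = ((feats @ w + b > 0) == (y == 1)).mean()
print(f"linear-probe train accuracy: {acc:.2f}")
```

The paper's point is that which of these two regimes generalizes better depends on the distribution shift between pretraining and downstream data; the sketch only fixes the vocabulary.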

Wilds: A benchmark of in-the-wild distribution shifts

PW Koh, S Sagawa, H Marklund… - International …, 2021 - proceedings.mlr.press
Distribution shifts—where the training distribution differs from the test distribution—can
substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …

Bias mimicking: A simple sampling approach for bias mitigation

M Qraitem, K Saenko… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Prior work has shown that Visual Recognition datasets frequently underrepresent bias
groups B (e.g. Female) within class labels Y (e.g. Programmers). This dataset bias can lead to …
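The kind of class-conditioned subsampling this abstract alludes to can be illustrated on toy data. The sketch below simply equalizes the bias-group counts within each class so that class and group become uncorrelated; it is a hypothetical simplification in the spirit of the approach, not the paper's bias-mimicking scheme.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy dataset: (class_label, bias_group) pairs where the
# bias group is correlated with the class.
random.seed(0)
data = ([("programmer", "male")] * 80 + [("programmer", "female")] * 20 +
        [("nurse", "male")] * 20 + [("nurse", "female")] * 80)

def subsample_balanced(samples):
    """Subsample each (class, group) cell down to the size of the smallest
    cell within its class, so every class has equal group proportions.
    A simplified sketch, not the paper's bias-mimicking procedure."""
    cells = defaultdict(list)
    for cls, grp in samples:
        cells[(cls, grp)].append((cls, grp))
    per_class_min = defaultdict(lambda: float("inf"))
    for (cls, _grp), items in cells.items():
        per_class_min[cls] = min(per_class_min[cls], len(items))
    out = []
    for (cls, _grp), items in cells.items():
        out.extend(random.sample(items, per_class_min[cls]))
    return out

balanced = subsample_balanced(data)
print(Counter(balanced))  # every (class, group) cell is subsampled to 20
```

Because only sampling changes, no model architecture or loss needs to be modified, which is the appeal of sampling-based mitigation the abstract highlights.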

Distilling model failures as directions in latent space

S Jain, H Lawrence, A Moitra, A Madry - arXiv preprint arXiv:2206.14754, 2022 - arxiv.org
Existing methods for isolating hard subpopulations and spurious correlations in datasets
often require human intervention. This can make these methods labor-intensive and dataset …

UMIX: Improving importance weighting for subpopulation shift via uncertainty-aware mixup

Z Han, Z Liang, F Yang, L Liu, L Li… - Advances in …, 2022 - proceedings.neurips.cc
Subpopulation shift widely exists in many real-world machine learning applications, referring
to the training and test distributions containing the same subpopulation groups but varying in …
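The mixup operation that UMIX builds on can be sketched in a few lines of numpy. This is generic mixup only: per the abstract, UMIX additionally reweights samples by model uncertainty, and that weighting is omitted here.

```python
import numpy as np

# Generic mixup sketch: convex-combine random pairs of examples and their
# one-hot labels. UMIX's uncertainty-aware sample weighting is not shown.

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))                 # a toy batch of features
y = np.eye(2)[rng.integers(0, 2, size=6)]   # one-hot labels

lam = rng.beta(0.4, 0.4)                    # mixing coefficient ~ Beta(a, a)
perm = rng.permutation(len(x))              # random pairing within the batch
x_mix = lam * x + (1 - lam) * x[perm]
y_mix = lam * y + (1 - lam) * y[perm]

# Mixed labels remain valid distributions over the two classes.
print(np.allclose(y_mix.sum(axis=1), 1.0))  # True
```

Because the labels are mixed with the same coefficient as the inputs, each mixed target still sums to one, so a standard cross-entropy loss applies unchanged.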