MetaShift: A dataset of datasets for evaluating contextual distribution shifts and training conflicts

W Liang, J Zou - arXiv preprint arXiv:2202.06523, 2022 - arxiv.org
Understanding the performance of machine learning models across diverse data
distributions is critically important for reliable applications. Motivated by this, there is a …

Patterns of dataset shift

M Kull, P Flach - First international workshop on learning over …, 2014 - dmip.webs.upv.es
Dataset shift is a frequent cause of failure of a predictor. A model which performs well in
several contexts can give bad predictions in other contexts where the data are shifted …

BREEDS: Benchmarks for subpopulation shift

S Santurkar, D Tsipras, A Madry - arXiv preprint arXiv:2008.04859, 2020 - arxiv.org
We develop a methodology for assessing the robustness of models to subpopulation shift---
specifically, their ability to generalize to novel data subpopulations that were not observed …

Dataset interfaces: Diagnosing model failures using controllable counterfactual generation

J Vendrow, S Jain, L Engstrom, A Madry - arXiv preprint arXiv:2302.07865, 2023 - arxiv.org
Distribution shift is a major source of failure for machine learning models. However,
evaluating model reliability under distribution shift can be challenging, especially since it …

Examining and combating spurious features under distribution shift

C Zhou, X Ma, P Michel… - … Conference on Machine …, 2021 - proceedings.mlr.press
A central goal of machine learning is to learn robust representations that capture the
fundamental relationship between inputs and output labels. However, minimizing training …

The many faces of robustness: A critical analysis of out-of-distribution generalization

D Hendrycks, S Basart, N Mu… - Proceedings of the …, 2021 - openaccess.thecvf.com
We introduce four new real-world distribution shift datasets consisting of changes in image
style, image blurriness, geographic location, camera operation, and more. With our new …

Extending the WILDS benchmark for unsupervised adaptation

S Sagawa, PW Koh, T Lee, I Gao, SM Xie… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning systems deployed in the wild are often trained on a source distribution but
deployed on a different target distribution. Unlabeled data can be a powerful point of …

A fine-grained analysis on distribution shift

O Wiles, S Gowal, F Stimberg, S Alvise-Rebuffi… - arXiv preprint arXiv …, 2021 - arxiv.org
Robustness to distribution shifts is critical for deploying machine learning models in the real
world. Despite this necessity, there has been little work in defining the underlying …

Distilling model failures as directions in latent space

S Jain, H Lawrence, A Moitra, A Madry - arXiv preprint arXiv:2206.14754, 2022 - arxiv.org
Existing methods for isolating hard subpopulations and spurious correlations in datasets
often require human intervention. This can make these methods labor-intensive and dataset …

WILDS: A benchmark of in-the-wild distribution shifts

PW Koh, S Sagawa, H Marklund… - International …, 2021 - proceedings.mlr.press
Distribution shifts—where the training distribution differs from the test distribution—can
substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …