Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization

S Sagawa, PW Koh, TB Hashimoto, P Liang - arXiv preprint arXiv …, 2019 - arxiv.org
Overparameterized neural networks can be highly accurate on average on an i.i.d. test set yet
consistently fail on atypical groups of the data (e.g., by learning spurious correlations that …
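A minimal sketch of the worst-group objective the title refers to, assuming a PyTorch setup with per-example group labels; the paper additionally pairs this objective with an online group-reweighting update and strong regularization, so treat this as an illustration rather than the released code:

```python
# Sketch of a worst-group (group DRO) training loss, assuming per-example
# group ids are available. Illustrative only, not the authors' code.
import torch
import torch.nn.functional as F


def worst_group_loss(logits, labels, groups, num_groups):
    """Return the largest per-group average cross-entropy loss in the batch."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    group_losses = []
    for g in range(num_groups):
        mask = groups == g
        if mask.any():
            group_losses.append(per_example[mask].mean())
    return torch.stack(group_losses).max()


# Usage inside a training loop (names are placeholders):
# logits = model(x); loss = worst_group_loss(logits, y, g, num_groups=4)
# loss.backward(); optimizer.step()
```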

Examining and combating spurious features under distribution shift

C Zhou, X Ma, P Michel… - … Conference on Machine …, 2021 - proceedings.mlr.press
A central goal of machine learning is to learn robust representations that capture the
fundamental relationship between inputs and output labels. However, minimizing training …

Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift

A Kumar, T Ma, P Liang… - Uncertainty in Artificial …, 2022 - proceedings.mlr.press
We often see undesirable tradeoffs in robust machine learning where out-of-distribution
(OOD) accuracy is at odds with in-distribution (ID) accuracy. A robust classifier obtained via …
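One hedged sketch of a calibrated ensemble, assuming each member is temperature-scaled on held-out in-distribution data before its probabilities are averaged; the exact recipe in the paper may differ:

```python
# Sketch: fit a temperature per model, then average calibrated probabilities.
# Illustrative only; function names are assumptions.
import torch
import torch.nn.functional as F


def fit_temperature(logits, labels, steps=200, lr=0.01):
    """Fit a single temperature T by minimizing NLL on held-out logits."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().detach()


def calibrated_ensemble(logits_list, temps):
    """Average the temperature-scaled probabilities of each ensemble member."""
    probs = [F.softmax(l / t, dim=-1) for l, t in zip(logits_list, temps)]
    return torch.stack(probs).mean(dim=0)
```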

Towards last-layer retraining for group robustness with fewer annotations

T LaBonte, V Muthukumar… - Advances in Neural …, 2024 - proceedings.neurips.cc
Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious
correlations and poor generalization on minority groups. The recent deep feature …
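A minimal sketch of last-layer retraining: freeze the feature extractor and refit only a linear head on a held-out (ideally group-balanced) set. The helper names below are hypothetical, not the authors' code:

```python
# Sketch of last-layer retraining on frozen features. Illustrative only.
import torch
from sklearn.linear_model import LogisticRegression


@torch.no_grad()
def extract_features(encoder, loader, device="cpu"):
    feats, labels = [], []
    encoder.eval()
    for x, y in loader:
        feats.append(encoder(x.to(device)).cpu())
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()


def retrain_last_layer(encoder, heldout_loader, C=1.0):
    """Fit a fresh linear classifier on frozen encoder features."""
    X, y = extract_features(encoder, heldout_loader)
    head = LogisticRegression(C=C, max_iter=1000)
    head.fit(X, y)
    return head
```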

Diversify and disambiguate: Out-of-distribution robustness via disagreement

Y Lee, H Yao, C Finn - The Eleventh International Conference on …, 2023 - openreview.net
Real-world machine learning problems often exhibit shifts between the source and target
distributions, in which source data does not fully convey the desired behavior on target …
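One way to make multiple heads disagree on unlabeled target data is to penalize the mutual information between their predictions; the sketch below is an illustration of that idea under assumptions, not the paper's exact objective:

```python
# Sketch of a disagreement penalty between two heads' predictions on an
# unlabeled target batch. Illustrative only.
import torch


def mutual_information_penalty(probs1, probs2, eps=1e-8):
    """Batch estimate of the MI between two heads' predicted labels.

    probs1, probs2: (batch, num_classes) softmax outputs.
    """
    joint = torch.einsum("bi,bj->ij", probs1, probs2) / probs1.size(0)
    m1 = joint.sum(dim=1, keepdim=True)   # marginal of head 1
    m2 = joint.sum(dim=0, keepdim=True)   # marginal of head 2
    return (joint * (joint.add(eps).log() - (m1 * m2).add(eps).log())).sum()


# Usage idea: total loss = cross-entropy of both heads on labeled source
# data + lambda * mutual_information_penalty(...) on unlabeled target data.
```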

Improving out-of-distribution robustness via selective augmentation

H Yao, Y Wang, S Li, L Zhang… - International …, 2022 - proceedings.mlr.press
Machine learning algorithms typically assume that training and test examples are
drawn from the same distribution. However, distribution shift is a common problem in real …
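A sketch of one selective-augmentation strategy consistent with the title: mixup restricted to pairs that share a label but come from different domains. Names, fallbacks, and hyperparameters are illustrative assumptions:

```python
# Sketch of intra-label mixup across domains: each example is mixed with
# another example that has the same label but a different domain id.
import torch


def intra_label_mixup(x, y, domains, alpha=2.0):
    n = x.size(0)
    lam = torch.distributions.Beta(alpha, alpha).sample((n,)).to(x.device)
    perm = torch.randperm(n, device=x.device)
    partner = perm.clone()
    # keep a proposed partner only if it shares the label and differs in domain
    ok = (y[perm] == y) & (domains[perm] != domains)
    partner[~ok] = torch.arange(n, device=x.device)[~ok]  # fall back: no mixing
    lam = lam.view(-1, *([1] * (x.dim() - 1)))
    x_mix = lam * x + (1 - lam) * x[partner]
    return x_mix, y  # labels unchanged because partners share the label
```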

Diverse weight averaging for out-of-distribution generalization

A Rame, M Kirchmeyer, T Rahier… - Advances in …, 2022 - proceedings.neurips.cc
Standard neural networks struggle to generalize under distribution shifts in computer vision.
Fortunately, combining multiple networks can consistently improve out-of-distribution …
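Weight averaging itself is a short operation over state dicts; a minimal sketch, assuming all models share the same architecture and a common initialization:

```python
# Sketch of averaging the parameters of several fine-tuned models.
# Illustrative only; normalization statistics may need recomputation.
import copy
import torch


def average_weights(models):
    """Return a new model whose parameters are the mean of the inputs'."""
    avg = copy.deepcopy(models[0])
    avg_state = avg.state_dict()
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in models])
        avg_state[key] = stacked.mean(dim=0).to(avg_state[key].dtype)
    avg.load_state_dict(avg_state)
    return avg
```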

Evading the simplicity bias: Training a diverse set of models discovers solutions with superior OOD generalization

D Teney, E Abbasnejad, S Lucey… - Proceedings of the …, 2022 - openaccess.thecvf.com
Neural networks trained with SGD were recently shown to rely preferentially on linearly-
predictive features and can ignore complex, equally-predictive ones. This simplicity bias can …
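One way to instantiate "training a diverse set of models" is to penalize alignment between the models' gradients with respect to shared inputs or features; the sketch below is an illustration under that assumption, not the authors' exact regularizer:

```python
# Sketch of a diversity penalty: discourage different heads from relying
# on the same input features by penalizing aligned input gradients.
import torch
import torch.nn.functional as F


def input_gradient_diversity_penalty(heads, features):
    """Mean squared cosine similarity between pairs of heads' input gradients."""
    feats = features.detach().requires_grad_(True)
    grads = []
    for head in heads:
        out = head(feats).sum()
        (g,) = torch.autograd.grad(out, feats, create_graph=True)
        grads.append(g.flatten(1))
    penalty, pairs = 0.0, 0
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            cos = F.cosine_similarity(grads[i], grads[j], dim=1)
            penalty = penalty + cos.pow(2).mean()
            pairs += 1
    return penalty / max(pairs, 1)
```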

Bayesian invariant risk minimization

Y Lin, H Dong, H Wang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Generalization under distributional shift is an open challenge for machine learning. Invariant
Risk Minimization (IRM) is a promising framework to tackle this issue by extracting invariant …
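For context, a minimal sketch of the IRMv1 penalty that Bayesian variants build on: the squared gradient of each environment's risk with respect to a fixed dummy classifier scale. Illustrative PyTorch, not the paper's code:

```python
# Sketch of the IRMv1 objective: average ERM loss plus a per-environment
# invariance penalty. Illustrative only.
import torch
import torch.nn.functional as F


def irm_penalty(logits, labels):
    scale = torch.ones(1, requires_grad=True, device=logits.device)
    loss = F.cross_entropy(logits * scale, labels)
    (grad,) = torch.autograd.grad(loss, [scale], create_graph=True)
    return (grad ** 2).sum()


def irm_objective(per_env_batches, model, penalty_weight=1.0):
    """per_env_batches: list of (x, y) batches, one per environment."""
    total, penalty = 0.0, 0.0
    for x, y in per_env_batches:
        logits = model(x)
        total = total + F.cross_entropy(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    n = len(per_env_batches)
    return total / n + penalty_weight * penalty / n
```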

When does preconditioning help or hurt generalization?

S Amari, J Ba, R Grosse, X Li, A Nitanda… - arXiv preprint arXiv …, 2020 - arxiv.org
While second order optimizers such as natural gradient descent (NGD) often speed up
optimization, their effect on generalization has been called into question. This work presents …
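A toy diagonal preconditioner that interpolates between a plain gradient step and a fully preconditioned (natural-gradient-like) step; the preconditioner family analyzed in the paper differs, so treat this purely as an illustration:

```python
# Toy optimizer: theta <- theta - lr * grad / (v + eps)**alpha, where v is
# a running average of squared gradients used as a cheap diagonal
# curvature proxy. alpha = 0 recovers plain SGD; alpha = 1 divides fully
# by the estimated curvature. Illustrative only.
import torch


class DiagonalPreconditionedSGD:
    def __init__(self, params, lr=0.1, alpha=0.5, beta=0.99, eps=1e-8):
        self.params = list(params)
        self.lr, self.alpha, self.beta, self.eps = lr, alpha, beta, eps
        self.v = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def step(self):
        for p, v in zip(self.params, self.v):
            if p.grad is None:
                continue
            v.mul_(self.beta).add_((1 - self.beta) * p.grad * p.grad)
            p.add_(-self.lr * p.grad / (v + self.eps).pow(self.alpha))

    def zero_grad(self):
        for p in self.params:
            if p.grad is not None:
                p.grad.zero_()
```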