Agreement-on-the-line: Predicting the performance of neural networks under distribution shift

C Baek, Y Jiang, A Raghunathan… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recently, Miller et al. showed that a model's in-distribution (ID) accuracy has a strong linear
correlation with its out-of-distribution (OOD) accuracy, on several OOD benchmarks, a …
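The linear ID-versus-OOD accuracy trend this paper builds on can be sketched as a simple fit across a collection of models. A minimal illustration, with synthetic accuracy values (not from the paper) standing in for measured ID/OOD accuracies:

```python
import numpy as np

# Hypothetical (ID accuracy, OOD accuracy) pairs for several trained models,
# illustrating the linear trend observed by Miller et al. Values are synthetic.
id_acc = np.array([0.70, 0.75, 0.80, 0.85, 0.90])
ood_acc = np.array([0.50, 0.56, 0.62, 0.68, 0.74])

# Fit a linear trend OOD = a * ID + b across the model collection.
a, b = np.polyfit(id_acc, ood_acc, deg=1)

def predict_ood(id_accuracy: float) -> float:
    """Predict a new model's OOD accuracy from its ID accuracy alone."""
    return a * id_accuracy + b
```

Once the trend line is fit on held-out models, a new model's OOD accuracy can be estimated without labeled OOD data, which is the practical appeal of the observation.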

Leveraging unlabeled data to predict out-of-distribution performance

S Garg, S Balakrishnan, ZC Lipton… - arXiv preprint arXiv …, 2022 - arxiv.org
Real-world machine learning deployments are characterized by mismatches between the
source (training) and target (test) distributions that may cause performance drops. In this …

Datamodels: Predicting predictions from training data

A Ilyas, SM Park, L Engstrom, G Leclerc… - arXiv preprint arXiv …, 2022 - arxiv.org
We present a conceptual framework, datamodeling, for analyzing the behavior of a model
class in terms of the training data. For any fixed "target" example $x$, training set $S$, and …

Towards last-layer retraining for group robustness with fewer annotations

T LaBonte, V Muthukumar… - Advances in Neural …, 2024 - proceedings.neurips.cc
Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious
correlations and poor generalization on minority groups. The recent deep feature …

Detecting errors and estimating accuracy on unlabeled data with self-training ensembles

J Chen, F Liu, B Avci, X Wu… - Advances in Neural …, 2021 - proceedings.neurips.cc
When a deep learning model is deployed in the wild, it can encounter test data drawn from
distributions different from the training data distribution and suffer a drop in performance. For …

ModelDiff: A framework for comparing learning algorithms

H Shah, SM Park, A Ilyas… - … Conference on Machine …, 2023 - proceedings.mlr.press
We study the problem of (learning) algorithm comparison, where the goal is to find
differences between models trained with two different learning algorithms. We begin by …

RLSbench: Domain adaptation under relaxed label shift

S Garg, N Erickson, J Sharpnack… - International …, 2023 - proceedings.mlr.press
Despite the emergence of principled methods for domain adaptation under label shift, their
sensitivity to shifts in class conditional distributions is precariously underexplored …

GNNEvaluator: Evaluating GNN performance on unseen graphs without labels

X Zheng, M Zhang, C Chen, S Molaei… - Advances in Neural …, 2024 - proceedings.neurips.cc
Evaluating the performance of graph neural networks (GNNs) is an essential task for
practical GNN model deployment and serving, as deployed GNNs face significant …

A survey on evaluation of out-of-distribution generalization

H Yu, J Liu, X Zhang, J Wu, P Cui - arXiv preprint arXiv:2403.01874, 2024 - arxiv.org
Machine learning models, while progressively advanced, rely heavily on the IID assumption,
which is often unfulfilled in practice due to inevitable distribution shifts. This renders them …

Predicting out-of-distribution error with the projection norm

Y Yu, Z Yang, A Wei, Y Ma… - … Conference on Machine …, 2022 - proceedings.mlr.press
We propose a metric—Projection Norm—to predict a model's performance on out-of-
distribution (OOD) data without access to ground truth labels. Projection Norm first uses …