Neural network classifiers can largely rely on simple spurious features, such as backgrounds, to make predictions. However, even in these cases, we show that they still …
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new …
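A minimal PyTorch sketch of the strategy this snippet describes, under assumptions of our own (the snippet names no architecture; resnet18 and the 10-class head are illustrative): freeze the pre-trained backbone so its features are preserved, and train only a fresh final layer.

```python
import torch
import torchvision.models as models

# Illustrative pre-trained backbone; the snippet above does not specify one.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every parameter to preserve the learned features.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a trainable head for the
# new task (a hypothetical 10-class problem).
num_classes = 10
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are given to the optimizer.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```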
Deep classifiers are known to rely on spurious features—patterns which are correlated with the target on the training data but not inherently relevant to the learning problem, such as the …
Out-of-Distribution (OOD) detection uses a model to separate In-Distribution (ID) data from OOD data among its inputs. This problem has attracted increasing attention in the area of …
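One standard baseline for this task, sketched here as our own illustration rather than the method of the snippet above, scores each input by the model's maximum softmax probability (Hendrycks & Gimpel, 2017); inputs with low confidence are flagged as OOD. The threshold is an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F

def msp_score(model, x):
    """Maximum softmax probability: higher means more ID-like."""
    with torch.no_grad():
        logits = model(x)
        probs = F.softmax(logits, dim=-1)
    return probs.max(dim=-1).values

def detect_ood(model, x, threshold=0.5):
    """Flag inputs whose confidence falls below the assumed threshold."""
    return msp_score(model, x) < threshold  # True = predicted OOD
```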
We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is …
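A worked form of the setting this snippet describes, in notation we introduce for illustration (the symbols z, f, x are ours, not quoted from the abstract): Gaussian latents, an unknown mixing function, and latent interventions that shift one coordinate's distribution.

```latex
% Illustrative formulation (our notation, not the paper's):
% observations x arise from Gaussian latents z via an unknown mixing f.
z \sim \mathcal{N}(\mu, \Sigma), \qquad x = f(z)
% An unknown latent intervention alters the mechanism of a single
% coordinate z_i, shifting its mean and variance:
z_i \sim \mathcal{N}(\tilde{\mu}_i, \tilde{\sigma}_i^2)
```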
Shortcut Learning of Large Language Models in Natural Language Understanding. Communications of the ACM, Vol. 67, No. 1 (January 2024), p. 110.
Machine learning models often perform poorly on subgroups that are underrepresented in the training data. Yet, little is understood about the variation in mechanisms that cause …
We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are …
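A minimal sketch of how mode connectivity is commonly probed, given as our own illustration rather than the procedure of the paper above: linearly interpolate between the weights of two independently trained networks and evaluate the loss along the path. Low loss at every interpolation point suggests the two minimizers are (linearly) connected.

```python
import copy
import torch

def interpolate_weights(model_a, model_b, alpha):
    """Return a model whose weights are (1 - alpha) * A + alpha * B."""
    model = copy.deepcopy(model_a)
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    mixed = {k: (1 - alpha) * state_a[k] + alpha * state_b[k] for k in state_a}
    model.load_state_dict(mixed)
    return model

def loss_along_path(model_a, model_b, loss_fn, x, y, steps=11):
    """Evaluate the loss at evenly spaced points between two minimizers."""
    losses = []
    for i in range(steps):
        alpha = i / (steps - 1)
        model = interpolate_weights(model_a, model_b, alpha)
        model.eval()
        with torch.no_grad():
            losses.append(loss_fn(model(x), y).item())
    return losses
```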
Recently, there has been a surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data. Most efforts are devoted to advancing …