Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new …
S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc
In-context learning is the ability of a model to condition on a prompt sequence consisting of in-context examples (input-output pairs corresponding to some task) along with a new query …
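As a concrete illustration of the prompt structure described above, the sketch below builds one such sequence for a linear-regression task: a list of (input, output) pairs generated by a random linear function, followed by a new query input whose label a model would be asked to predict. The function name `make_icl_prompt`, the dimensions, and the choice of a linear task are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def make_icl_prompt(n_examples=8, dim=4, rng=None):
    """Build one in-context prompt: (x, y) pairs drawn from a random linear
    function, plus a query input whose label the model must predict."""
    rng = np.random.default_rng() if rng is None else rng
    w = rng.normal(size=dim)                    # the task: a random linear function
    xs = rng.normal(size=(n_examples + 1, dim))
    ys = xs @ w                                 # noiseless labels y_i = <w, x_i>
    examples = list(zip(xs[:-1], ys[:-1]))      # in-context (input, output) pairs
    query_x, query_y = xs[-1], ys[-1]           # the new query and its target
    return examples, query_x, query_y

examples, query_x, query_y = make_icl_prompt()
print(len(examples), "in-context examples; query target:", round(float(query_y), 3))
```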
In many modern applications of deep learning, the neural network has many more parameters than the number of data points used for its training. Motivated by this practice, a large …
Grokking, the unusual phenomenon on algorithmic datasets where generalization happens long after the training data are overfit, has remained elusive. We aim to understand grokking …
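For context, grokking is usually studied on small algorithmic tasks such as modular addition, with only a fraction of all input pairs used for training. The sketch below constructs such a dataset and split; the modulus, training fraction, and function name are illustrative assumptions, not the setup of any particular paper.

```python
import numpy as np

def modular_addition_split(p=97, train_frac=0.3, seed=0):
    """All pairs (a, b) with label (a + b) mod p, split so that only a small
    fraction is used for training -- the regime where grokking is reported."""
    rng = np.random.default_rng(seed)
    pairs = np.array([(a, b) for a in range(p) for b in range(p)])
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    idx = rng.permutation(len(pairs))
    n_train = int(train_frac * len(pairs))
    train, test = idx[:n_train], idx[n_train:]
    return (pairs[train], labels[train]), (pairs[test], labels[test])

(train_x, train_y), (test_x, test_y) = modular_addition_split()
print(len(train_x), "training pairs,", len(test_x), "held-out pairs")
# A small network trained on the training pairs (typically with weight decay)
# first memorizes them, and only much later -- sometimes orders of magnitude
# more steps -- reaches high accuracy on the held-out pairs.
```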
The classical bias-variance trade-off predicts that bias decreases and variance increases with model complexity, leading to a U-shaped risk curve. Recent work calls this into question for …
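The tension between the U-shaped curve and modern practice can be seen in a small experiment: fit the minimum-norm least-squares solution on an increasing number of Gaussian features and track the held-out error. In setups like the sketch below, the test risk typically peaks near the interpolation threshold (number of features ≈ number of training points) and falls again beyond it; all sizes, noise levels, and names here are illustrative assumptions.

```python
import numpy as np

def min_norm_risk(n_train=50, n_test=2000, d_total=400,
                  n_feat_list=(5, 25, 45, 50, 55, 100, 300), seed=0):
    """Test risk of the minimum-norm least-squares fit as the number of
    features grows past the number of training points."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d_total) / np.sqrt(d_total)
    X_tr = rng.normal(size=(n_train, d_total))
    X_te = rng.normal(size=(n_test, d_total))
    y_tr = X_tr @ w_true + 0.5 * rng.normal(size=n_train)
    y_te = X_te @ w_true + 0.5 * rng.normal(size=n_test)
    risks = {}
    for k in n_feat_list:
        # Fit using only the first k features; lstsq returns the minimum-norm
        # solution when the system is underdetermined (k > n_train).
        w_hat, *_ = np.linalg.lstsq(X_tr[:, :k], y_tr, rcond=None)
        risks[k] = float(np.mean((X_te[:, :k] @ w_hat - y_te) ** 2))
    return risks

for k, r in min_norm_risk().items():
    print(f"features={k:4d}  test MSE={r:.3f}")
```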
We compare classification and regression tasks in an overparameterized linear model with Gaussian features. On the one hand, we show that with sufficient overparameterization all …
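To make the classification-versus-regression comparison concrete, one minimal (and purely illustrative) experiment fits a single minimum-norm interpolator on Gaussian features and scores the same solution with a regression metric (squared error) and a classification metric (0-1 error). The dimensions and function name below are assumptions, not the paper's construction.

```python
import numpy as np

def compare_tasks(n_train=50, n_feat=500, n_test=5000, seed=1):
    """Fit the minimum-norm interpolator on Gaussian features and evaluate it
    both as a regressor (squared error) and as a classifier (0-1 error)."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=n_feat) / np.sqrt(n_feat)
    X_tr = rng.normal(size=(n_train, n_feat))
    X_te = rng.normal(size=(n_test, n_feat))
    y_tr = np.sign(X_tr @ w_true)               # binary labels from a linear rule
    y_te = np.sign(X_te @ w_true)
    # Minimum-norm solution that interpolates the +/-1 training labels exactly.
    w_hat, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    preds = X_te @ w_hat
    mse = float(np.mean((preds - y_te) ** 2))           # regression-style risk
    err01 = float(np.mean(np.sign(preds) != y_te))      # classification risk
    return mse, err01

mse, err01 = compare_tasks()
print(f"test MSE={mse:.3f}  test 0-1 error={err01:.3f}")
```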
A theoretical understanding of generalization remains an open problem for many machine learning models, including deep networks where overparameterization leads to better …
Z Deng, A Kammoun… - Information and Inference …, 2022 - academic.oup.com
We consider a model for logistic regression where only a fixed-size subset of the features is used for training a linear classifier over the training samples. The classifier is obtained by running …
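Because the snippet's description of the training procedure is truncated, the sketch below stands in plain gradient descent on the logistic loss and picks arbitrary feature counts, purely to illustrate the setup of training a classifier on only a subset of the available features; none of these choices are taken from the paper.

```python
import numpy as np

def subset_logistic(n_train=200, n_test=2000, d_total=300, k=50,
                    steps=500, lr=0.1, seed=2):
    """Train a logistic-regression classifier that sees only the first k of the
    d_total features, then evaluate it on held-out data."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d_total) / np.sqrt(d_total)
    X_tr = rng.normal(size=(n_train, d_total))
    X_te = rng.normal(size=(n_test, d_total))
    y_tr = (X_tr @ w_true > 0).astype(float)    # labels depend on all features
    y_te = (X_te @ w_true > 0).astype(float)
    Xk_tr, Xk_te = X_tr[:, :k], X_te[:, :k]     # the classifier sees only k of them
    w = np.zeros(k)
    for _ in range(steps):                      # plain gradient descent on log-loss
        p = 1.0 / (1.0 + np.exp(-(Xk_tr @ w)))
        w -= lr * Xk_tr.T @ (p - y_tr) / n_train
    test_err = float(np.mean((Xk_te @ w > 0).astype(float) != y_te))
    return test_err

print("held-out 0-1 error with k=50 of 300 features:", round(subset_logistic(), 3))
```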
Deep neural networks can achieve remarkable generalization performance while interpolating the training data. Rather than the U-curve emblematic of the bias-variance …