Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
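For context, a minimal sketch of the KKT system typically meant here, assuming the standard margin-maximization formulation for a homogeneous predictor $f(\theta; x)$ on data $\{(x_i, y_i)\}_{i=1}^n$ (notation is ours, not necessarily the paper's):

```latex
% Margin maximization:  min_theta (1/2)||theta||^2  s.t.  y_i f(theta; x_i) >= 1 for all i.
% theta is a KKT point iff there exist multipliers lambda_i with:
\begin{align*}
  \theta &= \sum_{i=1}^{n} \lambda_i \, y_i \, \nabla_\theta f(\theta; x_i), \\
  \lambda_i &\ge 0, \qquad y_i f(\theta; x_i) \ge 1 \quad \text{for all } i, \\
  \lambda_i &= 0 \quad \text{whenever } y_i f(\theta; x_i) > 1 \quad \text{(complementary slackness)}.
\end{align*}
```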
Memorization and generalization are complementary cognitive processes that jointly promote adaptive behavior. For example, animals should memorize safe routes to specific …
Y Li, Q Lin - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
The widely observed 'benign overfitting' phenomenon in the neural network literature raises a challenge to the 'bias-variance trade-off' doctrine in statistical learning theory. Since …
Deep learning is a topic of considerable current interest. The availability of massive data collections and powerful software resources has led to an impressive amount of results in …
Overparameterized neural networks (NNs) are observed to generalize well even when trained to perfectly fit noisy data. This phenomenon motivated a large body of work on "…
Neural networks trained by gradient descent (GD) have exhibited a number of surprising generalization behaviors. First, they can achieve a perfect fit to noisy training data and still …
This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that when the data distribution is well-separated, DNNs can …
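The snippet is cut off before the paper's own definition, but one common formalization of 'well-separated' in this literature, stated here as an illustrative assumption rather than the paper's exact condition, is a positive margin between classes:

```latex
% Assumed margin-separation condition (illustrative only): for some gamma > 0,
\[
  \|x - x'\|_2 \ge \gamma
  \quad \text{for all } (x, y),\, (x', y') \text{ in the support with } y \ne y'.
\]
```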
Y Wang, R Sonthalia, W Hu - International Conference on …, 2024 - proceedings.mlr.press
We study the generalization capability of nearly-interpolating linear regressors: $\beta$'s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random …
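A minimal numpy sketch of what "nearly interpolating" means under a random design: a ridge regressor with a small penalty whose training error $\tau$ is positive but sits below the noise floor $\sigma^2$. The dimensions, noise level, and penalty below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 100, 500, 0.5              # overparameterized random design: d > n
X = rng.standard_normal((n, d)) / np.sqrt(d)
beta_star = rng.standard_normal(d)
y = X @ beta_star + sigma * rng.standard_normal(n)

lam = 1e-3                               # small ridge penalty => near-interpolation
beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

tau = np.mean((X @ beta - y) ** 2)       # training error of the fitted regressor
print(f"tau = {tau:.6f}  vs  noise floor sigma^2 = {sigma**2:.4f}")
assert 0 < tau < sigma**2                # positive but below the noise floor
```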
D Barzilai, O Shamir - arXiv preprint arXiv:2312.15995, 2023 - arxiv.org
It is by now well-established that modern over-parameterized models seem to elude the bias-variance tradeoff and generalize well despite overfitting noise. Many recent works attempt to …