On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly
successful in recent years and has led to dramatic improvements in multiple domains …

Implicit bias of gradient descent for two-layer ReLU and leaky ReLU networks on nearly-orthogonal data

Y Kou, Z Chen, Q Gu - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The implicit bias towards solutions with favorable properties is believed to be a key reason
why neural networks trained by gradient-based optimization can generalize well. While the …

Implicit regularization towards rank minimization in ReLU networks

N Timor, G Vardi, O Shamir - International Conference on …, 2023 - proceedings.mlr.press
We study the conjectured relationship between the implicit regularization in neural networks,
trained with gradient-based methods, and rank minimization of their weight matrices …

Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization

S Frei, G Vardi, P Bartlett… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …

The double-edged sword of implicit bias: Generalization vs. robustness in ReLU networks

S Frei, G Vardi, P Bartlett… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we study the implications of the implicit bias of gradient flow on generalization
and adversarial robustness in ReLU networks. We focus on a setting where the data …

Understanding multi-phase optimization dynamics and rich nonlinear behaviors of ReLU networks

M Wang, C Ma - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
The training process of ReLU neural networks often exhibits complicated nonlinear
phenomena. The nonlinearity of models and non-convexity of loss pose significant …

Implicit bias in leaky ReLU networks trained on high-dimensional data

S Frei, G Vardi, PL Bartlett, N Srebro, W Hu - arXiv preprint arXiv …, 2022 - arxiv.org
The implicit biases of gradient-based optimization algorithms are conjectured to be a major
factor in the success of modern deep learning. In this work, we investigate the implicit bias of …

From tempered to benign overfitting in ReLU neural networks

G Kornowski, G Yehudai… - Advances in Neural …, 2024 - proceedings.neurips.cc
Overparameterized neural networks (NNs) are observed to generalize well even when
trained to perfectly fit noisy data. This phenomenon motivated a large body of work on …

On margin maximization in linear and ReLU networks

G Vardi, O Shamir, N Srebro - Advances in Neural …, 2022 - proceedings.neurips.cc
The implicit bias of neural networks has been extensively studied in recent years. Lyu and Li
(2019) showed that in homogeneous networks trained with the exponential or the logistic …

Penalising the biases in norm regularisation enforces sparsity

E Boursier, N Flammarion - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Controlling the parameters' norm often yields good generalisation when training neural
networks. Beyond simple intuitions, the relation between regularising parameters' norm and …