Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient …
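As a schematic statement of the phenomenon (standard background, not quoted from this snippet): a predictor $\hat f$ trained on noisy data $(x_i, y_i)_{i=1}^n$ overfits benignly when it interpolates the training set yet still attains near-optimal risk,
$$\hat f(x_i) = y_i \ \ \text{for all } i, \qquad \text{while} \qquad R(\hat f) := \mathbb{E}\,\ell\big(\hat f(x), y\big) \approx \inf_{f} R(f).$$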
Z Shi, J Wei, Y Liang - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Neural networks have achieved remarkable empirical performance, while the current theoretical analysis is not adequate for understanding their success, e.g., the Neural Tangent …
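The truncated phrase above appears to refer to the Neural Tangent Kernel (NTK). For orientation (this definition is standard background, not taken from the snippet), the NTK of a network $f(\theta; x)$ at parameters $\theta$ is
$$\Theta(x, x') = \big\langle \nabla_\theta f(\theta; x),\; \nabla_\theta f(\theta; x') \big\rangle,$$
and NTK-based analyses study regimes in which this kernel stays nearly constant during training, which is often cited as a limitation for explaining feature learning.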
Y Kou, Z Chen, Y Chen, Q Gu - International Conference on …, 2023 - proceedings.mlr.press
Modern deep learning models with great expressive power can be trained to overfit the training data but still generalize well. This phenomenon is referred to as benign overfitting …
The implicit biases of gradient-based optimization algorithms are conjectured to be a major factor in the success of modern deep learning. In this work, we investigate the implicit bias of …
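A canonical instance of such implicit bias, included here only as background for linear models: gradient descent on the logistic loss over linearly separable data converges in direction to the maximum-margin separator,
$$\lim_{t \to \infty} \frac{w(t)}{\|w(t)\|} = \frac{\hat w}{\|\hat w\|}, \qquad \hat w = \arg\min_{w} \|w\|_2 \ \ \text{s.t.}\ \ y_i\, w^\top x_i \ge 1 \ \ \forall i,$$
i.e., the optimizer selects a particular solution among the many that fit the data, without any explicit regularization.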
In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are …
F Huang - arXiv preprint arXiv:2303.03944, 2023 - arxiv.org
Bilevel optimization is a popular two-level hierarchical optimization framework that has been widely applied to many machine learning tasks such as hyperparameter learning, meta learning …
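In its standard form (the generic formulation, which may differ in details from the one studied in this paper), bilevel optimization minimizes an upper-level objective subject to the lower-level variable being optimal for its own problem:
$$\min_{x} \; F(x) := f\big(x, y^*(x)\big) \qquad \text{s.t.} \qquad y^*(x) \in \arg\min_{y}\, g(x, y).$$
In hyperparameter learning, for example, $x$ plays the role of the hyperparameters, $g$ the training loss, and $f$ the validation loss.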
S Frei, D Zou, Z Chen, Q Gu - International Conference on …, 2022 - proceedings.mlr.press
We consider a binary classification problem when the data comes from a mixture of two rotationally symmetric distributions satisfying concentration and anti-concentration …
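A concrete instance of such a mixture (an illustrative special case, not necessarily the exact model of the paper): labels are uniform signs and inputs are a signed mean vector plus rotationally symmetric noise, e.g.
$$y \sim \mathrm{Unif}\{\pm 1\}, \qquad x = y\,\mu + \xi, \qquad \xi \sim \mathcal{N}(0, \sigma^2 I_d),$$
where Gaussian noise $\xi$ satisfies the required concentration and anti-concentration properties.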
We study the optimization of wide neural networks (NNs) via gradient flow (GF) in setups that allow feature learning while admitting non-asymptotic global convergence guarantees. First …
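For reference, gradient flow is the continuous-time limit of gradient descent: the parameters $\theta(t)$ follow the ODE
$$\frac{d\theta(t)}{dt} = -\nabla_\theta L\big(\theta(t)\big),$$
which gradient descent with step size $\eta$ discretizes as $\theta_{k+1} = \theta_k - \eta \nabla_\theta L(\theta_k)$.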
F Huang - arXiv preprint arXiv:2303.03984, 2023 - arxiv.org
In the paper, we study a class of nonconvex-nonconcave minimax optimization problems (i.e., $\min_x \max_y f(x, y)$), where $f(x, y)$ is possibly nonconvex in $x$, and it is …
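The baseline first-order method for this problem class is gradient descent ascent (GDA), sketched here only for orientation and not as the algorithm proposed in the paper:
$$x_{t+1} = x_t - \eta_x \nabla_x f(x_t, y_t), \qquad y_{t+1} = y_t + \eta_y \nabla_y f(x_t, y_t),$$
where variants for the nonconvex-nonconcave setting typically use different step sizes $\eta_x \ne \eta_y$ (two-timescale GDA) or extrapolation steps.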