Recent works have demonstrated that the sample complexity of gradient-based learning of single index models, i.e., functions that depend on a 1-dimensional projection of the input …
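To make the single index setting concrete, here is a minimal numpy sketch of gradient-based learning of $y = f(\langle \mathbf{w}^\star, \mathbf{x} \rangle)$ on Gaussian inputs; the tanh link, the squared loss, and all hyperparameters are illustrative assumptions, not any particular paper's setup.

```python
# Minimal sketch (assumed setup): learn a single index model
# y = f(<w*, x>) by gradient descent on the squared loss.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 20, 5000, 0.5, 2000

w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)           # hidden ground-truth direction
f = np.tanh                                # link function (assumed known here)

X = rng.standard_normal((n, d))            # Gaussian inputs
y = f(X @ w_star)

w = rng.standard_normal(d) / np.sqrt(d)    # random initialization
for _ in range(steps):
    pred = f(X @ w)
    # gradient of the squared loss through the tanh link (tanh' = 1 - tanh^2)
    grad = X.T @ ((pred - y) * (1 - pred**2)) / n
    w -= lr * grad

# alignment |<w, w*>| / ||w|| near 1 indicates recovery of the direction
print(abs(w @ w_star) / np.linalg.norm(w))
```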
Z Shi, J Wei, Y Liang - arXiv preprint arXiv:2206.01717, 2022 - arxiv.org
An important characteristic of neural networks is their ability to learn representations of the input data with effective features for prediction, which is believed to be a key factor to their …
I Diakonikolas, D Kane, L Ren - International Conference on …, 2023 - proceedings.mlr.press
We study the task of agnostically learning halfspaces under the Gaussian distribution. Specifically, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on …
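As a hedged illustration of the agnostic halfspace setting (our own toy construction, not the paper's algorithm), the snippet below draws Gaussian examples, flips an $\eta$-fraction of the labels of an optimal halfspace, and estimates 0-1 error empirically; in the agnostic model the goal is a hypothesis with error at most $\mathrm{opt} + \epsilon$.

```python
# Toy agnostic-halfspace data: Gaussian x, labels agreeing with a hidden
# halfspace except on an eta-fraction. The flipping rule (random) and
# eta = 0.1 are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(1)
d, n, eta = 10, 20000, 0.1

w_opt = rng.standard_normal(d)
w_opt /= np.linalg.norm(w_opt)

X = rng.standard_normal((n, d))            # x ~ N(0, I_d)
y = np.sign(X @ w_opt)
flip = rng.random(n) < eta                 # corrupted eta-fraction of labels
y[flip] *= -1

def err(w):                                # empirical 0-1 error of sign(<w, x>)
    return np.mean(np.sign(X @ w) != y)

print(err(w_opt))                          # roughly eta, i.e. opt on this data
# agnostic goal: output some w with err(w) <= opt + eps
```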
We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear …
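The multi-index structure $y = g(U\mathbf{x})$ with a hidden rank-$k$ projection can be sketched in a few lines; the choices $k = 2$, $g(z) = z_1 z_2$, and a student with the matching parameterization are assumptions made purely for illustration.

```python
# Toy multi-index regression: the target depends on x only through the
# low-rank projection Ux. Student and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(2)
d, k, n, lr, steps = 30, 2, 8000, 0.05, 3000

U, _ = np.linalg.qr(rng.standard_normal((d, k)))
U = U.T                                             # hidden k x d projection

X = rng.standard_normal((n, d))                     # Gaussian data
Z = X @ U.T
y = Z[:, 0] * Z[:, 1]                               # y = g(Ux), g(z) = z1*z2

W = rng.standard_normal((k, d)) / np.sqrt(d)        # student projection
for _ in range(steps):
    Zs = X @ W.T
    pred = Zs[:, 0] * Zs[:, 1]
    r = (pred - y) / n
    # gradient of the squared loss with respect to each row of W
    gW = np.vstack([(r * Zs[:, 1]) @ X, (r * Zs[:, 0]) @ X])
    W -= lr * gW

# cosines of the principal angles between row spans of W and U;
# values near 1 would indicate recovery of the hidden subspace
_, s, _ = np.linalg.svd(np.linalg.qr(W.T)[0].T @ U.T)
print(s)
```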
In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We …
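The training setup described here can be rendered directly: a two-layer ReLU network with a fixed second layer, random initialization, and full-batch gradient descent on the logistic loss. The width, step size, and toy data below are our assumptions, not the paper's.

```python
# Two-layer ReLU network, logistic loss, full-batch gradient descent from
# random initialization (second layer held fixed for simplicity).
import numpy as np

rng = np.random.default_rng(3)
d, m, n, lr, steps = 5, 64, 1000, 0.1, 500

X = rng.standard_normal((n, d))
y = np.sign(X[:, 0])                               # toy labels in {-1, +1}

W = rng.standard_normal((m, d)) / np.sqrt(d)       # first layer, random init
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)   # fixed second layer

for _ in range(steps):
    H = np.maximum(X @ W.T, 0.0)                   # ReLU features, (n, m)
    out = H @ a
    # logistic loss l(z) = log(1 + exp(-y z)); dl/dz = -y / (1 + exp(y z))
    g = -y / (1.0 + np.exp(y * out))
    mask = (X @ W.T > 0).astype(float)             # ReLU active-set indicator
    W -= lr * ((g[:, None] * mask * a).T @ X) / n

H = np.maximum(X @ W.T, 0.0)
print(np.mean(np.sign(H @ a) == y))                # training accuracy
```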
We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific …
Z Ji, J Li, M Telgarsky - Advances in Neural Information …, 2021 - proceedings.neurips.cc
This work studies the behavior of shallow ReLU networks trained with the logistic loss via gradient descent on binary classification data where the underlying data distribution is …
S Frei, Q Gu - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Although the optimization objectives for learning neural networks are highly non-convex, gradient-based methods have been wildly successful at learning neural networks in …
We study active learning methods for single index models of the form $F(\mathbf{x}) = f(\langle \mathbf{w}, \mathbf{x} \rangle)$, where $f:\mathbb{R}\to\mathbb{R}$ and $\mathbf{x}, \mathbf{w} \in \mathbb{R}^d$. In …
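A hedged sketch of why the single index structure helps with label efficiency: for Gaussian inputs, a Stein-type identity gives $\mathbb{E}[f(\langle \mathbf{w}, \mathbf{x} \rangle)\, \mathbf{x}] \propto \mathbf{w}$, so once a direction estimate $\hat{\mathbf{w}}$ is available, fitting $f$ is a one-dimensional problem that needs only a handful of queries along $\hat{\mathbf{w}}$. The averaging estimator and piecewise-linear interpolation below are illustrative choices, not the paper's method.

```python
# Illustrative two-step scheme for F(x) = f(<w, x>): estimate the direction
# by averaging y*x (Stein's lemma for Gaussian x), then query labels along
# the estimated direction and interpolate the 1-D link.
import numpy as np

rng = np.random.default_rng(4)
d, n_dir, n_query = 15, 4000, 25

w = rng.standard_normal(d); w /= np.linalg.norm(w)
f = lambda t: np.tanh(2 * t)                       # unknown monotone link

# Step 1: direction estimate via E[y x] (parallel to w under Gaussian x)
X = rng.standard_normal((n_dir, d))
w_hat = (f(X @ w)[:, None] * X).mean(axis=0)
w_hat /= np.linalg.norm(w_hat)

# Step 2: actively query the label oracle at x = t * w_hat for a 1-D grid
# of locations t, then interpolate f along w_hat
t_grid = np.linspace(-3, 3, n_query)
y_grid = f(t_grid * (w_hat @ w))                   # oracle answers at t*w_hat

x_test = rng.standard_normal(d)
pred = np.interp(x_test @ w_hat, t_grid, y_grid)
print(pred, f(x_test @ w))                         # sketch-level comparison
```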