Hidden progress in deep learning: SGD learns parities near the computational limit

B Barak, B Edelman, S Goel… - Advances in …, 2022 - proceedings.neurips.cc
There is mounting evidence of emergent phenomena in the capabilities of deep learning
methods as we scale up datasets, model sizes, and training times. While there are some …
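
For context, the $k$-sparse parity problem named in the title has a standard formulation; the symbols $S$, $\chi_S$, and $k$ below are illustrative notation, not drawn from the snippet:

$$x \sim \mathrm{Unif}(\{\pm 1\}^n), \qquad y = \chi_S(x) = \prod_{i \in S} x_i, \qquad S \subseteq [n],\ |S| = k.$$

Statistical-query arguments suggest a compute cost of roughly $n^{\Theta(k)}$ for this problem, which is plausibly the "computational limit" the title alludes to.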

Gradient-based feature learning under structured data

A Mousavi-Hosseini, D Wu, T Suzuki… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent works have demonstrated that the sample complexity of gradient-based learning of
single index models, i.e., functions that depend on a 1-dimensional projection of the input …
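
For reference, a single index model in this sense takes the form below; the symbols $f^*$, $\sigma$, and $\mathbf{w}^*$ are illustrative notation rather than the paper's:

$$f^*(\mathbf{x}) = \sigma(\langle \mathbf{w}^*, \mathbf{x} \rangle), \qquad \mathbf{w}^* \in \mathbb{R}^d, \quad \sigma : \mathbb{R} \to \mathbb{R},$$

so the target depends on $\mathbf{x}$ only through the 1-dimensional projection $\langle \mathbf{w}^*, \mathbf{x} \rangle$.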

A theoretical analysis on feature learning in neural networks: Emergence from inputs and advantage over fixed features

Z Shi, J Wei, Y Liang - arXiv preprint arXiv:2206.01717, 2022 - arxiv.org
An important characteristic of neural networks is their ability to learn representations of the
input data with effective features for prediction, which is believed to be a key factor to their …

Near-optimal cryptographic hardness of agnostically learning halfspaces and ReLU regression under Gaussian marginals

I Diakonikolas, D Kane, L Ren - International Conference on …, 2023 - proceedings.mlr.press
We study the task of agnostically learning halfspaces under the Gaussian distribution.
Specifically, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on …
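
In the agnostic setting the labels need not match any halfspace, and the learner competes with the best one; written out (with $\mathcal{H}$, $\mathrm{opt}$, and $\epsilon$ as assumed notation):

$$\mathrm{opt} = \min_{h \in \mathcal{H}} \Pr_{(\mathbf{x}, y)}\left[h(\mathbf{x}) \neq y\right], \qquad \mathcal{H} = \left\{\mathbf{x} \mapsto \mathrm{sign}(\langle \mathbf{w}, \mathbf{x} \rangle + b)\right\},$$

and the goal is a hypothesis $\hat{h}$ with error at most $\mathrm{opt} + \epsilon$.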

On learning Gaussian multi-index models with gradient flow

A Bietti, J Bruna, L Pillaud-Vivien - arXiv preprint arXiv:2310.19793, 2023 - arxiv.org
We study gradient flow on the multi-index regression problem for high-dimensional
Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear …
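
A multi-index function in this sense generalizes the single index models above; a standard way to write the class (notation assumed, not the paper's):

$$F(\mathbf{x}) = g(U\mathbf{x}), \qquad U \in \mathbb{R}^{r \times d},\ r \ll d, \quad g : \mathbb{R}^r \to \mathbb{R},$$

so the target depends on $\mathbf{x}$ only through an $r$-dimensional linear projection, the unknown low-rank linear map mentioned in the snippet.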

Random feature amplification: Feature learning and generalization in neural networks

S Frei, NS Chatterji, PL Bartlett - Journal of Machine Learning Research, 2023 - jmlr.org
In this work, we provide a characterization of the feature-learning process in two-layer ReLU
networks trained by gradient descent on the logistic loss following random initialization. We …
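
The training objective described in the snippet can be written out generically; the width $m$, the outer weights $a_j$, and the other symbols below are assumptions for illustration:

$$f(\mathbf{x}; W) = \sum_{j=1}^{m} a_j\, \mathrm{ReLU}(\langle \mathbf{w}_j, \mathbf{x} \rangle), \qquad \widehat{L}(W) = \frac{1}{n} \sum_{i=1}^{n} \log\left(1 + e^{-y_i f(\mathbf{x}_i; W)}\right),$$

with gradient descent run on $\widehat{L}$ from a random initialization of the $\mathbf{w}_j$.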

Statistical-query lower bounds via functional gradients

S Goel, A Gollakota, A Klivans - Advances in Neural …, 2020 - proceedings.neurips.cc
We give the first statistical-query lower bounds for agnostically learning any non-polynomial
activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific …
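
As background, a statistical-query learner sees the distribution only through noisy expectations: a query $\phi$ is answered by any value $v$ satisfying (with tolerance $\tau$ as standard notation)

$$\left| v - \mathbb{E}_{(\mathbf{x}, y)}\left[\phi(\mathbf{x}, y)\right] \right| \le \tau,$$

and SQ lower bounds count how many such queries any learner must make.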

Early-stopped neural networks are consistent

Z Ji, J Li, M Telgarsky - Advances in Neural Information …, 2021 - proceedings.neurips.cc
This work studies the behavior of shallow ReLU networks trained with the logistic loss via
gradient descent on binary classification data where the underlying data distribution is …
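
Consistency here plausibly carries its usual statistical meaning: the risk of the early-stopped network $\hat{f}_n$ trained on $n$ samples approaches the Bayes risk (notation assumed),

$$\Pr\left[\mathrm{sign}(\hat{f}_n(\mathbf{x})) \neq y\right] \;\longrightarrow\; \inf_{f} \Pr\left[\mathrm{sign}(f(\mathbf{x})) \neq y\right] \quad (n \to \infty),$$

where the infimum runs over all measurable $f$.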

Proxy convexity: A unified framework for the analysis of neural networks trained by gradient descent

S Frei, Q Gu - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Although the optimization objectives for learning neural networks are highly non-convex,
gradient-based methods have been wildly successful at learning neural networks in …

Agnostic active learning of single index models with linear sample complexity

A Gajjar, WM Tai, X Xu, C Hegde… - The Thirty Seventh …, 2024 - proceedings.mlr.press
We study active learning methods for single index models of the form $F(\bm{x}) = f(\langle \bm{w}, \bm{x} \rangle)$, where $f:\mathbb{R}\to\mathbb{R}$ and $\bm{x}, \bm{w} \in \mathbb{R}^d$. In …