We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer neural network: $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma …
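A minimal NumPy sketch of this setup, assuming the standard completion of the truncated formula as $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\boldsymbol{W}\boldsymbol{x})$ with $\sigma=\mathrm{ReLU}$ and squared loss; the data model, target, and step size are illustrative placeholders, not the paper's protocol:

```python
import numpy as np

# Hedged sketch: one full-batch gradient step on the first-layer weights W
# of f(x) = a^T sigma(W x) / sqrt(N), assuming sigma = ReLU and squared loss.
# All hyperparameters and the stand-in target below are illustrative.
rng = np.random.default_rng(0)
d, N, n = 20, 50, 200                 # input dim, width, sample size (placeholders)
X = rng.standard_normal((n, d))       # isotropic Gaussian inputs (an assumption)
y = np.tanh(X[:, 0])                  # stand-in target; the true target is unspecified here

W = rng.standard_normal((N, d)) / np.sqrt(d)
a = rng.standard_normal(N)

def f(X, W, a):
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(N)   # sigma = ReLU

# Gradient of the empirical squared loss (1/2n) * sum_i (f(x_i) - y_i)^2 w.r.t. W.
pre = X @ W.T                          # (n, N) pre-activations
err = f(X, W, a) - y                   # (n,) residuals
grad_W = ((pre > 0) * np.outer(err, a)).T @ X / (n * np.sqrt(N))

eta = 1.0                              # step size: a placeholder
W_one_step = W - eta * grad_W          # first-layer weights after one gradient step
```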
J Ba, MA Erdogdu, T Suzuki… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the learning of a single-index target function $f_*:\mathbb{R}^d\to\mathbb{R}$ under spiked covariance data: $$f_*(\boldsymbol{x})=\textstyle\sigma_*(\frac{1}{\sqrt …
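A hedged sketch of one such data model, assuming the common form $\boldsymbol{x}\sim\mathcal{N}(0,\,\boldsymbol{I}_d+\lambda\,\boldsymbol{u}\boldsymbol{u}^\top)$ with the target's index aligned to the spike; the spike strength, direction, and link $\sigma_*$ are illustrative choices, not the paper's:

```python
import numpy as np

# Hedged sketch of a spiked-covariance single-index model:
# x ~ N(0, I_d + lam * u u^T) and f_*(x) = sigma_*(<u, x> / sqrt(d)).
# Spike strength, direction, and the link sigma_* are placeholders.
rng = np.random.default_rng(1)
d, n, lam = 100, 500, 5.0
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                 # unit spike direction (an assumption)

Z = rng.standard_normal((n, d))
g = rng.standard_normal(n)
X = Z + np.sqrt(lam) * np.outer(g, u)  # covariance is exactly I_d + lam * u u^T

sigma_star = np.tanh                   # placeholder link function
y = sigma_star(X @ u / np.sqrt(d))     # single-index labels
```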
S Mei, A Montanari - Communications on Pure and Applied …, 2022 - Wiley Online Library
Deep learning methods operate in regimes that defy the traditional statistical mindset. Neural network architectures often contain more parameters than training samples, and are …
Interpolators—estimators that achieve zero training error—have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of …
Teacher-student models provide a framework in which the typical-case performance of high- dimensional supervised learning can be described in closed form. The assumptions of …
I Akjouj, M Barbier, M Clenet… - … of the Royal …, 2024 - royalsocietypublishing.org
Ecosystems represent archetypal complex dynamical systems, often modelled by coupled differential equations of the form $\frac{dx_i}{dt} = x_i\,\phi_i(x_1,\dots,x_N)$, where $N$ represents the number of …
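A minimal sketch of integrating dynamics of this form, assuming the common generalised Lotka-Volterra choice $\phi_i(x) = r_i - x_i + (\boldsymbol{A}\boldsymbol{x})_i$ with a random interaction matrix; none of the parameters below are taken from the cited article:

```python
import numpy as np

# Hedged sketch of dx_i/dt = x_i * phi_i(x_1, ..., x_N), with the common
# generalised Lotka-Volterra choice phi_i(x) = r_i - x_i + (A x)_i and a
# random interaction matrix A, integrated with explicit Euler steps.
rng = np.random.default_rng(2)
N = 50                                        # number of species (placeholder)
r = np.ones(N)                                # intrinsic growth rates (placeholder)
A = rng.standard_normal((N, N)) / np.sqrt(N)  # random interactions (an assumption)
np.fill_diagonal(A, 0.0)                      # self-regulation handled by the -x_i term

x = rng.uniform(0.5, 1.5, size=N)             # initial abundances
dt, steps = 1e-3, 20_000
for _ in range(steps):
    phi = r - x + A @ x                       # per-capita growth rates phi_i(x)
    x = np.maximum(x + dt * x * phi, 0.0)     # Euler step; clip at extinction
```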
H Hu, YM Lu - IEEE Transactions on Information Theory, 2022 - ieeexplore.ieee.org
We prove a universality theorem for learning with random features. Our result shows that, in terms of training and generalization errors, a random feature model with a nonlinear …
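A hedged numerical illustration of the kind of equivalence such a theorem formalises: ridge regression on nonlinear random features $\sigma(\boldsymbol{F}\boldsymbol{x}/\sqrt{d})$ compared with the matched linear-plus-noise ("Gaussian equivalent") features. The dimensions, the choice $\sigma=\tanh$, the linear teacher, and the ridge penalty are all assumptions for illustration:

```python
import numpy as np

# Hedged sketch: test error of ridge regression on random features
# sigma(F x / sqrt(d)) versus on the Gaussian-equivalent surrogate
# mu1 * F x / sqrt(d) + mu_* z. All model choices are illustrative.
rng = np.random.default_rng(3)
d, N, n, lam = 100, 150, 300, 0.1
F = rng.standard_normal((N, d))
beta = rng.standard_normal(d) / np.sqrt(d)

def data(m):
    X = rng.standard_normal((m, d))
    return X, X @ beta                 # linear teacher (a placeholder)

# Matching constants of sigma = tanh under N(0,1), estimated by Monte Carlo.
g = rng.standard_normal(1_000_000)
mu1 = np.mean(g * np.tanh(g))
mu_star = np.sqrt(np.mean(np.tanh(g) ** 2) - mu1 ** 2)

def ridge_test_error(features):
    X, y = data(n)
    Phi = features(X)
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y)
    Xt, yt = data(2_000)
    return np.mean((features(Xt) @ w - yt) ** 2)

nonlinear = lambda X: np.tanh(X @ F.T / np.sqrt(d))
equivalent = lambda X: (mu1 * (X @ F.T) / np.sqrt(d)
                        + mu_star * rng.standard_normal((X.shape[0], N)))

# The two errors should be close as d, N, n grow proportionally.
print(ridge_test_error(nonlinear), ridge_test_error(equivalent))
```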
We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features …
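A short hedged sketch of the teacher-student data generation underlying such synthetic problems, assuming a one-dimensional Gaussian projection and two illustrative teacher links (identity for regression, sign for classification):

```python
import numpy as np

# Hedged sketch of teacher-student generalised linear data: labels come from
# a teacher link applied to a projection of Gaussian inputs. The teacher
# weights and both link choices below are illustrative assumptions.
rng = np.random.default_rng(4)
d, n = 200, 400
theta0 = rng.standard_normal(d)              # teacher weights (placeholder)
X = rng.standard_normal((n, d))
y_reg = X @ theta0 / np.sqrt(d)              # regression labels (identity link)
y_cls = np.sign(y_reg)                       # classification labels (sign link)
```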
Understanding the reasons for the success of deep neural networks trained using stochastic gradient-based methods is a key open problem for the nascent theory of deep learning. The …