Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures

JW Rocks, P Mehta - Physical review research, 2022 - APS

The bias-variance trade-off is a central concept in supervised learning. In classical statistics,
increasing the complexity of a model (eg, number of parameters) reduces bias but also …

被引用次数：64 相关文章所有 10 个版本

[PDF] arxiv.org

Learning through atypical phase transitions in overparameterized neural networks

C Baldassi, C Lauditi, EM Malatesta, R Pacelli… - Physical Review E, 2022 - APS

Current deep neural networks are highly overparameterized (up to billions of connection
weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient …

被引用次数：30 相关文章所有 7 个版本

[PDF] arxiv.org

Unveiling the structure of wide flat minima in neural networks

C Baldassi, C Lauditi, EM Malatesta, G Perugini… - Physical Review Letters, 2021 - APS

The success of deep learning has revealed the application potential of neural networks
across the sciences and opened up fundamental theoretical problems. In particular, the fact …

被引用次数：37 相关文章所有 9 个版本

[PDF] aps.org

Bias-variance decomposition of overparameterized regression with random linear features

JW Rocks, P Mehta - Physical Review E, 2022 - APS

In classical statistics, the bias-variance trade-off describes how varying a model's complexity
(eg, number of fit parameters) affects its ability to make accurate predictions. According to …

被引用次数：11 相关文章所有 8 个版本

[PDF] arxiv.org

Typical and atypical solutions in nonconvex neural networks with discrete and continuous weights

C Baldassi, EM Malatesta, G Perugini, R Zecchina - Physical Review E, 2023 - APS

We study the binary and continuous negative-margin perceptrons as simple nonconvex
neural network models learning random rules and associations. We analyze the geometry of …

被引用次数：8 相关文章所有 6 个版本

[PDF] neurips.cc

Classification of heavy-tailed features in high dimensions: a superstatistical approach

U Adomaityte, G Sicuro, P Vivo - Advances in Neural …, 2024 - proceedings.neurips.cc

We characterise the learning of a mixture of two clouds of data points with generic centroids
via empirical risk minimisation in the high dimensional regime, under the assumptions of …

被引用次数：3 相关文章所有 6 个版本

[PDF] arxiv.org

High-dimensional manifold of solutions in neural networks: insights from statistical physics

EM Malatesta - arXiv preprint arXiv:2309.09240, 2023 - arxiv.org

In these pedagogic notes I review the statistical mechanics approach to neural networks,
focusing on the paradigmatic example of the perceptron architecture with binary an …

被引用次数：4 相关文章所有 2 个版本

[HTML] iop.org Full View

[HTML][HTML] The twin peaks of learning neural networks

E Demyanenko, C Feinauer… - Machine …, 2024 - pubishingsupport.iopscience.iop.org

Recent works demonstrated the existence of a double-descent phenomenon for the
generalization error of neural networks, where highly overparameterized models escape …

Solvable model for the linear separability of structured data

M Gherardi - Entropy, 2021 - mdpi.com

Linear separability, a core concept in supervised machine learning, refers to whether the
labels of a data set can be captured by the simplest possible machine: a linear classifier. In …

被引用次数：8 相关文章所有 9 个版本

[PDF] arxiv.org

Star-shaped space of solutions of the spherical negative perceptron

BL Annesi, C Lauditi, C Lucibello, EM Malatesta… - Physical Review Letters, 2023 - APS

Empirical studies on the landscape of neural networks have shown that low-energy
configurations are often found in complex connected structures, where zero-energy paths …

被引用次数：15 相关文章所有 8 个版本

高级搜索

QQ 群