High-dimensional dynamics of generalization error in neural networks

MS Advani, AM Saxe, H Sompolinsky - Neural Networks, 2020 - Elsevier
We perform an analysis of the average generalization dynamics of large neural networks
trained using gradient descent. We study the practically-relevant “high-dimensional” regime …
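
The snippet refers to the time course of generalization error under gradient descent. As a rough illustration (not the paper's model or parameters), the numpy sketch below trains a noisy high-dimensional linear regression by full-batch gradient descent and prints train and test error over time; with the dimension close to the sample size, the test error can eventually rise again, which is broadly the kind of dynamics such analyses study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy linear regression y = X w* + noise, with d close to n (illustrative sizes).
n, d, noise = 100, 80, 0.5
w_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ w_star + noise * rng.standard_normal(n)

# Large held-out set approximates the population (generalization) error.
X_test = rng.standard_normal((5000, d))
y_test = X_test @ w_star

w = np.zeros(d)
lr, steps = 0.01, 2000
for t in range(steps + 1):
    if t % 200 == 0:
        train_err = np.mean((X @ w - y) ** 2)
        test_err = np.mean((X_test @ w - y_test) ** 2)
        print(f"t={t:5d}  train MSE {train_err:.3f}  test MSE {test_err:.3f}")
    grad = X.T @ (X @ w - y) / n   # full-batch gradient of the squared loss
    w -= lr * grad
```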

How initial conditions affect generalization performance in large networks

A Atiya, C Ji - IEEE Transactions on Neural Networks, 1997 - ieeexplore.ieee.org
Generalization is one of the most important problems in neural-network research. It is
influenced by several factors in the network design, such as network size, weight decay …

Generalization error of generalized linear models in high dimensions

M Emami, M Sahraee-Ardakan… - International …, 2020 - proceedings.mlr.press
At the heart of machine learning lies the question of generalizability of learned rules over
previously unseen data. While over-parameterized models based on neural networks are …

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup

S Goldt, M Advani, AM Saxe… - Advances in neural …, 2019 - proceedings.neurips.cc
Deep neural networks achieve stellar generalisation even when they have enough
parameters to easily fit all their training data. We study this phenomenon by analysing the …
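
A minimal numpy sketch of a teacher-student setup in the spirit of this abstract (the layer sizes, tanh activation, learning rate, and 1/d update scaling below are illustrative assumptions, not the paper's): a fixed random two-layer teacher generates labels, a two-layer student is trained by online SGD, and the generalization error is estimated on held-out inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

d, m_teacher, m_student = 100, 2, 4   # input dim and hidden widths (illustrative)
lr, steps = 0.1, 20_000

# Fixed random teacher network.
W_t = rng.standard_normal((m_teacher, d)) / np.sqrt(d)
v_t = np.ones(m_teacher)

def forward(W, v, x):
    return v @ np.tanh(W @ x)

# Student initialized at random.
W_s = rng.standard_normal((m_student, d)) / np.sqrt(d)
v_s = rng.standard_normal(m_student) / np.sqrt(m_student)

# Held-out inputs to estimate the generalization error.
X_test = rng.standard_normal((2000, d))
y_test = np.array([forward(W_t, v_t, x) for x in X_test])

for step in range(steps):
    x = rng.standard_normal(d)            # online SGD: a fresh sample each step
    y = forward(W_t, v_t, x)
    h = np.tanh(W_s @ x)
    err = v_s @ h - y
    # Gradients of the squared loss 0.5 * err**2
    grad_v = err * h
    grad_W = err * np.outer(v_s * (1 - h ** 2), x)
    v_s -= lr * grad_v
    W_s -= (lr / d) * grad_W              # 1/d scaling, a common convention here
    if step % 5000 == 0:
        preds = np.tanh(X_test @ W_s.T) @ v_s
        eg = 0.5 * np.mean((preds - y_test) ** 2)
        print(f"step {step:6d}  generalization error ≈ {eg:.4f}")
```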

Wide neural networks of any depth evolve as linear models under gradient descent

J Lee, L Xiao, S Schoenholz, Y Bahri… - Advances in neural …, 2019 - proceedings.neurips.cc
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …
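
The claim is that a sufficiently wide network trained by gradient descent stays close to its first-order Taylor expansion around initialization, $f_{\mathrm{lin}}(x;\theta) = f(x;\theta_0) + \nabla_\theta f(x;\theta_0)\cdot(\theta-\theta_0)$. A small JAX sketch of that linearized model (the MLP architecture and widths are arbitrary choices for illustration, not the paper's):

```python
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    """Random MLP parameters (wider layers make the linear approximation closer)."""
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (din, dout)) / jnp.sqrt(din),
                       jnp.zeros(dout)))
    return params

def mlp(params, x):
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return (x @ W + b).squeeze(-1)

def linearize(params0):
    """First-order Taylor expansion of the network around its initialization."""
    def f_lin(params, x):
        delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        f0, jvp = jax.jvp(lambda p: mlp(p, x), (params0,), (delta,))
        return f0 + jvp
    return f_lin

key = jax.random.PRNGKey(0)
params0 = init_mlp(key, [10, 512, 512, 1])
x = jax.random.normal(jax.random.PRNGKey(1), (5, 10))

f_lin = linearize(params0)
# At initialization the two models agree exactly; the result in the paper is that
# for wide enough layers they remain close throughout gradient-descent training.
print(mlp(params0, x))
print(f_lin(params0, x))
```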

Generalization in deep networks: The role of distance from initialization

V Nagarajan, JZ Kolter - arXiv preprint arXiv:1901.01672, 2019 - arxiv.org
Why does training deep neural networks using stochastic gradient descent (SGD) result in a
generalization error that does not worsen with the number of parameters in the network? To …
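
As a loose illustration of the quantity in the title (not the authors' experiments), the sketch below trains a small two-layer network with SGD on a toy regression task and reports the Euclidean distance of the weights from their initialization alongside the training error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer net on a toy regression task (all sizes illustrative).
d, h, n = 20, 200, 256
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0])

W1 = rng.standard_normal((d, h)) / np.sqrt(d)
W2 = rng.standard_normal((h, 1)) / np.sqrt(h)
W1_0, W2_0 = W1.copy(), W2.copy()

lr, epochs, batch = 0.05, 50, 32
for epoch in range(epochs):
    for i in range(0, n, batch):
        xb, yb = X[i:i + batch], y[i:i + batch]
        a = np.tanh(xb @ W1)                     # hidden activations
        pred = (a @ W2).ravel()
        err = (pred - yb)[:, None]               # shape (batch, 1)
        grad_W2 = a.T @ err / len(xb)
        grad_a = err @ W2.T
        grad_W1 = xb.T @ (grad_a * (1 - a ** 2)) / len(xb)
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    if epoch % 10 == 0 or epoch == epochs - 1:
        dist = np.sqrt(np.sum((W1 - W1_0) ** 2) + np.sum((W2 - W2_0) ** 2))
        mse = np.mean(((np.tanh(X @ W1) @ W2).ravel() - y) ** 2)
        print(f"epoch {epoch:3d}  train MSE {mse:.4f}  ||theta - theta_0|| = {dist:.3f}")
```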

Rethinking bias-variance trade-off for generalization of neural networks

Z Yang, Y Yu, C You, J Steinhardt… - … on Machine Learning, 2020 - proceedings.mlr.press
The classical bias-variance trade-off predicts that bias decreases and variance increases with
model complexity, leading to a U-shaped risk curve. Recent work calls this into question for …
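
For reference, the classical decomposition can be estimated empirically by averaging a model's predictions over many resampled training sets: bias² is the squared gap between the average prediction and the target, and variance is the spread of predictions around that average. A small numpy sketch with a polynomial regressor (the target function, noise level, and degrees are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)

x_grid = np.linspace(0, 1, 200)
n_train, n_repeats, noise = 30, 200, 0.3

for degree in (1, 3, 9, 15):                      # degree controls model complexity
    preds = np.empty((n_repeats, x_grid.size))
    for r in range(n_repeats):
        x = rng.uniform(0, 1, n_train)
        y = target(x) + noise * rng.standard_normal(n_train)
        coefs = np.polyfit(x, y, degree)          # least-squares polynomial fit
        preds[r] = np.polyval(coefs, x_grid)
    mean_pred = preds.mean(axis=0)
    bias2 = np.mean((mean_pred - target(x_grid)) ** 2)
    variance = np.mean(preds.var(axis=0))
    print(f"degree {degree:2d}  bias^2 ≈ {bias2:.3f}  variance ≈ {variance:.3f}")
```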

Generalization of two-layer neural networks: An asymptotic viewpoint

J Ba, M Erdogdu, T Suzuki, D Wu… - … conference on learning …, 2020 - openreview.net
This paper investigates the generalization properties of two-layer neural networks in high
dimensions, i.e., when the number of samples $n$, features $d$, and neurons $h$ tend to …

What can linearized neural networks actually say about generalization?

G Ortiz-Jiménez… - Advances in Neural …, 2021 - proceedings.neurips.cc
For certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully
characterizes generalization, but for the networks used in practice, the empirical NTK only …

Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks

A Canatar, B Bordelon, C Pehlevan - Nature Communications, 2021 - nature.com
A theoretical understanding of generalization remains an open problem for many machine
learning models, including deep networks where overparameterization leads to better …
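
One way to make "task-model alignment" concrete (a rough sketch, not the paper's exact formalism) is to eigendecompose a kernel Gram matrix and ask how much of the target function's power falls on the top kernel eigenmodes; fast saturation of this cumulative power indicates good alignment. The kernel, data, and target below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, length_scale = 500, 5, 1.0
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]          # illustrative target function

# RBF (Gaussian) kernel Gram matrix.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / (2 * length_scale ** 2))

# Kernel eigenmodes, sorted from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(K)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the target onto the eigenvectors: alignment is reflected in how
# quickly the cumulative power of these projections saturates.
proj = eigvecs.T @ y
cum_power = np.cumsum(proj ** 2) / np.sum(proj ** 2)

for k in (1, 5, 20, 100):
    print(f"top {k:3d} modes capture {cum_power[k - 1]:.2%} of the target power")
```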