The neural covariance SDE: Shaped infinite depth-and-width networks at initialization

M Li, M Nica, D Roy - Advances in Neural Information …, 2022 - proceedings.neurips.cc
The logit outputs of a feedforward neural network at initialization are conditionally Gaussian,
given a random covariance matrix defined by the penultimate layer. In this work, we study …
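A minimal worked form of the conditional-Gaussian statement in this snippet, under assumed notation (i.i.d. Gaussian readout weights W acting on penultimate features h; none of these symbols are taken from the paper itself):

```latex
% Assumed setup: readout z(x) = W h(x) / \sqrt{n}, with W_{ij} \sim \mathcal{N}(0,1) i.i.d.
% and penultimate-layer features h(x) \in \mathbb{R}^n. Conditioned on h, each logit
% coordinate is a centered Gaussian over inputs, with covariance given by that layer:
\[
  \bigl(z_k(x_1), \dots, z_k(x_m)\bigr) \mid h \;\sim\; \mathcal{N}(0, \Sigma),
  \qquad
  \Sigma_{ab} = \tfrac{1}{n}\, h(x_a)^{\top} h(x_b).
\]
% The randomness of the network at initialization is thus carried by the random matrix \Sigma.
```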

Neural tangent kernel beyond the infinite-width limit: Effects of depth and initialization

M Seleznova, G Kutyniok - International Conference on …, 2022 - proceedings.mlr.press
The Neural Tangent Kernel (NTK) is widely used to analyze overparametrized neural
networks due to the famous result by Jacot et al. (2018): in the infinite-width limit, the NTK is …
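For concreteness, a hedged sketch of the empirical NTK the result refers to, Theta(x, x') = <grad_theta f(x), grad_theta f(x')>, computed for a toy two-layer network (the architecture and function names here are illustrative, not the paper's code):

```python
# Empirical NTK sketch: inner product of parameter gradients at two inputs.
import jax
import jax.numpy as jnp

def init_params(key, d_in=3, width=64):
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (width, d_in)) / jnp.sqrt(d_in),
        "W2": jax.random.normal(k2, (1, width)) / jnp.sqrt(width),
    }

def f(params, x):
    # Two-layer MLP with a scalar output (illustrative architecture).
    return (params["W2"] @ jnp.tanh(params["W1"] @ x))[0]

def empirical_ntk(params, x1, x2):
    g1 = jax.grad(f)(params, x1)
    g2 = jax.grad(f)(params, x2)
    # Sum the gradient inner products over all parameter tensors.
    return sum(jnp.vdot(g1[k], g2[k]) for k in g1)

key = jax.random.PRNGKey(0)
params = init_params(key)
x1, x2 = jnp.ones(3), jnp.arange(3.0)
print(empirical_ntk(params, x1, x2))
```

Under the infinite-width result cited in the snippet, this empirical kernel concentrates around a deterministic limit at initialization and stays approximately constant during training; the paper studies how depth and initialization change that picture at finite width.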

Width and depth limits commute in residual networks

S Hayou, G Yang - International Conference on Machine …, 2023 - proceedings.mlr.press
We show that taking the width and depth to infinity in a deep neural network with skip
connections, when branches are scaled by $1/\sqrt{\text{depth}}$, results in the same covariance …
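A minimal sketch of the $1/\sqrt{\text{depth}}$ branch scaling the snippet refers to (the specific architecture, activation, and names below are assumptions for illustration, not the paper's exact setup):

```python
# Residual forward pass with 1/sqrt(depth)-scaled branches and Gaussian weights:
#   x_{l+1} = x_l + (1/sqrt(L)) * W_l relu(x_l)
import jax
import jax.numpy as jnp

def resnet_forward(key, x, depth=64):
    width = x.shape[0]
    for _ in range(depth):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (width, width)) / jnp.sqrt(width)
        # Downscaling the branch by 1/sqrt(depth) keeps the covariance of the
        # pre-activations O(1) as width and depth grow together.
        x = x + W @ jnp.maximum(x, 0.0) / jnp.sqrt(depth)
    return x

key = jax.random.PRNGKey(0)
x0 = jax.random.normal(key, (256,))
print(jnp.mean(resnet_forward(key, x0) ** 2))  # remains O(1) under this scaling
```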

The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization

M Li, M Nica, D Roy - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Theoretical results show that neural networks can be approximated by Gaussian processes
in the infinite-width limit. However, for fully connected networks, it has been previously …

Few-shot backdoor attacks via neural tangent kernels

J Hayase, S Oh - arXiv preprint arXiv:2210.05929, 2022 - arxiv.org
In a backdoor attack, an attacker injects corrupted examples into the training set. The goal of
the attacker is to cause the final trained model to predict the attacker's desired target label …

Neural (tangent kernel) collapse

M Seleznova, D Weitzner, R Giryes… - Advances in …, 2024 - proceedings.neurips.cc
This work bridges two important concepts: the Neural Tangent Kernel (NTK), which captures
the evolution of deep neural networks (DNNs) during training, and the Neural Collapse (NC) …

On the infinite-depth limit of finite-width neural networks

S Hayou - Transactions on Machine Learning Research, 2022 - openreview.net
In this paper, we study the infinite-depth limit of finite-width residual neural networks with
random Gaussian weights. With proper scaling, we show that by fixing the width and taking …

Stability and generalization analysis of gradient methods for shallow neural networks

Y Lei, R Jin, Y Ying - Advances in Neural Information …, 2022 - proceedings.neurips.cc
While significant theoretical progress has been achieved, the generalization
mystery of overparameterized neural networks remains largely elusive. In this paper, we …

Stability & generalisation of gradient descent for shallow neural networks without the neural tangent kernel

D Richards, I Kuzborskij - Advances in neural information …, 2021 - proceedings.neurips.cc
We revisit on-average algorithmic stability of Gradient Descent (GD) for training
overparameterised shallow neural networks and prove new generalisation and excess risk …

Efficient parametric approximations of neural network function space distance

N Dhawan, S Huang, J Bae… - … Conference on Machine …, 2023 - proceedings.mlr.press
It is often useful to compactly summarize important properties of model parameters and
training data so that they can be used later without storing and/or iterating over the entire …