The neural covariance SDE: Shaped infinite depth-and-width networks at initialization

M Li, M Nica, D Roy - Advances in Neural Information …, 2022 - proceedings.neurips.cc
The logit outputs of a feedforward neural network at initialization are conditionally Gaussian,
given a random covariance matrix defined by the penultimate layer. In this work, we study …
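A minimal worked form of the conditional-Gaussian statement in this snippet, under assumed notation (i.i.d. Gaussian readout weights W acting on penultimate features h; none of these symbols are taken from the paper itself):

```latex
% Assumed setup: readout z(x) = W h(x) / \sqrt{n}, with W_{ij} \sim \mathcal{N}(0,1) i.i.d.
% and penultimate-layer features h(x) \in \mathbb{R}^n. Conditioned on h, each logit
% coordinate is a centered Gaussian over inputs, with covariance given by that layer:
\[
  \bigl(z_k(x_1), \dots, z_k(x_m)\bigr) \mid h \;\sim\; \mathcal{N}(0, \Sigma),
  \qquad
  \Sigma_{ab} = \tfrac{1}{n}\, h(x_a)^{\top} h(x_b).
\]
% The randomness of the network at initialization is thus carried by the random matrix \Sigma.
```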

Neural tangent kernel beyond the infinite-width limit: Effects of depth and initialization

M Seleznova, G Kutyniok - International Conference on …, 2022 - proceedings.mlr.press
The Neural Tangent Kernel (NTK) is widely used to analyze overparametrized neural
networks due to the famous result by Jacot et al. (2018): in the infinite-width limit, the NTK is …
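For concreteness, a hedged sketch of the empirical NTK the result refers to, Theta(x, x') = <grad_theta f(x), grad_theta f(x')>, computed for a toy two-layer network (the architecture and function names here are illustrative, not the paper's code):

```python
# Empirical NTK sketch: inner product of parameter gradients at two inputs.
import jax
import jax.numpy as jnp

def init_params(key, d_in=3, width=64):
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (width, d_in)) / jnp.sqrt(d_in),
        "W2": jax.random.normal(k2, (1, width)) / jnp.sqrt(width),
    }

def f(params, x):
    # Two-layer MLP with a scalar output (illustrative architecture).
    return (params["W2"] @ jnp.tanh(params["W1"] @ x))[0]

def empirical_ntk(params, x1, x2):
    g1 = jax.grad(f)(params, x1)
    g2 = jax.grad(f)(params, x2)
    # Sum the gradient inner products over all parameter tensors.
    return sum(jnp.vdot(g1[k], g2[k]) for k in g1)

key = jax.random.PRNGKey(0)
params = init_params(key)
x1, x2 = jnp.ones(3), jnp.arange(3.0)
print(empirical_ntk(params, x1, x2))
```

Under the infinite-width result cited in the snippet, this empirical kernel concentrates around a deterministic limit at initialization and stays approximately constant during training; the paper studies how depth and initialization change that picture at finite width.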

Width and depth limits commute in residual networks

S Hayou, G Yang - International Conference on Machine …, 2023 - proceedings.mlr.press
We show that taking the width and depth to infinity in a deep neural network with skip
connections, when branches are scaled by $1/\sqrt{\text{depth}}$, results in the same covariance …
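A minimal sketch of the $1/\sqrt{\text{depth}}$ branch scaling the snippet refers to (the specific architecture, activation, and names below are assumptions for illustration, not the paper's exact setup):

```python
# Residual forward pass with 1/sqrt(depth)-scaled branches and Gaussian weights:
#   x_{l+1} = x_l + (1/sqrt(L)) * W_l relu(x_l)
import jax
import jax.numpy as jnp

def resnet_forward(key, x, depth=64):
    width = x.shape[0]
    for _ in range(depth):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (width, width)) / jnp.sqrt(width)
        # Downscaling the branch by 1/sqrt(depth) keeps the covariance of the
        # pre-activations O(1) as width and depth grow together.
        x = x + W @ jnp.maximum(x, 0.0) / jnp.sqrt(depth)
    return x

key = jax.random.PRNGKey(0)
x0 = jax.random.normal(key, (256,))
print(jnp.mean(resnet_forward(key, x0) ** 2))  # remains O(1) under this scaling
```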

The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization

M Li, M Nica, D Roy - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Theoretical results show that neural networks can be approximated by Gaussian processes
in the infinite-width limit. However, for fully connected networks, it has been previously …

Few-shot backdoor attacks via neural tangent kernels

J Hayase, S Oh - arXiv preprint arXiv:2210.05929, 2022 - arxiv.org
In a backdoor attack, an attacker injects corrupted examples into the training set. The goal of
the attacker is to cause the final trained model to predict the attacker's desired target label …

Neural (tangent kernel) collapse

M Seleznova, D Weitzner, R Giryes… - Advances in …, 2024 - proceedings.neurips.cc
This work bridges two important concepts: the Neural Tangent Kernel (NTK), which captures
the evolution of deep neural networks (DNNs) during training, and the Neural Collapse (NC) …

On the infinite-depth limit of finite-width neural networks

S Hayou - Transactions on Machine Learning Research, 2022 - openreview.net
In this paper, we study the infinite-depth limit of finite-width residual neural networks with
random Gaussian weights. With proper scaling, we show that by fixing the width and taking …

Stability and generalization analysis of gradient methods for shallow neural networks

Y Lei, R Jin, Y Ying - Advances in Neural Information …, 2022 - proceedings.neurips.cc
While significant theoretical progress has been achieved, the generalization
mystery of overparameterized neural networks remains largely elusive. In this paper, we …

Stability & generalisation of gradient descent for shallow neural networks without the neural tangent kernel

D Richards, I Kuzborskij - Advances in neural information …, 2021 - proceedings.neurips.cc
We revisit on-average algorithmic stability of Gradient Descent (GD) for training
overparameterised shallow neural networks and prove new generalisation and excess risk …

Efficient parametric approximations of neural network function space distance

N Dhawan, S Huang, J Bae… - … Conference on Machine …, 2023 - proceedings.mlr.press
It is often useful to compactly summarize important properties of model parameters and
training data so that they can be used later without storing and/or iterating over the entire …