We solve an open question from Lu et al. (2017) by showing that any target network with inputs in $\mathbb{R}^d$ can be approximated by a width $O(d)$ network (independent …
We study the memorization power of feedforward ReLU neural networks. We show that such networks can memorize any $N$ points that satisfy a mild separability assumption using …
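As a concrete illustration of this kind of memorization statement, here is a minimal NumPy sketch of the textbook construction: project the points onto a single direction and solve for the coefficients of a one-hidden-layer ReLU network that matches all labels exactly. It uses $N$ hidden neurons (papers of this type show far fewer suffice under the separability assumption); the direction u, the breakpoints b, and the exact-solve step are illustrative choices, not taken from the snippet's source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N points in R^d with arbitrary real labels.
N, d = 20, 5
X = rng.normal(size=(N, d))
y = rng.normal(size=N)

# Project onto a random direction; for generic data the N projections are
# distinct, which stands in for the separability assumption in the snippet.
u = rng.normal(size=d)
p = X @ u
order = np.argsort(p)
p_sorted = p[order]

# One-hidden-layer ReLU network f(x) = sum_j a_j * relu(u.x - b_j).
# Placing each bias just below the corresponding sorted projection makes the
# N x N feature matrix lower triangular with a positive diagonal, so the
# labels can be matched exactly by solving a linear system.
gap = np.min(np.diff(p_sorted))
b = p_sorted - 0.5 * gap
Phi = np.maximum(p_sorted[:, None] - b[None, :], 0.0)
a = np.linalg.solve(Phi, y[order])

def f(X_batch):
    z = X_batch @ u                       # scalar projection per input
    return np.maximum(z[:, None] - b, 0.0) @ a

print(np.max(np.abs(f(X) - y)))           # ~0 up to floating point: all N labels memorized
```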
A Zweig, J Bruna - Advances in Neural Information …, 2022 - proceedings.neurips.cc
In this work we demonstrate a novel separation between symmetric neural network architectures. Specifically, we consider the Relational Network~\parencite …
We study depth separation in infinite-width neural networks, where complexity is controlled by the overall squared $\ell_2$-norm of the weights (sum of squares of all weights in the …
L Gonon, R Graeber, A Jentzen - arXiv preprint arXiv:2301.08284, 2023 - arxiv.org
In this article we study high-dimensional approximation capacities of shallow and deep artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation. In particular …
We study the size of a neural network needed to approximate the maximum function over $d$ inputs, in the most basic setting of approximating with respect to the $L_2$ norm, for continuous …
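For context, the maximum of two numbers is exactly a small ReLU computation, $\max(a,b) = b + \mathrm{ReLU}(a-b)$, and a binary tree of such gadgets computes the maximum of $d$ inputs exactly with $O(d)$ neurons and depth roughly $\log_2 d$; the question in the snippet is what happens when depth or width is constrained. Below is a minimal sketch of the exact deep construction (an illustration only, not the approximation-theoretic analysis of the paper).

```python
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def relu_max2(a, b):
    # Exact identity: max(a, b) = b + relu(a - b); the copy of b can itself be
    # written with two ReLUs, b = relu(b) - relu(-b), so this is a genuine
    # ReLU-network computation.
    return (relu(b) - relu(-b)) + relu(a - b)

def relu_max(values):
    # Binary-tree reduction: exact max of d inputs using O(d) ReLU units and
    # roughly log2(d) layers (odd leftovers are passed through unchanged).
    vals = list(values)
    while len(vals) > 1:
        nxt = [relu_max2(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2 == 1:
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]

v = np.random.default_rng(1).normal(size=13)
print(relu_max(v), v.max())   # the two values agree
```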
S Wojtowytsch - Journal of Machine Learning Research, 2024 - jmlr.org
In this note, we study how neural networks with a single hidden layer and ReLU activation interpolate data drawn from a radially symmetric distribution with target labels 1 at the origin …
It is well-known that randomly initialized, push-forward, fully-connected neural networks weakly converge to isotropic Gaussian processes, in the limit where the width of all layers …
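A quick Monte Carlo check makes that Gaussian limit tangible: sample many independently initialized one-hidden-layer ReLU networks at a fixed input and watch the excess kurtosis of the output vanish as the width grows. This is an assumed toy setup (plain fully-connected network with $1/\sqrt{\text{width}}$ output scaling), not the push-forward architecture of the snippet.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net_outputs(x, width, n_samples):
    # Outputs of n_samples independently initialized one-hidden-layer ReLU
    # networks at a single input x, with the usual 1/sqrt(width) scaling.
    d = x.shape[0]
    W1 = rng.normal(size=(n_samples, width, d)) / np.sqrt(d)
    b1 = rng.normal(size=(n_samples, width))
    W2 = rng.normal(size=(n_samples, width)) / np.sqrt(width)
    h = np.maximum(np.einsum('swd,d->sw', W1, x) + b1, 0.0)
    return np.einsum('sw,sw->s', W2, h)

x = np.ones(3)
for width in (4, 64, 1024):
    out = random_relu_net_outputs(x, width, 20000)
    # Excess kurtosis -> 0 as width grows, consistent with a Gaussian limit.
    k = np.mean((out - out.mean()) ** 4) / np.var(out) ** 2 - 3.0
    print(width, round(k, 3))
```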
FL Fan, ZY Li, H Xiong, T Zeng - arXiv preprint arXiv:2305.07037, 2023 - arxiv.org
Depth separation theory is nowadays widely accepted as an effective explanation for the power of depth. It consists of two parts: i) there exists a function representable by a …
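Part i) of that two-part schema is usually instantiated by the sawtooth example: composing a width-2 ReLU "tent" map $k$ times yields a function with $2^{k-1}$ teeth that a depth-$k$ network represents trivially, while a shallow network needs exponentially many linear pieces to track it. The sketch below is Telgarsky's classic construction, included purely as an assumed illustration of the schema, not necessarily the construction used in the snippet's source.

```python
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def tent(t):
    # One ReLU layer computing the tent map on [0, 1]:
    #   tent(t) = 2t on [0, 1/2],  2 - 2t on [1/2, 1]
    return 2.0 * relu(t) - 4.0 * relu(t - 0.5)

def sawtooth(t, k):
    # Composing the tent map k times gives 2^(k-1) triangular teeth
    # (2^k linear pieces): easy for a depth-k, width-2 ReLU network,
    # hard for any shallow one.
    for _ in range(k):
        t = tent(t)
    return t

xs = np.linspace(0.0, 1.0, 9)
print(sawtooth(xs, 3))   # alternates 0, 1, 0, 1, ... across the grid
```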