Optimization-based separations for neural networks

I Safran, J Lee - Conference on Learning Theory, 2022 - proceedings.mlr.press
Depth separation results propose a possible theoretical explanation for the benefits of deep
neural networks over shallower architectures, establishing that the former possess superior …

Width is less important than depth in ReLU neural networks

G Vardi, G Yehudai, O Shamir - Conference on learning …, 2022 - proceedings.mlr.press
We solve an open question from Lu et al. (2017), by showing that any target network with
inputs in $\mathbb{R}^d$ can be approximated by a width $O(d)$ network (independent …

On the optimal memorization power of ReLU neural networks

G Vardi, G Yehudai, O Shamir - arXiv preprint arXiv:2110.03187, 2021 - arxiv.org
We study the memorization power of feedforward ReLU neural networks. We show that such
networks can memorize any $N$ points that satisfy a mild separability assumption using …
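
As a point of reference for the statement above, here is a baseline sketch only, not the construction analyzed in the paper: $N$ points whose projections onto a random direction are distinct can always be memorized exactly by a depth-2 network with at most $N$ hidden ReLU units, via one-dimensional piecewise-linear interpolation. All names and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 20, 5
X = rng.normal(size=(N, d))            # N points in R^d
y = rng.normal(size=N)                 # arbitrary real labels

# Project onto a random direction; with probability 1 the projections are distinct,
# which plays the role of the separability assumption here.
w = rng.normal(size=d)
t = X @ w
order = np.argsort(t)
t_s, y_s = t[order], y[order]

relu = lambda z: np.maximum(z, 0.0)

# Piecewise-linear interpolant of (t_s, y_s): an affine term plus one ReLU unit per
# interior breakpoint, i.e. at most N hidden neurons in a depth-2 network.
slopes = np.diff(y_s) / np.diff(t_s)

def f(z):
    out = y_s[0] + slopes[0] * (z - t_s[0])
    for k in range(1, N - 1):
        out = out + (slopes[k] - slopes[k - 1]) * relu(z - t_s[k])
    return out

assert np.allclose(f(t), y)            # every label is reproduced exactly
```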

Exponential separations in symmetric neural networks

A Zweig, J Bruna - Advances in Neural Information …, 2022 - proceedings.neurips.cc
In this work we demonstrate a novel separation between symmetric neural network
architectures. Specifically, we consider the Relational Network …

Depth Separation in Norm-Bounded Infinite-Width Neural Networks

S Parkinson, G Ongie, R Willett, O Shamir… - arXiv preprint arXiv …, 2024 - arxiv.org
We study depth separation in infinite-width neural networks, where complexity is controlled
by the overall squared $\ell_2$-norm of the weights (sum of squares of all weights in the …
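
For context on this complexity measure, a two-layer, bias-free sketch of the standard rescaling identity (an assumed simplification; the paper itself works with deeper architectures): by the positive homogeneity of the ReLU, minimizing the squared $\ell_2$-norm over rescalings of each unit collapses it to a product-of-norms cost.

```latex
% For f(x) = \sum_i a_i\,\mathrm{ReLU}(w_i^\top x), the rescaling
% (a_i, w_i) \mapsto (\lambda_i a_i, w_i/\lambda_i) with \lambda_i > 0 leaves f unchanged,
% and by AM-GM the smallest attainable squared \ell_2-norm of the weights is
\min_{\lambda_i > 0}\; \frac{1}{2}\sum_i \Big( \lambda_i^2 a_i^2 + \frac{\|w_i\|_2^2}{\lambda_i^2} \Big)
  \;=\; \sum_i |a_i|\, \|w_i\|_2 .
```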

The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality

L Gonon, R Graeber, A Jentzen - arXiv preprint arXiv:2301.08284, 2023 - arxiv.org
In this article we study high-dimensional approximation capacities of shallow and deep
artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation. In particular …

How Many Neurons Does it Take to Approximate the Maximum?

I Safran, D Reichman, P Valiant - Proceedings of the 2024 Annual ACM-SIAM …, 2024 - SIAM
We study the size of a neural network needed to approximate the maximum function over $d$
inputs, in the most basic setting of approximating with respect to the $L_2$ norm, for continuous …
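
A textbook construction that frames the question above (a sketch only; the paper's contribution is the precise size and depth trade-off, which this does not capture): the exact identity $\max(a,b) = a + \mathrm{ReLU}(b-a)$ yields a binary-tree network of depth roughly $\log_2 d$ computing the maximum of $d$ inputs. The helper names below are illustrative.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def pairwise_max(a, b):
    # Exact identity: max(a, b) = a + relu(b - a).
    return a + relu(b - a)

def tree_max(x):
    # Binary-tree reduction: about ceil(log2 d) layers of pairwise maxima.
    # (A strict ReLU network also needs extra units to pass values through layers.)
    vals = list(x)
    while len(vals) > 1:
        nxt = [pairwise_max(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]

x = np.random.default_rng(0).normal(size=7)
assert np.isclose(tree_max(x), x.max())
```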

Optimal Bump Functions for Shallow ReLU Networks: Weight Decay, Depth Separation, Curse of Dimensionality

S Wojtowytsch - Journal of Machine Learning Research, 2024 - jmlr.org
In this note, we study how neural networks with a single hidden layer and ReLU activation
interpolate data drawn from a radially symmetric distribution with target labels 1 at the origin …

Spectral complexity of deep neural networks

S Di Lillo, D Marinucci, M Salvi, S Vigogna - arXiv preprint arXiv …, 2024 - arxiv.org
It is well-known that randomly initialized, push-forward, fully-connected neural networks
weakly converge to isotropic Gaussian processes, in the limit where the width of all layers …
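
A quick numerical illustration of the weak-convergence statement above (a minimal sketch under an assumed He-style initialization; the scalings and sizes are illustrative and not taken from the paper): sample a randomly initialized fully-connected ReLU network at several widths and check that the output distribution at a fixed input stabilizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_net_output(x, width, depth=3):
    # One draw of a randomly initialized fully-connected ReLU network (He-style scaling).
    h = x
    for _ in range(depth):
        W = rng.normal(size=(width, h.shape[0])) * np.sqrt(2.0 / h.shape[0])
        h = np.maximum(W @ h, 0.0)
    v = rng.normal(size=width) / np.sqrt(width)
    return v @ h

x = np.ones(10)
for width in (8, 64, 512):
    samples = np.array([random_net_output(x, width) for _ in range(2000)])
    print(width, round(samples.mean(), 3), round(samples.std(), 3))
# As the width grows, the output law at a fixed input approaches a centered Gaussian,
# in line with the infinite-width Gaussian-process limit the abstract refers to.
```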

Rethink depth separation with intra-layer links

FL Fan, ZY Li, H Xiong, T Zeng - arXiv preprint arXiv:2305.07037, 2023 - arxiv.org
The depth separation theory is nowadays widely accepted as an effective explanation for the
power of depth, which consists of two parts: i) there exists a function representable by a …