Neural networks can learn representations with gradient descent

A Damian, J Lee… - Conference on Learning …, 2022 - proceedings.mlr.press
Significant theoretical work has established that in specific regimes, neural networks trained
by gradient descent behave like kernel methods. However, in practice, it is known that …
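
The "specific regimes" here are the lazy / neural-tangent-kernel regime. As a rough toy illustration (not the setting analyzed in the paper; widths, learning rate, and step count are arbitrary choices), the NumPy sketch below trains only the hidden layer of a two-layer ReLU network by gradient descent and measures how far the weights move from initialization: at large width the relative movement is small, which is the sense in which the trained network stays close to its linearization and behaves like a kernel method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def relative_weight_movement(width, steps=1000, lr=0.5):
    W = rng.standard_normal((width, d)) / np.sqrt(d)   # hidden-layer weights
    a = rng.standard_normal(width) / np.sqrt(width)    # fixed output layer (NTK-style scaling)
    W0 = W.copy()
    for _ in range(steps):                             # gradient descent on the squared loss, training W only
        h = np.maximum(X @ W.T, 0.0)                   # ReLU features, shape (n, width)
        resid = h @ a - y                              # residuals, shape (n,)
        W -= lr * ((resid[:, None] * (h > 0)) * a).T @ X / n
    return np.linalg.norm(W - W0) / np.linalg.norm(W0)

for width in (100, 20000):
    # relative movement typically shrinks markedly as the width grows
    print(f"width {width:6d}: relative weight movement {relative_weight_movement(width):.3f}")
```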

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

D Stöger, M Soltanolkotabi - Advances in Neural …, 2021 - proceedings.neurips.cc
Recently there has been significant theoretical progress on understanding the convergence
and generalization of gradient-based methods on nonconvex losses with overparameterized …
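
As a rough companion illustration (a toy fully observed matrix factorization rather than the paper's low-rank reconstruction setting; dimensions, step size, and initialization scale are arbitrary), the sketch below runs gradient descent on an overparameterized factor U from a small random initialization: the iterate U U^T stays effectively low-rank and recovers the rank-2 ground truth, mirroring the spectral-learning behaviour described in the snippet.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 30, 2
A = rng.standard_normal((d, r)) / np.sqrt(d)
M = A @ A.T                                    # rank-2 PSD ground truth

U = 1e-3 * rng.standard_normal((d, d))         # overparameterized factor, small random init
lr = 0.05
for _ in range(3000):
    R = U @ U.T - M                            # residual
    U -= lr * 2 * R @ U                        # gradient of 0.5 * ||U U^T - M||_F^2

rel_err = np.linalg.norm(U @ U.T - M) / np.linalg.norm(M)
svals = np.linalg.svd(U @ U.T, compute_uv=False)
print(f"relative recovery error: {rel_err:.2e}")
print("leading singular values of U U^T:", np.round(svals[:4], 3))  # effectively rank 2
```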

Local signal adaptivity: Provable feature learning in neural networks beyond kernels

S Karp, E Winston, Y Li, A Singh - Advances in Neural …, 2021 - proceedings.neurips.cc
Neural networks have been shown to outperform kernel methods in practice (including
neural tangent kernels). Most theoretical explanations of this performance gap focus on …

Understanding deflation process in over-parametrized tensor decomposition

R Ge, Y Ren, X Wang, M Zhou - Advances in Neural …, 2021 - proceedings.neurips.cc
In this paper we study the training dynamics for gradient flow on over-parametrized tensor
decomposition problems. Empirically, such a training process often first fits larger components …
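
A discrete-time toy version of this deflation effect (not the paper's gradient-flow analysis; dimensions, weights, and step size below are arbitrary): gradient descent on an over-parametrized symmetric rank-one decomposition of a two-component orthogonal tensor typically fits the heavier component well before the lighter one.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m = 10, 8                                    # ambient dimension, number of model components
a1, a2 = np.eye(d)[0], np.eye(d)[1]             # orthonormal ground-truth directions
w1, w2 = 4.0, 1.0                               # ground-truth weights, w1 > w2

def cubed(v):                                   # symmetric rank-1 tensor v (x) v (x) v
    return np.einsum('a,b,c->abc', v, v, v)

T = w1 * cubed(a1) + w2 * cubed(a2)

U = 0.1 * rng.standard_normal((m, d))           # small random init, over-parametrized (m > 2)
lr, hit1, hit2 = 0.01, None, None
for t in range(2000):
    S = np.einsum('ja,jb,jc->abc', U, U, U)     # model tensor: sum_j u_j (x) u_j (x) u_j
    grad = 3 * np.einsum('abc,jb,jc->ja', S - T, U, U)
    U -= lr * grad
    c1 = np.sum((U @ a1) ** 3)                  # recovered weight along each ground-truth direction
    c2 = np.sum((U @ a2) ** 3)
    if hit1 is None and c1 > 0.9 * w1:
        hit1 = t
    if hit2 is None and c2 > 0.9 * w2:
        hit2 = t

print(f"heavier component (weight {w1}) fit by step {hit1}")
print(f"lighter component (weight {w2}) fit by step {hit2}")   # typically much later
```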

What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement

N De La Vega, N Razin… - Advances in Neural …, 2024 - proceedings.neurips.cc
The question of what makes a data distribution suitable for deep learning is a fundamental
open problem. Focusing on locally connected neural networks (a prevalent family of …

Optimal gradient-based algorithms for non-concave bandit optimization

B Huang, K Huang, S Kakade, JD Lee… - Advances in …, 2021 - proceedings.neurips.cc
Bandit problems with linear or concave reward have been extensively studied, but relatively
few works have studied bandits with non-concave reward. This work considers a large family …

Behind the scenes of gradient descent: A trajectory analysis via basis function decomposition

J Ma, L Guo, S Fattahi - arXiv preprint arXiv:2210.00346, 2022 - arxiv.org
This work analyzes the solution trajectory of gradient-based algorithms via a novel basis
function decomposition. We show that, although solution trajectories of gradient-based …

Going beyond linear RL: Sample efficient neural function approximation

B Huang, K Huang, S Kakade, JD Lee… - Advances in …, 2021 - proceedings.neurips.cc
Deep Reinforcement Learning (RL) powered by neural net approximation of the Q
function has had enormous empirical success. While the theory of RL has traditionally …

Implicit regularization for group sparsity

J Li, TV Nguyen, C Hegde, RKW Wong - arXiv preprint arXiv:2301.12540, 2023 - arxiv.org
We study the implicit regularization of gradient descent towards structured sparsity via a
novel neural reparameterization, which we call a diagonally grouped linear neural network …
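
A minimal sketch of this kind of implicit sparsity bias, using the standard ungrouped Hadamard-product reparameterization x = u⊙u − v⊙v as a simplified stand-in for the paper's diagonally grouped network (problem sizes, step size, and the small initialization scale are arbitrary choices): gradient descent on an under-determined least-squares problem, started from a small initialization, typically recovers the sparse ground truth.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k = 50, 100, 3                        # n measurements << d features, k-sparse truth
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[:k] = [3.0, -2.0, 1.5]
y = A @ x_true

alpha = 1e-4                                # small initialization scale drives the sparsity bias
u = alpha * np.ones(d)
v = alpha * np.ones(d)
lr = 0.05
for _ in range(5000):
    x = u * u - v * v                       # reparameterized weights
    r = A.T @ (A @ x - y)                   # gradient of 0.5 * ||A x - y||^2 w.r.t. x
    u -= lr * 2 * r * u                     # chain rule through the reparameterization
    v += lr * 2 * r * v
x = u * u - v * v

print(f"recovery error: {np.linalg.norm(x - x_true):.2e}")
print(f"entries with |x_i| > 1e-2: {(np.abs(x) > 1e-2).sum()} (ground truth: {k})")
```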

Implicit Regularization for Tubal Tensor Factorizations via Gradient Descent

S Karnik, A Veselovska, M Iwen, F Krahmer - arXiv preprint arXiv …, 2024 - arxiv.org
We provide a rigorous analysis of implicit regularization in an overparametrized tensor
factorization problem beyond the lazy training regime. For matrix factorization problems, this …