Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …
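The snippet cuts off before the technical content, but the benchmark this line of work keeps returning to is policy gradient on the linear quadratic regulator. Below is a minimal, hypothetical sketch of the model-free variant: the dynamics (A, B), costs (Q, R), horizon, and step sizes are all made-up illustration values, and a simple two-point zeroth-order estimator stands in for the smoothed gradient estimators analyzed in this literature.

```python
# Hypothetical sketch: zeroth-order policy gradient on a tiny discrete-time
# LQR instance. All problem data and hyperparameters are made up.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # dynamics: x' = A x + B u
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)         # quadratic state / input costs

def cost(K, horizon=20):
    """Deterministic finite-horizon cost of the static feedback u = -K x."""
    total = 0.0
    for x in np.eye(2):             # roll out from canonical initial states
        for _ in range(horizon):
            u = -K @ x
            total += x @ Q @ x + u @ R @ u
            x = A @ x + B @ u
    return total

K = np.zeros((1, 2))
r, lr = 0.05, 1e-4                  # smoothing radius, step size
print("initial cost:", cost(K))
for _ in range(500):
    U = rng.standard_normal(K.shape)                       # random direction
    g = (cost(K + r * U) - cost(K - r * U)) / (2 * r) * U  # two-point estimate
    K -= lr * g
print("final cost:", cost(K), " gain K:", K)
```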

Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …
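As a concrete instance of the factored, nonconvex approach the overview surveys, here is a minimal sketch of gradient descent on a matrix-completion loss f(U, V) = ½‖P_Ω(UVᵀ − M)‖²_F with the customary spectral initialization. The sizes, sampling rate, and step size are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of factored nonconvex matrix completion: gradient descent on
# f(U, V) = 0.5 * ||mask * (U V^T - M)||_F^2 after spectral initialization.
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 100, 80, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))   # rank-r truth
mask = (rng.random((n, m)) < 0.3).astype(float)                 # ~30% observed
p = mask.mean()

# Spectral initialization: top-r SVD of the (rescaled) observed matrix
Us, s, Vts = np.linalg.svd(mask * M / p, full_matrices=False)
U = Us[:, :r] * np.sqrt(s[:r])
V = Vts[:r].T * np.sqrt(s[:r])

eta = 0.002
for _ in range(500):
    Rres = mask * (U @ V.T - M)                       # residual on observed entries
    U, V = U - eta * Rres @ V, V - eta * Rres.T @ U   # exact gradients of f
print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```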

Gradient starvation: A learning proclivity in neural networks

M Pezeshki, O Kaba, Y Bengio… - Advances in …, 2021 - proceedings.neurips.cc
We identify and formalize a fundamental gradient descent phenomenon resulting in a
learning proclivity in over-parameterized neural networks. Gradient Starvation arises when …
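The snippet truncates the definition, but the effect is easy to reproduce on a toy linear model of my own construction (not the paper's setup): two redundant features, one with a larger margin. Gradient descent pours weight into the strong feature, and once the loss is small, the weak feature's gradient dries up.

```python
# Toy illustration of the starvation effect on a linear model: two redundant
# features, one with a larger margin; the weak feature's gradient is starved.
import numpy as np

rng = np.random.default_rng(2)
n = 500
y = rng.choice([-1.0, 1.0], size=n)
X = np.stack([3.0 * y + 0.1 * rng.standard_normal(n),   # strong feature
              0.5 * y + 0.1 * rng.standard_normal(n)],  # weak feature
             axis=1)

w = np.zeros(2)
for _ in range(2000):
    margins = y * (X @ w)
    # gradient of the mean logistic loss log(1 + exp(-margin))
    g = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= 0.1 * g
print("weights [strong, weak]:", w)   # weight on the strong feature dominates
```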

Implicit regularization in deep matrix factorization

S Arora, N Cohen, W Hu, Y Luo - Advances in Neural …, 2019 - proceedings.neurips.cc
Efforts to understand the generalization mystery in deep learning have led to the belief that
gradient-based optimization induces a form of implicit regularization, a bias towards models …
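A minimal sketch of the deep (here depth-3) factorization the paper studies, with assumed sizes and hyperparameters: fit the observed entries of a low-rank matrix by W3 W2 W1 from a small random initialization, then inspect the singular values of the product; most of the spectrum stays near zero, the implicit low-rank bias.

```python
# Sketch: depth-3 matrix factorization fit to partially observed entries of a
# rank-2 matrix from small random init; the learned product stays near low rank.
import numpy as np

rng = np.random.default_rng(3)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-2 target
mask = (rng.random((n, n)) < 0.5).astype(float)                 # observed half

scale = 0.1                                      # small init drives the bias
W1, W2, W3 = (scale * rng.standard_normal((n, n)) for _ in range(3))

eta = 0.005
for _ in range(5000):
    Res = mask * (W3 @ W2 @ W1 - M)              # residual on observed entries
    G1 = (W3 @ W2).T @ Res                       # chain-rule gradients
    G2 = W3.T @ Res @ W1.T
    G3 = Res @ (W2 @ W1).T
    W1, W2, W3 = W1 - eta * G1, W2 - eta * G2, W3 - eta * G3
print("top singular values of the learned product:",
      np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)[:5].round(2))
```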

Learning overparameterized neural networks via stochastic gradient descent on structured data

Y Li, Y Liang - Advances in neural information processing …, 2018 - proceedings.neurips.cc
Neural networks have many successful applications, yet far less theoretical
understanding has been gained. Towards bridging this gap, we study the problem of …

Lower bounds for non-convex stochastic optimization

Y Arjevani, Y Carmon, JC Duchi, DJ Foster… - Mathematical …, 2023 - Springer
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ)
using stochastic first-order methods. In a well-studied model where algorithms access …
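Spelled out, the object being counted is an ϵ-stationary point of a smooth nonconvex F, located through an unbiased stochastic gradient oracle with bounded variance; if memory serves, the paper's headline lower bound in this model is Ω(ϵ⁻⁴) oracle calls, matching SGD.

```latex
% \epsilon-stationarity and the stochastic first-order oracle model (sketch)
\[
  \text{find } x \ \text{with} \ \|\nabla F(x)\| \le \epsilon,
  \qquad \text{given } g(x,\xi) \ \text{with} \
  \mathbb{E}_\xi\, g(x,\xi) = \nabla F(x), \quad
  \mathbb{E}_\xi \|g(x,\xi) - \nabla F(x)\|^2 \le \sigma^2 .
\]
```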

Spectral methods for data science: A statistical perspective

Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …
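In its simplest "nutshell" form, a spectral method estimates a planted structure from the leading eigenvector of an observed matrix. A toy rank-one signal-plus-Wigner-noise sketch, with made-up dimension and signal strength:

```python
# A spectral method in a nutshell (toy setting): observe Y = lam * v v^T + W
# and estimate the planted direction v by the top eigenvector of Y.
import numpy as np

rng = np.random.default_rng(4)
n, lam = 400, 8.0
v = rng.standard_normal(n); v /= np.linalg.norm(v)               # planted direction
W = rng.standard_normal((n, n)); W = (W + W.T) / np.sqrt(2 * n)  # Wigner noise
Y = lam * np.outer(v, v) + W

vals, vecs = np.linalg.eigh(Y)    # eigenvalues in ascending order
v_hat = vecs[:, -1]               # top eigenvector
print("correlation |<v_hat, v>|:", abs(v_hat @ v))
```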

Gradient descent maximizes the margin of homogeneous neural networks

K Lyu, J Li - arXiv preprint arXiv:1906.05890, 2019 - arxiv.org
In this paper, we study the implicit regularization of the gradient descent algorithm in
homogeneous neural networks, including fully-connected and convolutional neural …
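For context, the quantity the paper shows gradient descent implicitly maximizes is, to my understanding of the setup, the normalized margin of an L-homogeneous network:

```latex
% f is L-homogeneous: f(x; c\theta) = c^{L} f(x; \theta) for all c > 0.
% Gradient descent on classification losses drives up the normalized margin
\[
  \bar{\gamma}(\theta) \;=\; \frac{\min_i \, y_i f(x_i;\theta)}{\|\theta\|_2^{L}},
\]
% where (x_i, y_i) are the training examples with labels y_i \in \{\pm 1\}.
```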

Entrywise eigenvector analysis of random matrices with low expected rank

E Abbe, J Fan, K Wang, Y Zhong - Annals of statistics, 2020 - ncbi.nlm.nih.gov
Recovering low-rank structures via eigenvector perturbation analysis is a common problem
in statistical machine learning, such as in factor analysis, community detection, ranking …
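As a toy instance of the entrywise question: in a balanced two-community stochastic block model, exact recovery amounts to the second eigenvector of the adjacency matrix carrying the correct sign in every coordinate, which is precisely what an ℓ∞ (entrywise) perturbation bound certifies. Parameters below are illustrative.

```python
# Toy instance: two-community stochastic block model; communities read off
# entrywise from the sign pattern of the adjacency matrix's second eigenvector.
import numpy as np

rng = np.random.default_rng(5)
n, p, q = 400, 0.10, 0.02                 # within/between edge probabilities
z = np.repeat([1, -1], n // 2)            # planted community labels
P = np.where(np.outer(z, z) > 0, p, q)    # expected adjacency (low rank)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T            # symmetric, no self-loops

vals, vecs = np.linalg.eigh(A)
u2 = vecs[:, -2]                          # second-largest eigenvector
z_hat = np.sign(u2)
acc = max(np.mean(z_hat == z), np.mean(z_hat == -z))  # up to a global sign flip
print("fraction of correctly labeled nodes:", acc)
```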

MARINA: Faster non-convex distributed learning with compression

E Gorbunov, KP Burlachenko, Z Li… - … on Machine Learning, 2021 - proceedings.mlr.press
We develop and analyze MARINA: a new communication efficient method for non-convex
distributed learning over heterogeneous datasets. MARINA employs a novel communication …
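A sketch of MARINA's defining ingredient in a toy single-process simulation: workers usually transmit only a compressed gradient difference, with an occasional full-gradient round (probability p). The quadratic local losses and the unbiased rand-k compressor are my choices for illustration, not the paper's experiments.

```python
# Sketch of the MARINA-style gradient estimator: rare full-gradient rounds,
# otherwise compressed gradient differences averaged across workers.
import numpy as np

rng = np.random.default_rng(6)
d, n_workers, k, p, eta = 50, 10, 5, 0.1, 0.05

# Heterogeneous quadratic local losses f_i(x) = 0.5 * ||x - b_i||^2
bs = [rng.standard_normal(d) for _ in range(n_workers)]
grad = lambda i, x: x - bs[i]

def rand_k(v):
    """Unbiased rand-k compressor: keep k random coords, rescale by d/k."""
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros(d); out[idx] = v[idx] * d / k
    return out

x = rng.standard_normal(d)
g = np.mean([grad(i, x) for i in range(n_workers)], axis=0)  # start exact
for _ in range(500):
    x_new = x - eta * g
    if rng.random() < p:   # rare round: everyone sends the full gradient
        g = np.mean([grad(i, x_new) for i in range(n_workers)], axis=0)
    else:                  # usual round: compressed differences only
        g = g + np.mean([rand_k(grad(i, x_new) - grad(i, x))
                         for i in range(n_workers)], axis=0)
    x = x_new
print("distance to optimum:", np.linalg.norm(x - np.mean(bs, axis=0)))
```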