Accelerated methods for nonconvex optimization

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org

Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …

Cited by 486 Related articles All 13 versions

[PDF] ncsu.edu

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer

Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research due to various reasons. First, its …

Cited by 146 Related articles All 7 versions

[PDF] neurips.cc

Spider: Near-optimal non-convex optimization via stochastic path-integrated differential estimator

C Fang, CJ Li, Z Lin, T Zhang - Advances in neural …, 2018 - proceedings.neurips.cc

In this paper, we propose a new technique named Stochastic Path-Integrated
Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of …

Cited by 629 Related articles All 16 versions

[PDF] neurips.cc

Adaptive methods for nonconvex optimization

M Zaheer, S Reddi, D Sachan… - Advances in neural …, 2018 - proceedings.neurips.cc

Adaptive gradient methods that rely on scaling gradients down by the square root of
exponential moving averages of past squared gradients, such as RMSProp, Adam, Adadelta …

Cited by 486 Related articles All 7 versions

[PDF] mlr.press

How to escape saddle points efficiently

C Jin, R Ge, P Netrapalli, SM Kakade… - … on machine learning, 2017 - proceedings.mlr.press

This paper shows that a perturbed form of gradient descent converges to a second-order
stationary point in a number of iterations which depends only poly-logarithmically on …

Cited by 988 Related articles All 12 versions

[PDF] mlr.press

On the optimization of deep networks: Implicit acceleration by overparameterization

S Arora, N Cohen, E Hazan - International conference on …, 2018 - proceedings.mlr.press

Conventional wisdom in deep learning states that increasing depth improves
expressiveness but complicates optimization. This paper suggests that, sometimes …

Cited by 547 Related articles All 13 versions

[PDF] aaai.org

Adahessian: An adaptive second order optimizer for machine learning

Z Yao, A Gholami, S Shen, M Mustafa… - proceedings of the …, 2021 - ojs.aaai.org

Incorporating second-order curvature information into machine learning optimization
algorithms can be subtle, and doing so naïvely can lead to high per-iteration costs …

Cited by 253 Related articles All 8 versions

[PDF] ieee.org

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks

M Soltanolkotabi, A Javanmard… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

In this paper, we study the problem of learning a shallow artificial neural network that best
fits a training data set. We study this problem in the over-parameterized regime where the …

Cited by 468 Related articles All 10 versions

[PDF] mlr.press

Optimal stochastic non-smooth non-convex optimization through online-to-non-convex conversion

A Cutkosky, H Mehta… - … Conference on Machine …, 2023 - proceedings.mlr.press

We present new algorithms for optimizing non-smooth, non-convex stochastic objectives
based on a novel analysis technique. This improves the current best-known complexity for …

Cited by 34 Related articles All 8 versions

[PDF] mlr.press

No spurious local minima in nonconvex low rank problems: A unified geometric analysis

R Ge, C Jin, Y Zheng - International Conference on Machine …, 2017 - proceedings.mlr.press

In this paper we develop a new framework that captures the common landscape underlying
the common non-convex low-rank matrix problems including matrix sensing, matrix …

Cited by 507 Related articles All 7 versions
