Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research due to various reasons. First, its …

Spider: Near-optimal non-convex optimization via stochastic path-integrated differential estimator

C Fang, CJ Li, Z Lin, T Zhang - Advances in neural …, 2018 - proceedings.neurips.cc
In this paper, we propose a new technique named Stochastic Path-Integrated
Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of …

Adaptive methods for nonconvex optimization

M Zaheer, S Reddi, D Sachan… - Advances in neural …, 2018 - proceedings.neurips.cc
Adaptive gradient methods that rely on scaling gradients down by the square root of
exponential moving averages of past squared gradients, such as RMSProp, Adam, Adadelta …

How to escape saddle points efficiently

C Jin, R Ge, P Netrapalli, SM Kakade… - … on machine learning, 2017 - proceedings.mlr.press
This paper shows that a perturbed form of gradient descent converges to a second-order
stationary point in a number of iterations which depends only poly-logarithmically on …

On the optimization of deep networks: Implicit acceleration by overparameterization

S Arora, N Cohen, E Hazan - International conference on …, 2018 - proceedings.mlr.press
Conventional wisdom in deep learning states that increasing depth improves
expressiveness but complicates optimization. This paper suggests that, sometimes …

Adahessian: An adaptive second order optimizer for machine learning

Z Yao, A Gholami, S Shen, M Mustafa… - Proceedings of the …, 2021 - ojs.aaai.org
Incorporating second-order curvature information into machine learning optimization
algorithms can be subtle, and doing so naïvely can lead to high per-iteration costs …

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks

M Soltanolkotabi, A Javanmard… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
In this paper, we study the problem of learning a shallow artificial neural network that best
fits a training data set. We study this problem in the over-parameterized regime where the …

Optimal stochastic non-smooth non-convex optimization through online-to-non-convex conversion

A Cutkosky, H Mehta… - … Conference on Machine …, 2023 - proceedings.mlr.press
We present new algorithms for optimizing non-smooth, non-convex stochastic objectives
based on a novel analysis technique. This improves the current best-known complexity for …

No spurious local minima in nonconvex low rank problems: A unified geometric analysis

R Ge, C Jin, Y Zheng - International Conference on Machine …, 2017 - proceedings.mlr.press
In this paper we develop a new framework that captures the common landscape underlying
the common non-convex low-rank matrix problems including matrix sensing, matrix …