K Zhou, Q Ding, F Shang, J Cheng… - The 22nd …, 2019 - proceedings.mlr.press
Variance reduction is a simple and effective technique that accelerates convex (or non- convex) stochastic optimization. Among existing variance reduction methods, SVRG and …
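For reference, the SVRG method named in this snippet keeps a full-gradient snapshot and corrects each stochastic gradient against it. Below is a minimal NumPy sketch on a least-squares toy problem; the objective, step size, and epoch lengths are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def svrg(A, b, lr=0.05, outer=30, seed=0):
    """Minimal SVRG for least squares: min_x (1/2n) ||Ax - b||^2.

    Each outer epoch stores a snapshot x_s and its full gradient mu;
    inner steps use g_i(x) - g_i(x_s) + mu, an unbiased gradient
    estimate whose variance shrinks as x approaches x_s.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(outer):
        x_s = x.copy()
        mu = A.T @ (A @ x_s - b) / n             # full-gradient snapshot
        for _ in range(n):                       # one pass per epoch
            i = rng.integers(n)
            g = A[i] * (A[i] @ x - b[i])         # stochastic grad at x
            g_s = A[i] * (A[i] @ x_s - b[i])     # same sample at snapshot
            x -= lr * (g - g_s + mu)             # variance-reduced step
    return x

# Toy check: recover planted weights from noisy linear measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(200)
print(np.linalg.norm(svrg(A, b) - x_true))       # should be small
```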
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a class of nonsmooth nonconvex composite optimization problems. We assume that the …
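The snippet truncates before the method's details. As background on the composite setting it targets, min_x f(x) + φ(x) with f smooth and φ nonsmooth, here is the standard proximal stochastic gradient baseline with φ = λ‖·‖₁, whose prox is soft-thresholding. This is the generic first-order baseline, not the paper's extra-step quasi-Newton update, and the problem and parameters are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_sgd(A, b, lam=0.1, lr=0.01, steps=5000, seed=0):
    """Proximal SGD for the composite problem
        min_x (1/2n) ||Ax - b||^2 + lam * ||x||_1:
    a stochastic gradient step on the smooth part, followed by
    the prox of the nonsmooth part.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)
        grad = A[i] * (A[i] @ x - b[i])              # stochastic grad, smooth part
        x = soft_threshold(x - lr * grad, lr * lam)  # prox handles the l1 term
    return x
```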
We propose a first-order stochastic optimization algorithm incorporating adaptive regularization for pattern recognition problems in a deep learning framework. The adaptive …
M Tang, L Qiao, Z Huang, X Liu, Y Peng… - Neural Computing and …, 2020 - Springer
Stochastic gradient descent (SGD) is a popular optimization method widely used in machine learning, but the variance of its gradient estimates leads to slow convergence. To accelerate …
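The slow convergence mentioned here stems from the variance of the minibatch gradient estimator, which decays roughly like 1/batch_size. A short NumPy experiment (with an assumed least-squares objective, not the paper's setup) makes that visible:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.standard_normal((n, d))
x = rng.standard_normal(d)
b = A @ x + rng.standard_normal(n)

full_grad = A.T @ (A @ x - b) / n  # exact gradient of (1/2n)||Ax - b||^2

for batch in (1, 10, 100):
    # Mean squared deviation of the minibatch gradient from the full gradient.
    devs = []
    for _ in range(2000):
        idx = rng.integers(n, size=batch)
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch
        devs.append(np.sum((g - full_grad) ** 2))
    print(batch, np.mean(devs))  # shrinks roughly like 1/batch
```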
K Nakamura, B Derbel, KJ Won, BW Hong - 2021 - pdfs.semanticscholar.org
Deep neural networks (DNNs) have achieved great success in recent decades. DNNs are optimized using stochastic gradient descent (SGD) with learning rate annealing that …
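For concreteness, two common instances of learning rate annealing are sketched below; these are the standard step-decay and cosine schedules, assumed here for illustration rather than taken from the paper.

```python
import math

def step_decay(epoch, base_lr=0.1, drop=0.1, every=30):
    """Classic step annealing: multiply the rate by `drop` every `every` epochs."""
    return base_lr * drop ** (epoch // every)

def cosine_annealing(epoch, total, base_lr=0.1, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr over `total` epochs."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * epoch / total))

for epoch in (0, 30, 60, 89):
    print(epoch, step_decay(epoch), round(cosine_annealing(epoch, 90), 4))
```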
Over the last half-century, a vast literature has been devoted to convex optimization. If no specific problem structure is assumed other than convexity, very strong results are …