Learning-rate annealing methods for deep neural networks

K Nakamura, B Derbel, KJ Won, BW Hong - Electronics, 2021 - mdpi.com
Deep neural networks (DNNs) have achieved great success in recent decades. DNNs are
optimized using stochastic gradient descent (SGD) with learning-rate annealing, which …
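
The survey compares schedule families rather than proposing a single rule. Purely as a hedged sketch of two schedules such surveys typically cover, here are step decay and cosine annealing in Python; the function names and every hyperparameter value are illustrative choices, not the paper's:

```python
import math

def step_decay(lr0, epoch, drop=0.5, epochs_per_drop=30):
    # Multiply the rate by `drop` every `epochs_per_drop` epochs (values illustrative).
    return lr0 * drop ** (epoch // epochs_per_drop)

def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    # Decay smoothly from lr0 to lr_min over `total_epochs` along a half cosine.
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))

if __name__ == "__main__":
    for epoch in (0, 30, 60, 90):
        print(epoch, step_decay(0.1, epoch), cosine_annealing(0.1, epoch, 90))
```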

Direct acceleration of SAGA using sampled negative momentum

K Zhou, Q Ding, F Shang, J Cheng… - The 22nd …, 2019 - proceedings.mlr.press
Variance reduction is a simple and effective technique that accelerates convex (or
non-convex) stochastic optimization. Among existing variance reduction methods, SVRG and …
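
The sampled-negative-momentum acceleration is the paper's contribution and is not reproduced here; as a hedged baseline, a minimal sketch of the plain SAGA update it builds on, applied to a synthetic least-squares problem (the step size and data are assumptions):

```python
import numpy as np

def saga(X, y, lr=0.01, n_iters=5000, seed=0):
    # Plain SAGA on least squares f_i(w) = 0.5 * (x_i . w - y_i)^2.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    table = np.zeros((n, d))             # last stored gradient for each sample i
    avg = table.mean(axis=0)             # running average of the stored gradients
    for _ in range(n_iters):
        i = rng.integers(n)
        g_new = (X[i] @ w - y[i]) * X[i]
        w -= lr * (g_new - table[i] + avg)
        avg = avg + (g_new - table[i]) / n   # keep avg consistent with the table
        table[i] = g_new
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))
    w_true = rng.normal(size=10)
    print(np.linalg.norm(saga(X, X @ w_true) - w_true))
```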

A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization

M Yang, A Milzarek, Z Wen, T Zhang - Mathematical Programming, 2022 - Springer
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a
class of nonsmooth nonconvex composite optimization problems. We assume that the …
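
The extra-step quasi-Newton scheme itself is beyond a snippet; as a hedged sketch of the nonsmooth composite setting only, a baseline proximal stochastic gradient loop for min_x 0.5/n ||Ax - b||^2 + lam * ||x||_1, where the quadratic loss, step size, and lam are illustrative assumptions:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1, the nonsmooth term used here for illustration.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sgd(A, b, lam=0.05, lr=0.01, n_iters=20000, seed=0):
    # Baseline proximal stochastic gradient for the composite objective above.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)
        g = (A[i] @ x - b[i]) * A[i]          # stochastic gradient of the smooth part
        x = soft_threshold(x - lr * g, lr * lam)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(300, 30))
    x_true = np.where(rng.random(30) < 0.2, rng.normal(size=30), 0.0)
    x_hat = prox_sgd(A, A @ x_true)
    print(np.count_nonzero(np.abs(x_hat) > 1e-3), "nonzero entries recovered")
```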

Stochastic batch size for adaptive regularization in deep network optimization

K Nakamura, S Soatto, BW Hong - Pattern Recognition, 2022 - Elsevier
We propose a first-order stochastic optimization algorithm incorporating adaptive
regularization for pattern recognition problems in a deep learning framework. The adaptive …
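
The adaptive-regularization mechanism is the paper's own; as a hedged illustration of the stochastic-batch-size idea in isolation, SGD on synthetic least squares with the batch size resampled every iteration (the uniform distribution and size range are assumptions):

```python
import numpy as np

def sgd_random_batch(X, y, lr=0.05, n_iters=500, min_bs=8, max_bs=128, seed=0):
    # SGD on least squares with the mini-batch size itself resampled each step.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        bs = int(rng.integers(min_bs, min(max_bs, n) + 1))  # random size this step
        idx = rng.choice(n, size=bs, replace=False)
        w -= lr * X[idx].T @ (X[idx] @ w - y[idx]) / bs
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 10))
    w_true = rng.normal(size=10)
    print(np.linalg.norm(sgd_random_batch(X, X @ w_true) - w_true))
```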

Accelerating SGD using flexible variance reduction on large-scale datasets

M Tang, L Qiao, Z Huang, X Liu, Y Peng… - Neural Computing and …, 2020 - Springer
Stochastic gradient descent (SGD) is a popular optimization method widely used in machine
learning, but the variance of its gradient estimates leads to slow convergence. To accelerate …
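
The flexible variance-reduction scheme is the paper's contribution and differs in how the snapshot is scheduled; as a hedged sketch of the idea it builds on, plain SVRG on synthetic least squares (step size, epoch count, and data are assumptions):

```python
import numpy as np

def svrg(X, y, lr=0.01, n_epochs=20, seed=0):
    # Plain SVRG on least squares; the snapshot's full gradient corrects each
    # stochastic step so the estimator's variance vanishes near the optimum.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        w_snap = w.copy()
        full_grad = X.T @ (X @ w_snap - y) / n        # full gradient at the snapshot
        for _ in range(n):
            i = rng.integers(n)
            g = (X[i] @ w - y[i]) * X[i]
            g_snap = (X[i] @ w_snap - y[i]) * X[i]
            w -= lr * (g - g_snap + full_grad)        # variance-reduced step
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))
    w_true = rng.normal(size=10)
    print(np.linalg.norm(svrg(X, X @ w_true) - w_true))
```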

[PDF] Accelerating Finite-sum Convex Optimization and Highly-smooth Convex Optimization

K ZHOU - 2019 - jnhujnhu.github.io
In the last half-century, a great deal of literature has been devoted to convex optimization.
If no problem structure beyond convexity is assumed, very strong results are …