Recent advances in stochastic gradient descent in deep learning

Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …
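
For context on the family of methods this survey covers, here is a minimal sketch of plain stochastic gradient descent on a finite-sum objective. The quadratic loss, synthetic data, and fixed step size are illustrative assumptions of mine, not details from the paper.

```python
import numpy as np

def sgd(grad_fn, w0, data, lr=0.01, epochs=10, seed=0):
    """Plain SGD: step with the gradient of one randomly drawn example at a time."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    n = len(data)
    for _ in range(epochs):
        for i in rng.permutation(n):
            w -= lr * grad_fn(w, data[i])  # single-sample gradient step
    return w

# Toy least-squares example: f_i(w) = 0.5 * (x_i @ w - y_i)^2
X = np.random.randn(100, 5)
w_true = np.arange(5, dtype=float)
y = X @ w_true
grad = lambda w, d: d[0] * (d[0] @ w - d[1])
w_hat = sgd(grad, np.zeros(5), list(zip(X, y)))
print(np.round(w_hat, 2))
```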

Problem formulations and solvers in linear SVM: a review

VK Chauhan, K Dahiya, A Sharma - Artificial Intelligence Review, 2019 - Springer
Support vector machine (SVM) is an optimal margin-based classification technique in
machine learning. SVM is a binary linear classifier which has been extended to non-linear …
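
As a reminder of the formulation such a review surveys, the soft-margin linear SVM primal problem is commonly written as below; the notation (penalty parameter C, hinge loss) is the standard convention rather than something taken from this particular paper.

```latex
\min_{w,\,b}\;\; \frac{1}{2}\lVert w\rVert^2 \;+\; C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (w^\top x_i + b)\bigr)
```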

Federated learning of a mixture of global and local models

F Hanzely, P Richtárik - arXiv preprint arXiv:2002.05516, 2020 - arxiv.org
We propose a new optimization formulation for training federated learning models. The
standard formulation has the form of an empirical risk minimization problem constructed to …
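
If memory serves, the mixed global/local formulation proposed here gives each client its own model and penalizes its deviation from the clients' average; the symbols below (λ, x̄) follow the usual presentation and should be checked against the paper itself.

```latex
\min_{x_1,\dots,x_n}\;\; \frac{1}{n}\sum_{i=1}^{n} f_i(x_i)
\;+\; \frac{\lambda}{2n}\sum_{i=1}^{n}\bigl\lVert x_i - \bar{x}\bigr\rVert^2,
\qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```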

Federated optimization: Distributed machine learning for on-device intelligence

J Konečný, HB McMahan, D Ramage… - arXiv preprint arXiv …, 2016 - arxiv.org
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
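
The federated objective in this line of work is typically a client-weighted empirical risk over K devices; the weighting by local sample counts n_k shown below is the standard convention, assumed here rather than quoted from the paper.

```latex
\min_{w}\; F(w) \;=\; \sum_{k=1}^{K} \frac{n_k}{n}\, F_k(w),
\qquad F_k(w) \;=\; \frac{1}{n_k}\sum_{i \in \mathcal{P}_k} f_i(w).
```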

Recent advances in stochastic gradient descent algorithms

史加荣, 王丹, 尚凡华, 张鹤于 - 自动化学报 (Acta Automatica Sinica), 2021 - aas.net.cn
In machine learning, gradient descent is the most important and fundamental method for solving optimization problems. As the scale of data keeps
growing, traditional gradient descent can no longer solve large-scale machine learning problems effectively. Stochastic gradient descent, at each iteration …
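
The contrast this survey opens with can be summarized by the two update rules below: full-batch gradient descent touches all n examples per step, while SGD uses a single sampled index i_t. The notation is standard, not taken from the article.

```latex
\text{GD:}\quad w_{t+1} = w_t - \eta_t\, \frac{1}{n}\sum_{i=1}^{n}\nabla f_i(w_t),
\qquad
\text{SGD:}\quad w_{t+1} = w_t - \eta_t\, \nabla f_{i_t}(w_t),\quad i_t \sim \mathrm{Unif}\{1,\dots,n\}.
```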

SARAH: A novel method for machine learning problems using stochastic recursive gradient

LM Nguyen, J Liu, K Scheinberg… - … conference on machine …, 2017 - proceedings.mlr.press
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its
practical variant SARAH+, as a novel approach to the finite-sum minimization problems …
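
SARAH's defining ingredient is a recursive (biased) gradient estimator maintained inside each outer loop, which starts from a full gradient. As best I recall, the inner update takes the form below; step-size conditions and the SARAH+ stopping rule should be taken from the paper.

```latex
v_0 = \nabla F(w_0), \qquad
v_t = \nabla f_{i_t}(w_t) - \nabla f_{i_t}(w_{t-1}) + v_{t-1}, \qquad
w_{t+1} = w_t - \eta\, v_t .
```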

Coresets for data-efficient training of machine learning models

B Mirzasoleiman, J Bilmes… - … Conference on Machine …, 2020 - proceedings.mlr.press
Incremental gradient (IG) methods, such as stochastic gradient descent and its variants, are
commonly used for large scale optimization in machine learning. Despite the sustained effort …
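
The core idea is to pick a small weighted subset whose gradient tracks the full training gradient, so that IG methods run on the subset alone. Roughly, the selection problem has the form below; this is my paraphrase of the paper's formulation, so the exact constraints and the submodular surrogate used to solve it may differ.

```latex
S^{*} \in \arg\min_{S \subseteq V,\ \gamma \ge 0} \; |S|
\quad \text{s.t.} \quad
\max_{w \in \mathcal{W}} \Bigl\lVert \sum_{i \in V} \nabla f_i(w) \;-\; \sum_{j \in S} \gamma_j \nabla f_j(w) \Bigr\rVert \;\le\; \epsilon .
```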

Only train once: A one-shot neural network training and pruning framework

T Chen, B Ji, T Ding, B Fang, G Wang… - Advances in …, 2021 - proceedings.neurips.cc
Structured pruning is a commonly used technique in deploying deep neural networks
(DNNs) onto resource-constrained devices. However, the existing pruning methods are …

Katyusha: The first direct acceleration of stochastic gradient methods

Z Allen-Zhu - Journal of Machine Learning Research, 2018 - jmlr.org
Nesterov's momentum trick is famously known for accelerating gradient descent, and has
been proven useful in building fast iterative algorithms. However, in the stochastic setting …
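
Katyusha couples SVRG-style variance reduction with an extra "negative momentum" anchor at the snapshot point x̃. A sketch of the inner iteration, written from memory and only indicative of the structure (parameter choices such as τ₂ = 1/2 and the step 1/(3L) should be verified against the paper), is:

```latex
x_{k+1} = \tau_1 z_k + \tau_2 \tilde{x} + (1 - \tau_1 - \tau_2)\, y_k, \qquad
\tilde{\nabla}_{k+1} = \nabla f(\tilde{x}) + \nabla f_{i_k}(x_{k+1}) - \nabla f_{i_k}(\tilde{x}),
```
```latex
y_{k+1} = x_{k+1} - \tfrac{1}{3L}\,\tilde{\nabla}_{k+1}, \qquad
z_{k+1} = z_k - \alpha\, \tilde{\nabla}_{k+1}.
```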

SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives

A Defazio, F Bach… - Advances in neural …, 2014 - proceedings.neurips.cc
In this work we introduce a new fast incremental gradient method SAGA, in the spirit of SAG,
SDCA, MISO and SVRG. SAGA improves on the theory behind SAG and SVRG, with better …
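
For reference, SAGA maintains a table of the most recent gradient evaluated at each data point and uses it to de-bias the stochastic gradient. A minimal sketch, assuming a smooth unregularized objective (the paper also handles a proximal/composite term), follows; the function names and loop structure are mine.

```python
import numpy as np

def saga(grad_fn, w0, n, lr=0.01, iters=1000, seed=0):
    """SAGA sketch: variance-reduced step using a table of stored per-example gradients."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    table = np.array([grad_fn(w, i) for i in range(n)])  # stored gradient for each example
    avg = table.mean(axis=0)                              # running average of the table
    for _ in range(iters):
        j = rng.integers(n)
        g_new = grad_fn(w, j)
        # unbiased estimator: fresh gradient minus the stale one, plus the table average
        w -= lr * (g_new - table[j] + avg)
        avg += (g_new - table[j]) / n                     # keep the average consistent
        table[j] = g_new
    return w
```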