Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a tremendously motivating and hard problem. Among machine learning models, stochastic …
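For orientation, the plain SGD iteration that surveys like this one build on can be written as follows (notation ours, not taken from the snippet):

```latex
% One SGD step: x_k is the iterate, \alpha_k the stepsize, and
% \nabla f_{i_k}(x_k) a stochastic gradient on a sampled example i_k.
x_{k+1} = x_k - \alpha_k \, \nabla f_{i_k}(x_k)
```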
Y Liu, Y Gao, W Yin - Advances in Neural Information …, 2020 - proceedings.neurips.cc
SGD with momentum (SGDM) is widely used in many machine learning tasks, and it is often run with dynamic stepsizes and momentum weights tuned in a stagewise …
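A minimal sketch of SGDM with a stagewise stepsize schedule of the kind described; the `grad(x, rng)` interface and the concrete parameter values are assumptions, not taken from the paper:

```python
import numpy as np

def sgdm_stagewise(grad, x0, stages, beta=0.9, seed=0):
    """SGD with momentum (SGDM), run in stages with per-stage stepsizes.

    grad(x, rng) -- returns a stochastic gradient at x (assumed interface)
    stages       -- list of (stepsize, num_iters) pairs; a stagewise schedule
                    typically shrinks the stepsize from one stage to the next
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for alpha, num_iters in stages:
        for _ in range(num_iters):
            v = beta * v + grad(x, rng)   # momentum buffer from past gradients
            x = x - alpha * v             # step along the buffer
    return x
```

For example, `stages=[(0.1, 1000), (0.01, 1000)]` runs two stages with a tenfold stepsize drop between them.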
W Liu, L Chen, Y Chen, W Zhang - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Federated learning (FL) provides a communication-efficient approach to solving machine learning problems over distributed data, without sending raw data to a central server …
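A toy sketch of the communication pattern FL relies on, with FedAvg-style local steps followed by server averaging; the function names and the least-squares local objective are illustrative, not the paper's method:

```python
import numpy as np

def local_steps(x, data, lr=0.1, steps=5):
    """A few local gradient steps on one client's least-squares data (A, b)."""
    A, b = data
    for _ in range(steps):
        x = x - lr * A.T @ (A @ x - b) / len(b)   # raw data never leaves the client
    return x

def federated_round(x_global, clients, lr=0.1, steps=5):
    """One FL round: each client updates locally, the server averages the models.
    Only model vectors cross the network, which is the communication saving."""
    updates = [local_steps(x_global.copy(), d, lr, steps) for d in clients]
    return np.mean(updates, axis=0)
```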
H Yu, R Jin, S Yang - International Conference on Machine …, 2019 - proceedings.mlr.press
Recent developments in large-scale distributed machine learning applications, e.g., deep neural networks, benefit enormously from advances in distributed non-convex …
X Chen, S Liu, R Sun, M Hong - arXiv preprint arXiv:1808.02941, 2018 - arxiv.org
This paper studies a class of adaptive gradient-based momentum algorithms that update the search directions and learning rates simultaneously using past gradients. This class, which …
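The class referred to maintains a momentum direction and a per-coordinate scaling, both built from past gradients; a minimal Adam-style instance, with the usual default constants assumed rather than quoted from the paper:

```python
import numpy as np

def adam_type_step(x, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of an Adam-type method at iteration t >= 1: m is the search
    direction, v the second-moment accumulator that adapts the learning
    rate coordinate-wise."""
    m = beta1 * m + (1 - beta1) * g        # direction from past gradients
    v = beta2 * v + (1 - beta2) * g**2     # adaptive scaling from past gradients
    m_hat = m / (1 - beta1**t)             # usual bias corrections
    v_hat = v / (1 - beta2**t)
    return x - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```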
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiable games often proceed by simultaneous …
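A toy illustration of the simultaneous updates mentioned, on the bilinear game f(x, y) = xy where player 1 minimizes over x and player 2 maximizes over y (the game and stepsize are chosen here purely for illustration):

```python
# Simultaneous gradient steps in a two-player differentiable game f(x, y) = x*y.
x, y, lr = 1.0, 1.0, 0.1
for _ in range(100):
    gx, gy = y, x                       # partial derivatives at the current point
    x, y = x - lr * gx, y + lr * gy     # both players update at the same time
# On this game, simultaneous play spirals away from the equilibrium (0, 0)
# instead of converging, a standard motivation for momentum-aware game dynamics.
```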
N Loizou, P Richtárik - Computational Optimization and Applications, 2020 - Springer
In this paper, we study several classes of stochastic optimization algorithms enriched with heavy ball momentum. Among the methods studied are stochastic gradient descent …
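The heavy-ball update in its classic two-iterate form; a sketch under illustrative parameters, with `grad` standing in for either a full or a stochastic gradient oracle:

```python
import numpy as np

def heavy_ball(grad, x0, alpha=0.05, beta=0.9, iters=500):
    """Heavy-ball iteration x_{k+1} = x_k - alpha*grad(x_k) + beta*(x_k - x_{k-1});
    plugging in a stochastic gradient gives the SGD-with-momentum variant
    studied in this line of work."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x, x_prev = x - alpha * grad(x) + beta * (x - x_prev), x
    return x
```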
R Xin, UA Khan - IEEE Transactions on Automatic Control, 2019 - ieeexplore.ieee.org
We study distributed optimization to minimize a sum of smooth and strongly convex functions. Recent work on this problem uses gradient tracking to achieve linear convergence …
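A toy sketch of the gradient-tracking mechanism the snippet mentions: each agent mixes iterates over the network and maintains a tracker estimating the average gradient. The mixing matrix, stepsize, and interfaces below are assumptions, not the paper's code:

```python
import numpy as np

def gradient_tracking(grads, x0, W, alpha=0.05, iters=300):
    """Distributed minimization of sum_i f_i over n agents.

    grads -- list of per-agent gradient functions for the f_i
    W     -- doubly stochastic mixing matrix of the communication graph
    """
    n = len(grads)
    x = np.tile(np.asarray(x0, dtype=float), (n, 1))     # one row per agent
    g = np.stack([grads[i](x[i]) for i in range(n)])
    y = g.copy()                                         # gradient trackers
    for _ in range(iters):
        x = W @ x - alpha * y                            # consensus + descent step
        g_new = np.stack([grads[i](x[i]) for i in range(n)])
        y = W @ y + g_new - g                            # y tracks the average gradient
        g = g_new
    return x.mean(axis=0)
```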
This is a handbook of simple proofs of the convergence of gradient and stochastic gradient descent type methods. We consider functions that are Lipschitz, smooth, convex, strongly …
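A representative single step from such proofs, for an L-smooth f and the gradient step x_{k+1} = x_k - α∇f(x_k) (a standard smoothness argument, not quoted from the handbook):

```latex
% Descent lemma applied to x_{k+1} = x_k - \alpha \nabla f(x_k):
f(x_{k+1}) \le f(x_k) - \alpha\left(1 - \tfrac{L\alpha}{2}\right)\|\nabla f(x_k)\|^2
% so any stepsize \alpha \le 1/L guarantees monotone decrease of f.
```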