This paper considers the problem of distributed optimization over time-varying graphs. For the case of undirected graphs, we introduce a distributed algorithm, referred to as DIGing …
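The snippet names DIGing but not its update rule. Below is a minimal Python sketch of the kind of gradient-tracking iteration DIGing is built on, using a fixed doubly stochastic mixing matrix W and quadratic placeholder objectives for simplicity (the paper itself treats time-varying graphs); the ring topology, step size, and local problems are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Minimal gradient-tracking sketch with n agents minimizing (1/n) * sum_i f_i(x),
# here f_i(x) = 0.5*||A_i x - b_i||^2 as a placeholder local objective.
rng = np.random.default_rng(0)
n, d = 5, 3
A = [rng.standard_normal((10, d)) for _ in range(n)]
b = [rng.standard_normal(10) for _ in range(n)]

def grad(i, x):
    return A[i].T @ (A[i] @ x - b[i])

# Fixed doubly stochastic mixing matrix: lazy averaging over a ring graph.
W = np.eye(n) * 0.5
for i in range(n):
    W[i, (i - 1) % n] += 0.25
    W[i, (i + 1) % n] += 0.25

alpha = 0.01
x = np.zeros((n, d))                                 # one local iterate per agent
y = np.array([grad(i, x[i]) for i in range(n)])      # gradient trackers
g_old = y.copy()

for _ in range(500):
    x_new = W @ x - alpha * y                        # consensus step minus tracked gradient
    g_new = np.array([grad(i, x_new[i]) for i in range(n)])
    y = W @ y + g_new - g_old                        # track the network-average gradient
    x, g_old = x_new, g_new

print(np.linalg.norm(x - x.mean(axis=0)))            # disagreement across agents shrinks
```

The point of the tracker y_i is that it converges to the average gradient across agents, so each agent effectively steps against the global objective while only communicating with neighbors.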
Incremental gradient (IG) methods, such as stochastic gradient descent and its variants, are commonly used for large-scale optimization in machine learning. Despite the sustained effort …
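As a point of reference for the incremental-gradient template this snippet refers to, here is a minimal sketch of a cyclic incremental gradient loop on a synthetic least-squares problem; the constant step size and the cyclic ordering are illustrative choices (sampling the component index uniformly at random instead would give plain SGD), not the schemes analyzed in the paper.

```python
import numpy as np

# Cyclic incremental gradient on f(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2:
# one component gradient per update, constant step size.
rng = np.random.default_rng(1)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x = np.zeros(d)
step = 0.01
for epoch in range(50):
    for i in range(n):                      # sweep through the components in order
        g_i = (A[i] @ x - b[i]) * A[i]
        x -= step * g_i

print(0.5 * np.mean((A @ x - b) ** 2))      # final average loss
```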
X Jiang, SU Stich - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The recently proposed stochastic Polyak stepsize (SPS) and stochastic line-search (SLS) for SGD have shown remarkable effectiveness when training over-parameterized models …
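The snippet mentions the stochastic Polyak stepsize without stating its form. The sketch below uses an SPS-style capped step, gamma = min((f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_max), with the per-sample optimal values f_i* assumed to be zero, which is plausible for nonnegative losses on interpolating models; the constants c and gamma_max and the logistic test problem are assumptions for illustration, not values from the paper.

```python
import numpy as np

# SPS-style adaptive step on a per-sample logistic loss, with f_i* assumed 0.
rng = np.random.default_rng(2)
n, d = 500, 20
A = rng.standard_normal((n, d))
y = np.sign(A @ rng.standard_normal(d))     # separable synthetic labels

def loss_and_grad(i, x):
    z = y[i] * (A[i] @ x)
    loss = np.log1p(np.exp(-z))
    grad = -y[i] * A[i] / (1.0 + np.exp(z))
    return loss, grad

x = np.zeros(d)
c, gamma_max, eps = 0.5, 1.0, 1e-12
for t in range(20000):
    i = rng.integers(n)
    f_i, g_i = loss_and_grad(i, x)
    gamma = min(f_i / (c * (g_i @ g_i) + eps), gamma_max)   # capped Polyak step
    x -= gamma * g_i

print(np.mean(np.log1p(np.exp(-y * (A @ x)))))              # final average logistic loss
```

The appeal of this kind of step is that it adapts to the local loss scale without a hand-tuned learning-rate schedule.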
Inventors have long dreamed of creating machines that think. Ancient Greek myths tell of intelligent objects, such as animated statues of human beings and tables that arrive full of …
We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG …
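A minimal sketch of a SAG-style update, assuming a least-squares objective and an illustrative step size: a table stores the most recently evaluated gradient of each component, one entry is refreshed per iteration, and the iterate moves along the average of the table.

```python
import numpy as np

# SAG-style update: refresh one stored component gradient per iteration and
# step along the running average of all stored gradients.
rng = np.random.default_rng(3)
n, d = 300, 15
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(i, x):
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
table = np.zeros((n, d))                    # last gradient seen for each component
running_sum = np.zeros(d)                   # sum of the table's rows
step = 0.05 / np.max(np.sum(A ** 2, axis=1))   # ~ a fraction of 1/L_max, illustrative

for t in range(50 * n):
    i = rng.integers(n)
    g = grad_i(i, x)
    running_sum += g - table[i]             # replace the old stored gradient
    table[i] = g
    x -= step * running_sum / n

print(0.5 * np.mean((A @ x - b) ** 2))
```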
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have …
This book, as the title suggests, is about first-order methods, namely, methods that exploit information on values and gradients/subgradients (but not Hessians) of the functions …
RH Byrd, SL Hansen, J Nocedal, Y Singer - SIAM Journal on Optimization, 2016 - SIAM
The question of how to incorporate curvature information into stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating …
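To make the difficulty concrete, the sketch below shows one common way to stabilize quasi-Newton updating with noisy gradients: form each curvature pair from gradient differences evaluated on the same mini-batch, and apply an L-BFGS two-loop recursion to the current stochastic gradient. This is a simplified gradient-difference variant, not the Hessian-vector-product construction of Byrd et al.; the batch size, memory length, update interval, and test problem are assumptions for illustration.

```python
import numpy as np
from collections import deque

# Stochastic quasi-Newton sketch: (s, y) pairs use the SAME mini-batch for both
# gradient evaluations, and a two-loop recursion preconditions the stochastic gradient.
rng = np.random.default_rng(4)
n, d = 2000, 30
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def batch_grad(x, idx):
    return A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)

def two_loop(g, pairs):
    """Apply the inverse-Hessian approximation implied by the (s, y, rho) pairs to g."""
    if not pairs:
        return g
    q, alphas = g.copy(), []
    for s, y, rho in reversed(pairs):       # newest pair first
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    s, y, _ = pairs[-1]
    q *= (s @ y) / (y @ y)                  # initial inverse-Hessian scaling
    for (s, y, rho), a in zip(pairs, reversed(alphas)):
        q += (a - rho * (y @ q)) * s
    return q

x, x_prev = np.zeros(d), np.zeros(d)
pairs = deque(maxlen=10)                    # limited quasi-Newton memory
step, batch, update_every = 0.05, 64, 20

for t in range(1, 3001):
    idx = rng.integers(n, size=batch)
    g = batch_grad(x, idx)
    x_new = x - step * two_loop(g, list(pairs))
    if t % update_every == 0:
        s = x_new - x_prev
        y = batch_grad(x_new, idx) - batch_grad(x_prev, idx)   # same mini-batch
        if s @ y > 1e-10:                   # keep only positive-curvature pairs
            pairs.append((s, y, 1.0 / (s @ y)))
        x_prev = x_new.copy()
    x = x_new

print(0.5 * np.mean((A @ x - b) ** 2))
```

Using the same mini-batch for both gradients in a pair keeps the sampling noise from corrupting the curvature estimate, which is the core issue the snippet alludes to.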
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods …