Distributed optimization with arbitrary local solvers

C Ma, J Konečný, M Jaggi, V Smith… - Optimization Methods …, 2017 - Taylor & Francis
With the growth of data and the need for distributed optimization methods, solvers that work
well on a single machine must be redesigned to leverage distributed computation. Recent …
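
The framework described in this entry lets each machine run an arbitrary local solver on its own data shard, with only the resulting updates communicated and averaged. A minimal Python sketch of that round structure, assuming a least-squares loss and an SGD local solver; both choices, and all names here, are illustrative rather than the paper's exact subproblem:

    import numpy as np

    def distributed_round(w, shards, local_solver, lr=1.0):
        # One communication round: each worker runs its local solver on its
        # own shard (sequentially here; in parallel in practice) and the
        # driver averages the resulting local updates.
        deltas = [local_solver(w, X, y) for (X, y) in shards]
        return w + lr * np.mean(deltas, axis=0)

    def sgd_local_solver(w, X, y, steps=50, eta=0.01):
        # One possible local solver: a few SGD steps on the shard's
        # least-squares loss. Swapping in any other solver is the point
        # of the framework.
        w_local = w.copy()
        for _ in range(steps):
            i = np.random.randint(len(y))
            w_local -= eta * (X[i] @ w_local - y[i]) * X[i]
        return w_local - w  # communicate the change, not the iterate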

From server-based to client-based machine learning: A comprehensive survey

R Gu, C Niu, F Wu, G Chen, C Hu, C Lyu… - ACM Computing Surveys …, 2021 - dl.acm.org
In recent years, mobile devices have developed rapidly, gaining stronger
computation capability and larger storage space. Some of the computation-intensive …

Stochastic, distributed and federated optimization for machine learning

J Konečný - arXiv preprint arXiv:1707.01155, 2017 - arxiv.org
We study optimization algorithms for the finite-sum problems that frequently arise in machine
learning applications. First, we propose novel variants of stochastic gradient descent with a …
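
The SGD variants proposed in this line of work include variance-reduced methods that periodically take a full-gradient snapshot. A sketch in the spirit of SVRG/S2GD, where grad_i(w, i) is a user-supplied function returning the gradient of the i-th loss term:

    import numpy as np

    def svrg(w, grad_i, n, outer=10, inner=None, eta=0.1):
        # Variance-reduced SGD: each outer loop takes a full-gradient
        # snapshot mu; inner steps use a corrected stochastic gradient
        # whose variance shrinks as the iterates approach a solution.
        inner = inner or n
        for _ in range(outer):
            w_snap = w.copy()
            mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
            for _ in range(inner):
                i = np.random.randint(n)
                w = w - eta * (grad_i(w, i) - grad_i(w_snap, i) + mu)
        return w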

Efficient distributed hessian free algorithm for large-scale empirical risk minimization via accumulating sample strategy

M Jahani, X He, C Ma, A Mokhtari… - International …, 2020 - proceedings.mlr.press
In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE)
method in which the sample size gradually increases to quickly obtain a solution whose …
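
The key mechanism is to run Newton-CG on a growing subsample, so early iterations are cheap while later ones approach the full empirical risk. A hedged sketch of that loop; grad_fn, hess_vec_fn, the doubling schedule, and the CG budget are illustrative placeholders, not the paper's exact specification:

    import numpy as np
    from scipy.sparse.linalg import cg, LinearOperator

    def accumulating_newton_cg(w, X, y, grad_fn, hess_vec_fn,
                               n0=128, growth=2.0):
        # Newton-CG on a subsample whose size grows geometrically until
        # it covers the full training set.
        n, d = X.shape
        s = min(n0, n)
        while True:
            Xs, ys = X[:s], y[:s]
            g = grad_fn(w, Xs, ys)
            H = LinearOperator((d, d),
                               matvec=lambda v: hess_vec_fn(w, Xs, ys, v))
            step, _ = cg(H, -g, maxiter=50)  # truncated CG Newton step
            w = w + step
            if s == n:
                break
            s = min(int(growth * s), n)
        return w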

Preconditioned conjugate gradient methods in truncated newton frameworks for large-scale linear classification

CY Hsia, WL Chiang, CJ Lin - Asian Conference on Machine …, 2018 - proceedings.mlr.press
The truncated Newton method is one of the most effective optimization methods for large-scale
linear classification. The main computational task at each Newton iteration is to …
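
The main computational task referred to here is solving the Newton system H d = -g only approximately by (preconditioned) conjugate gradient. A minimal PCG sketch using a diagonal (Jacobi) preconditioner, one simple choice in this setting; hess_vec and M_inv_diag are supplied by the caller:

    import numpy as np

    def pcg(hess_vec, b, M_inv_diag, max_iter=100, tol=1e-8):
        # Preconditioned CG for H d = b, where hess_vec(v) computes H @ v
        # (matrix-free) and M_inv_diag holds the inverse diagonal of the
        # preconditioner.
        d = np.zeros_like(b)
        r = b.copy()
        z = M_inv_diag * r
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Hp = hess_vec(p)
            alpha = rz / (p @ Hp)
            d += alpha * p
            r -= alpha * Hp
            if np.linalg.norm(r) < tol:
                break
            z = M_inv_diag * r
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return d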

A class of parallel doubly stochastic algorithms for large-scale learning

A Mokhtari, A Koppel, M Takac, A Ribeiro - Journal of machine learning …, 2020 - jmlr.org
We consider learning problems over training sets in which both the number of training
examples and the dimension of the feature vectors are large. To solve these problems we …
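
"Doubly stochastic" here means sampling both a minibatch of examples and a random block of coordinates at each step, keeping the per-iteration cost small when n and d are both large. A sketch under a least-squares loss; the loss, step size, and batch/block sizes are illustrative:

    import numpy as np

    def doubly_stochastic_step(w, X, y, eta=0.1, batch=32, block=64):
        # Sample rows (examples) and columns (coordinates) independently,
        # then update only the sampled coordinate block.
        n, d = X.shape
        rows = np.random.choice(n, size=min(batch, n), replace=False)
        cols = np.random.choice(d, size=min(block, d), replace=False)
        # A real implementation would cache X @ w incrementally rather
        # than recompute this product over all features every step.
        residual = X[rows] @ w - y[rows]
        g_block = X[np.ix_(rows, cols)].T @ residual / len(rows)
        w[cols] -= eta * g_block
        return w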

Communication lower bounds for distributed convex optimization: Partition data on features

Z Chen, L Luo, Z Zhang - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org
Recently, there has been an increasing interest in designing distributed convex optimization
algorithms under the setting where the data matrix is partitioned on features. Algorithms …
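
Under a feature partition, worker k holds a column block X_k of the data matrix and the matching block w_k of the model; the per-iteration communication is the sum of the partial products X_k @ w_k. A sketch of that pattern, with the all-reduce simulated by a plain sum:

    import numpy as np

    def partial_margin(X_block, w_block):
        # Worker k's contribution: its column block of X times its block of w.
        return X_block @ w_block

    def full_margin(blocks):
        # The full margin X @ w is the sum of the partial products; in a
        # real system this sum would be a single all-reduce across workers.
        return sum(partial_margin(Xb, wb) for Xb, wb in blocks)

    # Toy check: splitting the columns in two reproduces X @ w.
    X, w = np.random.randn(5, 4), np.random.randn(4)
    blocks = [(X[:, :2], w[:2]), (X[:, 2:], w[2:])]
    assert np.allclose(full_margin(blocks), X @ w)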

[PDF] Grow Your Samples and Optimize Better via Distributed Newton CG and Accumulating Strategy

M Jahani, X He, C Ma, A Mokhtari, D Mudigere… - engineering.lehigh.edu
In this work, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE)
method in which the sample size gradually increases to quickly obtain a solution whose …

[PDF] Distributed Algorithms in Large-scaled Empirical Risk Minimization: Non-convexity, Adaptive Sampling, and Matrix-free Second-order Methods

X He - 2019 - core.ac.uk
Table-of-contents excerpt: Abstract; 1 Dual Free Adaptive Mini-batch SDCA for Empirical Risk Minimization; 1.1 Introduction; 1.1.1 Contributions; 1.1.2 …

Distributed Restarting NewtonCG Method for Large-Scale Empirical Risk Minimization

M Jahani, X He, C Ma, D Mudigere, A Mokhtari… - 2017 - openreview.net
In this paper, we propose a distributed damped Newton method in which the sample size
gradually increases to quickly obtain a solution whose empirical loss is under satisfactory …