Distributed estimation over a low-cost sensor network: A review of state-of-the-art

S He, HS Shin, S Xu, A Tsourdos - Information Fusion, 2020 - Elsevier
Proliferation of low-cost, lightweight, and power-efficient sensors and advances in networked
systems enable the employment of multiple sensors. Distributed estimation provides a …
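
The basic primitive behind the distributed estimators this review surveys is average consensus over a sensor graph. A minimal sketch, where the ring topology, mixing weights, and scalar measurement model are illustrative assumptions rather than anything from the paper:

```python
import numpy as np

# Hypothetical setup: 6 sensors on a ring, each holding a noisy scalar measurement.
rng = np.random.default_rng(0)
n = 6
x = 10.0 + rng.normal(0.0, 1.0, size=n)  # local measurements of a common quantity

# Doubly stochastic mixing matrix for a ring: average self with both neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

# Each gossip round replaces every node's estimate with a weighted neighborhood
# average; all estimates converge to the network-wide mean without a fusion center.
for _ in range(50):
    x = W @ x

print(x, x.mean())  # every entry is (numerically) the same global average
```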

Communication-efficient distributed deep learning: A comprehensive survey

Z Tang, S Shi, W Wang, B Li, X Chu - arXiv preprint arXiv:2003.06307, 2020 - arxiv.org
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
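
One family of techniques covered by communication-efficiency surveys like this is gradient sparsification with error feedback. The sketch below is a generic illustration under my own assumptions (the function name, the 1% density, and the random "gradient" are not from the survey):

```python
import numpy as np

def topk_with_error_feedback(grad, residual, k):
    """Transmit only the k largest-magnitude gradient entries; keep the
    unsent remainder in a local residual so it is not lost."""
    corrected = grad + residual            # add back previously unsent mass
    idx = np.argpartition(np.abs(corrected), -k)[-k:]
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]           # the message actually transmitted
    new_residual = corrected - sparse      # error kept for the next round
    return sparse, new_residual

# Usage: one simulated step on a random "gradient", sending ~1% of entries.
rng = np.random.default_rng(1)
g = rng.normal(size=1000)
r = np.zeros_like(g)
msg, r = topk_with_error_feedback(g, r, k=10)
```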

Fully decentralized multi-agent reinforcement learning with networked agents

K Zhang, Z Yang, H Liu, T Zhang… - … conference on machine …, 2018 - proceedings.mlr.press
We consider the fully decentralized multi-agent reinforcement learning (MARL) problem,
where the agents are connected via a time-varying and possibly sparse communication …
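
A core primitive in this line of work is a consensus step on value-function (critic) parameters after each local temporal-difference update, with agents observing a shared state but receiving private rewards. The linear critic, fully connected mixing matrix, and simulated experience below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, dim = 4, 5
theta = rng.normal(size=(n_agents, dim))           # per-agent linear critic weights
W = np.full((n_agents, n_agents), 1.0 / n_agents)  # mixing matrix (assumed complete graph)

def local_td_update(theta_i, phi, phi_next, reward, gamma=0.99, lr=0.1):
    """One TD(0) step on a linear value estimate V(s) = phi @ theta_i."""
    td_error = reward + gamma * phi_next @ theta_i - phi @ theta_i
    return theta_i + lr * td_error * phi

for _ in range(100):
    phi, phi_next = rng.normal(size=dim), rng.normal(size=dim)  # shared state features
    rewards = rng.normal(size=n_agents)                         # private per-agent rewards
    # Each agent updates its critic on its own experience...
    theta = np.stack([local_td_update(t, phi, phi_next, rewards[k])
                      for k, t in enumerate(theta)])
    # ...then mixes critic parameters with its neighbors.
    theta = W @ theta
```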

Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent

X Lian, C Zhang, H Zhang, CJ Hsieh… - Advances in neural …, 2017 - proceedings.neurips.cc
Most distributed machine learning systems nowadays, including TensorFlow and CNTK, are
built in a centralized fashion. One bottleneck of centralized algorithms lies in the high …
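
The decentralized update studied in this paper (D-PSGD) interleaves a gossip-averaging step with a local stochastic gradient step, so no worker talks to a central node. A minimal sketch on a toy least-squares problem, where the ring mixing matrix and the per-worker objectives are my illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, lr = 5, 3, 0.1
X = rng.normal(size=(n, d))            # row i = model held by worker i
targets = rng.normal(size=(n, d))      # each worker's private data (toy)

# Ring mixing matrix: each worker averages itself with its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i], W[i, (i - 1) % n], W[i, (i + 1) % n] = 0.5, 0.25, 0.25

for _ in range(200):
    G = X - targets                    # local gradients of 0.5*||x - target_i||^2
    X = W @ X - lr * G                 # D-PSGD: mix with neighbors, then step

# The rows cluster together near targets.mean(axis=0), the minimizer of the
# average objective, up to the usual constant-stepsize error.
```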

Network topology and communication-computation tradeoffs in decentralized optimization

A Nedić, A Olshevsky, MG Rabbat - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
In decentralized optimization, nodes cooperate to minimize an overall objective function that
is the sum (or average) of per-node private objective functions. Algorithms interleave local …
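
The mixing matrix that these algorithms interleave with local gradient steps can be built from the topology alone, for instance via the standard Metropolis-Hastings construction sketched below (the example edge list is mine); the spectral gap of the resulting W is what drives the communication-computation tradeoff the paper analyzes:

```python
import numpy as np

def metropolis_weights(n, edges):
    """Symmetric doubly stochastic mixing matrix for an undirected graph:
    W[i, j] = 1 / (1 + max(deg_i, deg_j)) on edges; self-weight fills the rest."""
    deg = np.zeros(n, dtype=int)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0 / (1 + max(deg[i], deg[j]))
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))
    return W

W = metropolis_weights(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
# Second-largest eigenvalue magnitude: the closer to 1, the slower consensus mixes.
print(sorted(np.abs(np.linalg.eigvals(W)))[-2])
```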

Asynchronous decentralized parallel stochastic gradient descent

X Lian, W Zhang, C Zhang, J Liu - … Conference on Machine …, 2018 - proceedings.mlr.press
Most commonly used distributed machine learning systems are either synchronous or
centralized asynchronous. Synchronous algorithms like AllReduce-SGD perform poorly in a …
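
In the asynchronous decentralized scheme this paper studies (AD-PSGD), a worker applies a gradient computed on its current, possibly stale model and averages its model with one randomly chosen neighbor, with no global barrier. The serialized simulation below is a minimal sketch; the ring topology and toy objective are my assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, lr = 5, 3, 0.05
X = rng.normal(size=(n, d))            # row i = model held by worker i
targets = rng.normal(size=(n, d))      # each worker's private data (toy)
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # ring (assumption)

for _ in range(2000):
    i = rng.integers(n)                # a worker finishes a gradient...
    g = X[i] - targets[i]              # ...evaluated on its current model
    X[i] -= lr * g                     # local SGD step, no synchronization barrier
    j = rng.choice(neighbors[i])       # pairwise gossip with one random neighbor
    X[i] = X[j] = 0.5 * (X[i] + X[j])
```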

Achieving geometric convergence for distributed optimization over time-varying graphs

A Nedic, A Olshevsky, W Shi - SIAM Journal on Optimization, 2017 - SIAM
This paper considers the problem of distributed optimization over time-varying graphs. For
the case of undirected graphs, we introduce a distributed algorithm, referred to as DIGing …
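
The DIGing recursion combines time-varying gossip with a gradient-tracking variable y_i that estimates the network-average gradient. Up to notational conventions, with mixing matrices W(k) and stepsize α, the per-node update is:

```latex
x_i^{k+1} = \sum_{j} [W(k)]_{ij}\, x_j^{k} - \alpha\, y_i^{k},
\qquad
y_i^{k+1} = \sum_{j} [W(k)]_{ij}\, y_j^{k} + \nabla f_i\big(x_i^{k+1}\big) - \nabla f_i\big(x_i^{k}\big),
\qquad
y_i^{0} = \nabla f_i\big(x_i^{0}\big).
```

Because y_i tracks the average gradient, a constant stepsize suffices for the geometric (linear) convergence the title refers to, where plain decentralized gradient descent would need a diminishing one.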

Harnessing smoothness to accelerate distributed optimization

G Qu, N Li - IEEE Transactions on Control of Network Systems, 2017 - ieeexplore.ieee.org
There has been a growing effort in studying the distributed optimization problem over a
network. The objective is to optimize a global function formed by a sum of local functions …
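
Qu and Li's method is, like DIGing above, a gradient-tracking scheme: each node mixes both its iterate and an auxiliary variable that tracks the average gradient. A minimal runnable sketch on quadratics, where the ring graph, the per-node objectives, and the constant stepsize are my illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, alpha = 5, 3, 0.2
A = rng.normal(size=(n, d))            # node i's objective: 0.5*||x - A[i]||^2
X = np.zeros((n, d))

W = np.zeros((n, n))                   # ring mixing matrix
for i in range(n):
    W[i, i], W[i, (i - 1) % n], W[i, (i + 1) % n] = 0.5, 0.25, 0.25

grad = lambda Z: Z - A                 # per-node gradients, stacked row-wise
Y = grad(X)                            # tracker initialized to local gradients

for _ in range(100):
    X_new = W @ X - alpha * Y          # mix iterates, step along tracked gradient
    Y = W @ Y + grad(X_new) - grad(X)  # update tracker with gradient differences
    X = X_new

# Every row converges to A.mean(axis=0), the exact minimizer of the average
# objective, despite the constant stepsize (unlike plain decentralized GD).
```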

D²: Decentralized Training over Decentralized Data

H Tang, X Lian, M Yan, C Zhang… - … Conference on Machine …, 2018 - proceedings.mlr.press
While training a machine learning model using multiple workers, each of which collects data
from its own data source, it would be useful when the data collected from different workers …
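
As I recall it, D² modifies D-PSGD with a difference-of-gradients correction that cancels the cross-worker data heterogeneity (the "outer variance"). Up to matrix-layout conventions, stacking the worker models in X_t and their stochastic gradients in G(X_t; ξ_t), the recursion is roughly:

```latex
X_{t+1} = W\Big( 2X_t - X_{t-1} - \gamma\,\big( G(X_t;\xi_t) - G(X_{t-1};\xi_{t-1}) \big) \Big),
\qquad
X_1 = W\big( X_0 - \gamma\, G(X_0;\xi_0) \big).
```

The extrapolation through the previous iterate and gradient is what removes the dependence on how different the workers' data distributions are, which plain D-PSGD pays for in its convergence rate.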

Minibatch vs Local SGD for heterogeneous distributed learning

BE Woodworth, KK Patel… - Advances in Neural …, 2020 - proceedings.neurips.cc
We analyze Local SGD (aka parallel or federated SGD) and Minibatch SGD in the
heterogeneous distributed setting, where each machine has access to stochastic gradient …
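
The two baselines compared here differ only in where the averaging happens: Minibatch SGD evaluates all of a round's gradients at one shared iterate and takes a single large-batch step, while Local SGD lets each machine take several sequential steps before iterates are averaged. A minimal sketch of one communication round of each, with the quadratic objectives, noise level, and step counts as my illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
M, d, lr, H = 4, 3, 0.1, 8             # machines, dimension, stepsize, grads per round
targets = rng.normal(size=(M, d))      # machine m's data mean (heterogeneous)

def stoch_grad(x, m):
    """Stochastic gradient of machine m's local objective 0.5*||x - targets[m]||^2."""
    return x - targets[m] + 0.1 * rng.normal(size=d)

# Minibatch SGD round: all M*H stochastic gradients are evaluated at the SAME
# shared iterate, averaged, and used for one large-batch step.
x_mb = np.zeros(d)
g = np.mean([stoch_grad(x_mb, m) for m in range(M) for _ in range(H)], axis=0)
x_mb -= lr * g

# Local SGD round: each machine takes H sequential steps on its own iterate,
# and only the final iterates are averaged (one communication, same budget).
x_loc = np.zeros((M, d))
for m in range(M):
    for _ in range(H):
        x_loc[m] -= lr * stoch_grad(x_loc[m], m)
x_avg = x_loc.mean(axis=0)
```

With heterogeneous targets, the machines' local steps drift toward their own minimizers between communications, which is exactly the effect the paper's comparison quantifies.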