Coding for large-scale distributed machine learning

M Xiao, M Skoglund - Entropy, 2022 - mdpi.com
This article aims to give a comprehensive and rigorous review of the principles and recent
development of coding for large-scale distributed machine learning (DML). With increasing …

An adaptive distributed source coding design for distributed learning

N Zhang, M Tao - 2021 13th International Conference on …, 2021 - ieeexplore.ieee.org
A major bottleneck in distributed learning is the communication overhead of exchanging
intermediate model update parameters between the worker nodes and the parameter …
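
To make the communication bottleneck concrete, the sketch below applies a generic uniform quantizer to a worker's model update before it is sent to the parameter server. This is an illustrative compression baseline only, not the adaptive distributed source coding design proposed in the paper; the function names, bit width, and data are hypothetical.

```python
import numpy as np

def quantize(update, num_bits=4):
    """Uniformly quantize an update vector to num_bits per entry.

    Returns integer codes plus the (scale, offset) needed to dequantize.
    Generic baseline only, not the paper's adaptive source coding scheme.
    """
    lo, hi = update.min(), update.max()
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((update - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize(codes, scale, lo):
    return codes.astype(np.float32) * scale + lo

# Worker side: compress a local update before sending it to the server.
rng = np.random.default_rng(0)
update = rng.normal(size=1000).astype(np.float32)
codes, scale, lo = quantize(update, num_bits=4)

# Server side: reconstruct an approximation of the update.
recovered = dequantize(codes, scale, lo)
print("max abs error:", np.abs(update - recovered).max())
```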

Compressed coded distributed computing

AR Elkordy, S Li, MA Maddah-Ali… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Communication overhead is one of the major performance bottlenecks in large-scale
distributed computing systems, in particular for machine learning applications …

Two-Stage Coded Distributed Learning: A Dynamic Partial Gradient Coding Perspective

X Wang, X Zhong, J Ning, T Yang… - 2023 IEEE 43rd …, 2023 - ieeexplore.ieee.org
Distributed learning has been widely adopted to train a global model from local data.
However, its performance can be severely affected by stragglers. Recently, some research …
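
For context, the sketch below shows the classic static gradient coding construction that dynamic partial gradient coding builds on: three workers each send a coded combination of two partial gradients, and the full gradient is recoverable from any two responses, so one straggler can be tolerated. The encoding matrix and decoding coefficients are the textbook one-straggler example, not the two-stage scheme of this paper.

```python
import numpy as np

# Encoding matrix B from the classic 3-worker, 1-straggler gradient coding
# construction: worker i sends the combination B[i] of the partial gradients.
B = np.array([[0.5, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [0.5, 0.0, 1.0]])

# Decoding coefficients for each pattern of two surviving workers.
DECODE = {
    (0, 1): np.array([2.0, -1.0]),   # workers 1 and 2 respond
    (0, 2): np.array([1.0, 1.0]),    # workers 1 and 3 respond
    (1, 2): np.array([1.0, 2.0]),    # workers 2 and 3 respond
}

rng = np.random.default_rng(0)
partial_grads = rng.normal(size=(3, 5))   # g1, g2, g3 (toy dimension 5)
full_grad = partial_grads.sum(axis=0)     # what the master wants

coded = B @ partial_grads                 # what each worker would send

# Any two responses suffice; the third worker may straggle.
for survivors, coeffs in DECODE.items():
    recovered = coeffs @ coded[list(survivors)]
    assert np.allclose(recovered, full_grad)
print("full gradient recovered from every 2-of-3 subset of workers")
```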

Tree gradient coding

A Reisizadeh, S Prakash, R Pedarsani… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Scaling up distributed machine learning systems faces two major bottlenecks: delays due to
stragglers and limited communication bandwidth. Recently, a number of coding-theoretic …

Speeding up distributed machine learning using codes

K Lee, M Lam, R Pedarsani… - IEEE Transactions …, 2017 - ieeexplore.ieee.org
Codes are widely used in many engineering applications to offer robustness against noise.
In large-scale systems, there are several types of noise that can affect the performance of …
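
A minimal sketch of the coded computation idea behind this line of work: with a (3, 2) parity-style code over a row-split data matrix, the master can recover A @ x from any two of three worker results and ignore the slowest worker. The split, code, and names below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))        # data matrix, split row-wise into two blocks
x = rng.normal(size=6)
A1, A2 = A[:2], A[2:]

# (3, 2) MDS-style encoding: three workers hold A1, A2, and A1 + A2.
coded_blocks = [A1, A2, A1 + A2]
results = [block @ x for block in coded_blocks]   # each worker's task

# Any 2 of the 3 results determine A @ x; one straggler can be ignored.
def decode(survivors):
    got = dict(zip(survivors, (results[i] for i in survivors)))
    if 0 in got and 1 in got:
        return np.concatenate([got[0], got[1]])
    if 0 in got:                                   # A2 x = (A1 + A2) x - A1 x
        return np.concatenate([got[0], got[2] - got[0]])
    return np.concatenate([got[2] - got[1], got[1]])

for survivors in combinations(range(3), 2):
    assert np.allclose(decode(survivors), A @ x)
print("A @ x recovered from every 2-of-3 subset of coded results")
```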

Huffman coding based encoding techniques for fast distributed deep learning

RR Gajjala, S Banchhor, AM Abdelmoniem… - Proceedings of the 1st …, 2020 - dl.acm.org
Distributed stochastic algorithms equipped with gradient compression techniques, such as
codebook quantization, are becoming increasingly popular and are considered state-of-the-art …
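
To illustrate the entropy-coding step such techniques rely on, the sketch below builds a Huffman code over coarsely quantized gradient values and reports the resulting bits per entry. The quantizer and codebook here are toy assumptions, not the encoding techniques proposed in the paper.

```python
import heapq
from collections import Counter
import numpy as np

def huffman_code(symbols):
    """Build a Huffman code (symbol -> bitstring) from a list of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # degenerate single-symbol case
        return {next(iter(counts)): "0"}
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Quantize a gradient to a small codebook, then entropy-code the indices.
rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)
levels = np.round(grad * 2).astype(int)        # crude 0.5-step codebook (toy)
code = huffman_code(levels.tolist())
encoded_bits = sum(len(code[v]) for v in levels.tolist())
print(f"{encoded_bits / levels.size:.2f} bits/entry vs 32 for float32")
```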

Lightweight projective derivative codes for compressed asynchronous gradient descent

PJ Soto, I Ilmer, H Guan, J Li - International Conference on …, 2022 - proceedings.mlr.press
Coded distributed computation has become common practice for performing gradient
descent on large datasets to mitigate stragglers and other faults. This paper proposes a …

Optimal incentive and load design for distributed coded machine learning

N Ding, Z Fang, L Duan, J Huang - IEEE Journal on Selected …, 2021 - ieeexplore.ieee.org
A distributed machine learning platform needs to recruit many heterogeneous worker nodes
to finish computation simultaneously. As a result, the overall performance may be degraded …

Design and Optimization of Hierarchical Gradient Coding for Distributed Learning at Edge Devices

W Tang, J Li, L Chen, X Chen - arXiv preprint arXiv:2406.10831, 2024 - arxiv.org
Edge computing has recently emerged as a promising paradigm to boost the performance of
distributed learning by leveraging the distributed resources at edge nodes. Architecturally …
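
A minimal sketch of the two-level (worker to edge to cloud) aggregation topology that hierarchical gradient coding builds on; the coding layer itself is omitted here, and the group sizes and names are illustrative assumptions rather than the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
# Two edge nodes, each serving three workers (a toy hierarchy).
groups = {
    "edge_0": [rng.normal(size=dim) for _ in range(3)],
    "edge_1": [rng.normal(size=dim) for _ in range(3)],
}

# Level 1: each edge node aggregates the gradients of its own workers.
edge_partials = {edge: np.sum(grads, axis=0) for edge, grads in groups.items()}

# Level 2: the cloud sums the edge partials to obtain the global gradient.
global_grad = np.sum(list(edge_partials.values()), axis=0)

# Sanity check: hierarchical aggregation equals flat aggregation.
flat = np.sum([g for grads in groups.values() for g in grads], axis=0)
assert np.allclose(global_grad, flat)
print("two-level aggregation matches flat aggregation")
```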