A comprehensive survey on coded distributed computing: Fundamentals, challenges, and networking applications

JS Ng, WYB Lim, NC Luong, Z Xiong… - … Surveys & Tutorials, 2021 - ieeexplore.ieee.org
Distributed computing has become a common approach for large-scale computation tasks
due to benefits such as high reliability, scalability, computation speed, and cost …
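
The core mechanism surveyed here is erasure coding of the computation itself. A minimal sketch, assuming only a toy (3, 2) MDS code over row-blocks of a matrix (sizes and worker layout are illustrative, not any particular scheme from the survey):

```python
# Minimal sketch of straggler-tolerant coded computation (illustrative, not
# any specific scheme from the survey): a (3, 2) MDS code over row-blocks of
# A lets the master recover A @ x from ANY 2 of 3 worker results.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # toy matrix, split into 2 row-blocks
x = rng.standard_normal(3)

A1, A2 = A[:2], A[2:]                  # systematic blocks
workers = {0: A1, 1: A2, 2: A1 + A2}   # worker 2 holds the parity block

# Suppose worker 1 straggles; the master decodes from workers 0 and 2.
y0 = workers[0] @ x                        # = A1 @ x
y2 = workers[2] @ x                        # = (A1 + A2) @ x
decoded = np.concatenate([y0, y2 - y0])    # recover [A1 @ x, A2 @ x]

assert np.allclose(decoded, A @ x)
```

Any 2 of the 3 workers determine A @ x, so one straggler costs nothing beyond the extra parity computation.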

Cross subspace alignment codes for coded distributed batch computation

Z Jia, SA Jafar - IEEE Transactions on Information Theory, 2021 - ieeexplore.ieee.org
The goal of coded distributed computation is to efficiently distribute a computation task, such
as matrix multiplication, N-linear computation, or multivariate polynomial evaluation, across …
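
A much simpler relative of such codes, sketched below for flavor, is the plain polynomial code for distributed matrix multiplication: the blocks of A and B are encoded as matrix polynomials, each worker evaluates the product polynomial at one point, and any sufficiently many evaluations interpolate all block products. The sizes and evaluation points are illustrative assumptions; this is not the cross subspace alignment construction itself.

```python
# Sketch of polynomial-coded matrix multiplication (a simpler ancestor of the
# CSA codes in this paper): split A by rows and B by columns, encode with
# A(t) = A0 + A1*t and B(t) = B0 + B1*t**2, so C(t) = A(t) @ B(t) is a
# degree-3 matrix polynomial whose 4 coefficients are the blocks Ai @ Bj.
# Any 4 of the 5 worker evaluations suffice to interpolate them.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3)); B = rng.standard_normal((3, 4))
A0, A1 = A[:2], A[2:]; B0, B1 = B[:, :2], B[:, 2:]

pts = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # 5 workers, 1 may straggle
evals = {t: (A0 + A1 * t) @ (B0 + B1 * t**2) for t in pts}

fast = pts[[0, 1, 3, 4]]                         # any 4 responses will do
V = np.vander(fast, 4, increasing=True)          # Vandermonde in t
coeffs = np.einsum('ks,sij->kij', np.linalg.inv(V),
                   np.stack([evals[t] for t in fast]))
C00, C10, C01, C11 = coeffs                      # Ai @ Bj, ordered by degree

assert np.allclose(np.block([[C00, C01], [C10, C11]]), A @ B)
```

The decoder only needs the product polynomial's degree to stay low; CSA codes refine exactly this degree structure for batches.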

Coding for large-scale distributed machine learning

M Xiao, M Skoglund - Entropy, 2022 - mdpi.com
This article aims to give a comprehensive and rigorous review of the principles and recent
developments in coding for large-scale distributed machine learning (DML). With increasing …


Wireless MapReduce distributed computing

F Li, J Chen, Z Wang - IEEE Transactions on Information …, 2019 - ieeexplore.ieee.org
Motivated by mobile edge computing and wireless data centers, we study a wireless
distributed computing framework where the distributed nodes exchange information over a …

GCSA codes with noise alignment for secure coded multi-party batch matrix multiplication

Z Chen, Z Jia, Z Wang, SA Jafar - IEEE Journal on Selected …, 2021 - ieeexplore.ieee.org
A secure multi-party batch matrix multiplication problem (SMBMM) is considered, where the
goal is to allow a master to efficiently compute the pairwise products of two batches of …
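
A bare-bones way to see the noise-masking ingredient is a Shamir-style additive share: the master pads A with uniform noise before distributing it. The prime, matrix sizes, and one-sided secrecy model below are assumptions for illustration; the GCSA construction is far more general.

```python
# Sketch of noise-masked secure computation (a bare-bones Shamir-style share,
# not the GCSA construction): the master hides A with uniform random noise Z,
# workers multiply their share by public B, and interpolation at t = 0
# recovers A @ B mod p while no single worker learns anything about A.
import numpy as np

p = 65_521                                         # toy prime modulus
rng = np.random.default_rng(2)
A = rng.integers(0, p, (2, 2), dtype=np.int64)
B = rng.integers(0, p, (2, 2), dtype=np.int64)
Z = rng.integers(0, p, (2, 2), dtype=np.int64)     # one-time-pad style noise

shares = {t: (A + Z * t) % p for t in (1, 2)}      # degree-1 shares of A
results = {t: (S @ B) % p for t, S in shares.items()}

# Each result is (A @ B + Z @ B * t) mod p, linear in t.
# Lagrange interpolation at t = 0: f(0) = 2*f(1) - f(2) (mod p).
AB = (2 * results[1] - results[2]) % p
assert np.array_equal(AB, (A @ B) % p)
```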

Coded distributed computing with partial recovery

E Ozfatura, S Ulukus, D Gündüz - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Coded computation techniques provide robustness against straggling workers in distributed
computing. However, most of the existing schemes require exact provisioning of the …
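
The partial-recovery idea can be sketched without any coding at all: the master applies an update built from whichever per-partition gradients arrive, instead of waiting to decode the exact full gradient. A schematic under assumed illustrative choices (least-squares objective, 4 partitions, learning rate 0.05); the paper's coded construction refines this considerably.

```python
# Schematic of partial recovery (the general idea, not this paper's coded
# construction): rather than wait for every partition's gradient, the master
# updates with whichever partial gradients the fast workers return.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((12, 4)); y = rng.standard_normal(12)
w = np.zeros(4)
parts = np.array_split(np.arange(12), 4)           # 4 data partitions

def partial_grad(idx, w):
    """Least-squares gradient restricted to one partition."""
    return X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)

for step in range(50):
    arrived = rng.permutation(4)[:3]               # only 3 of 4 respond
    g = np.mean([partial_grad(parts[k], w) for k in arrived], axis=0)
    w -= 0.05 * g                                  # update with partial gradient

print("residual:", np.linalg.norm(X @ w - y))
```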

Straggler-aware distributed learning: Communication–computation latency trade-off

E Ozfatura, S Ulukus, D Gündüz - Entropy, 2020 - mdpi.com
When gradient descent (GD) is scaled to many parallel workers for large-scale machine
learning applications, its per-iteration computation time is limited by straggling workers …

ErasureHead: Distributed gradient descent without delays using approximate gradient coding

H Wang, Z Charles, D Papailiopoulos - arXiv preprint arXiv:1901.09671, 2019 - arxiv.org
We present ErasureHead, a new approach for distributed gradient descent (GD) that
mitigates system delays by employing approximate gradient coding. Gradient coded …
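
ErasureHead builds on fractional repetition codes with an approximate decoder. A hedged sketch of that flavor (the replication factor, straggler set, and gradient dimensions are illustrative choices, not the paper's parameters):

```python
# Sketch of approximate gradient coding with a fractional repetition code
# (the flavor ErasureHead uses; parameters are illustrative): each data block
# is replicated on 2 workers, any one copy recovers it exactly, and a block
# with no surviving copy is simply dropped from the gradient estimate.
import numpy as np

rng = np.random.default_rng(4)
block_grads = rng.standard_normal((3, 5))          # true per-block gradients
holders = {0: (0, 1), 1: (2, 3), 2: (4, 5)}        # block -> replica workers

stragglers = {0, 1, 4}                             # these workers never reply
recovered = [b for b, ws in holders.items()
             if any(w not in stragglers for w in ws)]

# Approximate decode: sum the blocks we got; missing blocks are erasures,
# so the estimate is biased but bounded, and no decoding delay is incurred.
g_hat = block_grads[recovered].sum(axis=0)
g_true = block_grads.sum(axis=0)
print(f"recovered {len(recovered)}/3 blocks,",
      "error:", np.linalg.norm(g_hat - g_true))
```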

Distributed gradient descent with coded partial gradient computations

E Ozfatura, S Ulukus, D Gündüz - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Coded computation techniques provide robustness against straggling servers in distributed
computing, with the following limitations: First, they increase decoding complexity. Second …

Storage-computation-communication tradeoff in distributed computing: Fundamental limits and complexity

Q Yan, S Yang, M Wigger - IEEE Transactions on Information …, 2022 - ieeexplore.ieee.org
Distributed computing has become one of the most important frameworks for handling
large computation tasks. In this paper, we propose a systematic construction of coded …
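
For orientation, the two-dimensional tradeoff that this storage-computation-communication analysis extends is the classical computation-communication result of Li et al.: with K nodes and computation load r (each file mapped r times), the optimal normalized shuffle load is L(r) = (1 - r/K) / r. A quick numeric table, assuming the standard normalization:

```python
# Classical computation-communication tradeoff (Li et al.) underlying this
# line of work: coded shuffling beats uncoded shuffling by a factor of r.
K = 10
for r in range(1, K + 1):
    coded, uncoded = (1 - r / K) / r, 1 - r / K
    print(f"r={r:2d}  coded L={coded:.3f}  uncoded L={uncoded:.3f}  gain={r}x")
```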