A comprehensive survey on coded distributed computing: Fundamentals, challenges, and networking applications

JS Ng, WYB Lim, NC Luong, Z Xiong… - … Surveys & Tutorials, 2021 - ieeexplore.ieee.org
Distributed computing has become a common approach for large-scale computation tasks
due to benefits such as high reliability, scalability, computation speed, and cost …

Lagrange coded computing: Optimal design for resiliency, security, and privacy

Q Yu, S Li, N Raviv, SMM Kalan… - The 22nd …, 2019 - proceedings.mlr.press
We consider a scenario involving computations over a massive dataset stored distributedly
across multiple workers, which is at the core of distributed learning algorithms. We propose …

Short-dot: Computing large linear transforms distributedly using coded short dot products

S Dutta, V Cadambe, P Grover - Advances In Neural …, 2016 - proceedings.neurips.cc
Faced with saturation of Moore's law and increasing size and dimension of data, system
designers have increasingly resorted to parallel and distributed computing to reduce …

On the optimal recovery threshold of coded matrix multiplication

S Dutta, M Fahim, F Haddadpour… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
We provide novel coded computation strategies for distributed matrix-matrix products that
outperform the recent “Polynomial code” constructions in recovery threshold, i.e., the required …

Coded computation over heterogeneous clusters

A Reisizadeh, S Prakash, R Pedarsani… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
In large-scale distributed computing clusters, such as Amazon EC2, there are several types
of “system noise” that can result in major degradation of performance: system failures …

Coded sparse matrix multiplication

S Wang, J Liu, N Shroff - International Conference on …, 2018 - proceedings.mlr.press
In a large-scale and distributed matrix multiplication problem $C = A^{\intercal} B$, where
$C \in \mathbb{R}^{r \times t}$, the coded computation plays an important role to effectively …

Coded computing: Mitigating fundamental bottlenecks in large-scale distributed computing and machine learning

S Li, S Avestimehr - Foundations and Trends® in …, 2020 - nowpublishers.com
We introduce the concept of “coded computing”, a novel computing paradigm that utilizes
coding theory to effectively inject and leverage data/computation redundancy to mitigate …

A unified coded deep neural network training strategy based on generalized polydot codes

S Dutta, Z Bai, H Jeong, TM Low… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
This paper has two main contributions. First, we propose a novel coding technique,
Generalized PolyDot, for matrix-vector products that advances on existing techniques for …

Cross subspace alignment codes for coded distributed batch computation

Z Jia, SA Jafar - IEEE Transactions on Information Theory, 2021 - ieeexplore.ieee.org
The goal of coded distributed computation is to efficiently distribute a computation task, such
as matrix multiplication, N-linear computation, or multivariate polynomial evaluation, across …

A hierarchical incentive design toward motivating participation in coded federated learning

JS Ng, WYB Lim, Z Xiong, X Cao… - IEEE Journal on …, 2021 - ieeexplore.ieee.org
Federated Learning (FL) is a privacy-preserving collaborative learning approach that trains
artificial intelligence (AI) models without revealing local datasets of the FL workers. While FL …