We consider a scenario involving computations over a massive dataset that is stored in a distributed fashion across multiple workers, a setting at the core of distributed learning algorithms. We propose …
J Jiang, S Gan, Y Liu, F Wang, G Alonso… - Proceedings of the …, 2021 - dl.acm.org
The appeal of serverless (FaaS) has triggered growing interest in how to use it in data-intensive applications such as ETL, query processing, or machine learning (ML). Several …
Faced with the saturation of Moore's law and the increasing size and dimensionality of data, system designers have increasingly resorted to parallel and distributed computing to reduce …
S Dutta, M Fahim, F Haddadpour… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
We provide novel coded computation strategies for distributed matrix-matrix products that outperform the recent "Polynomial code" constructions in recovery threshold, i.e., the required …
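To make the "Polynomial code" baseline mentioned in this snippet concrete, here is a minimal NumPy sketch of that style of coded matrix multiplication: each worker multiplies polynomial-encoded blocks of A and B, and the master interpolates the product blocks once any m·n workers respond. The block counts, evaluation points, and simulated straggler pattern are illustrative assumptions; this is the baseline construction, not the improved codes proposed in the cited paper.

```python
import numpy as np

m, n = 2, 2                 # split A into m row blocks, B into n column blocks
num_workers = 6             # more workers than the recovery threshold m*n = 4
rng = np.random.default_rng(0)

A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 4))
A_blocks = np.split(A, m, axis=0)    # A_0, A_1 (row blocks)
B_blocks = np.split(B, n, axis=1)    # B_0, B_1 (column blocks)

# Encoding: worker at point x computes A~(x) B~(x), where
# A~(x) = sum_j A_j x^j and B~(x) = sum_k B_k x^(k*m); the product is a
# degree-(m*n - 1) matrix polynomial whose coefficients are the blocks A_j B_k.
xs = np.arange(1, num_workers + 1, dtype=float)

def worker_result(x):
    A_enc = sum(A_blocks[j] * x**j for j in range(m))
    B_enc = sum(B_blocks[k] * x**(k * m) for k in range(n))
    return A_enc @ B_enc

# Suppose only the first m*n workers respond (the rest are stragglers).
survivors = xs[: m * n]
results = np.stack([worker_result(x) for x in survivors])

# Decoding: interpolate the matrix polynomial from m*n evaluations by
# solving a Vandermonde system, then read off the coefficient blocks.
V = np.vander(survivors, N=m * n, increasing=True)
coeffs = np.linalg.solve(V, results.reshape(m * n, -1))
coeffs = coeffs.reshape(m * n, *results.shape[1:])

C = np.block([[coeffs[j + k * m] for k in range(n)] for j in range(m)])
assert np.allclose(C, A @ B)
```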
In large-scale distributed computing clusters, such as Amazon EC2, there are several types of “system noise” that can result in major degradation of performance: system failures …
S Prakash, S Dhakal, MR Akdeniz… - IEEE Journal on …, 2020 - ieeexplore.ieee.org
Federated learning enables training a global model from data located at the client nodes, without data sharing and moving client data to a centralized server. Performance of …
Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic …
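As a rough illustration of the gradient coding setup this snippet refers to, the sketch below uses the classic 3-worker cyclic-assignment example: each worker transmits one linear combination of two partial gradients, and the full gradient sum is recoverable from any two responding workers. The encoding matrix and gradient dimension are illustrative assumptions, not the cyclic-code construction of the cited paper.

```python
import itertools
import numpy as np

n, s = 3, 1                              # n workers, tolerate s stragglers
d = 5                                    # gradient dimension (assumed)
rng = np.random.default_rng(1)
partial_grads = rng.standard_normal((n, d))   # g_1, g_2, g_3, one per partition
full_grad = partial_grads.sum(axis=0)

# Encoding matrix B: row i is the combination worker i sends; worker i only
# touches partitions {i, i+1 mod 3} (cyclic assignment of s+1 partitions each).
B = np.array([
    [0.5, 1.0,  0.0],
    [0.0, 1.0, -1.0],
    [0.5, 0.0,  1.0],
])
worker_msgs = B @ partial_grads          # what each worker would transmit

# Decoding: for any n - s responding workers S, find coefficients a with
# a^T B_S = (1, ..., 1); then a^T (messages of S) equals the full gradient.
for survivors in itertools.combinations(range(n), n - s):
    B_S = B[list(survivors), :]
    a, *_ = np.linalg.lstsq(B_S.T, np.ones(n), rcond=None)
    decoded = a @ worker_msgs[list(survivors)]
    assert np.allclose(decoded, full_grad)
```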
Due to the large size of the training data, distributed learning approaches such as federated learning have gained attention recently. However, the convergence rate of distributed …
Distributed Stochastic Gradient Descent (SGD), when run in a synchronous manner, suffers from delays in waiting for the slowest learners (stragglers). Asynchronous methods …
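A small simulation makes the straggler effect in this snippet concrete: in synchronous SGD the per-iteration time is the slowest worker's delay, while dropping the k slowest responses (one common mitigation, not necessarily the cited paper's method) shortens the wait at the cost of some gradient information. The delay model below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
num_workers, num_iters, k_drop = 20, 1000, 2

# Exponential compute delays, with worker 0 persistently slow (assumed model).
delays = rng.exponential(scale=1.0, size=(num_iters, num_workers))
delays[:, 0] *= 5.0

wait_full = delays.max(axis=1)                          # wait for all workers
wait_drop = np.sort(delays, axis=1)[:, -(k_drop + 1)]   # wait for fastest n - k

print(f"mean iteration time, full sync     : {wait_full.mean():.2f}")
print(f"mean iteration time, drop {k_drop} slowest: {wait_drop.mean():.2f}")
```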