Large matrices arise in many machine learning and data analysis applications, including as representations of datasets, graphs, model weights, and first and second-order derivatives …
We study the problem of communication-efficient distributed vector mean estimation, which is a commonly used subroutine in distributed optimization and Federated Learning (FL) …
We generalize the leverage score sampling sketch for $\ell_2 $-subspace embeddings, to accommodate sampling subsets of the transformed data, so that the sketching approach is …
Gradient coding is a method for mitigating straggling servers in a centralized computing network that uses erasure-coding techniques to distributively carry out first-order …
Matrices and tensors are amongst the most common tools to represent and exploit information. Some sources produce large quantities of data, and analyzing the information …
L Grigori, E Timsit - arXiv preprint arXiv:2405.10923, 2024 - arxiv.org
This paper introduces a randomized Householder QR factorization (RHQR). This factorization can be used to obtain a well conditioned basis of a set of vectors and thus can …
With the advent of massive datasets, distributed techniques for processing information and carrying out computations are expected to enable exceptional possibilities for engineering …