Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the …
The celebrated Brascamp-Lieb (BL) inequalities [BL76, Lie90], and their reverse form of Barthe [Bar98], are an important mathematical tool, unifying and generalizing numerous in …
Advancements in the field of high-performance scientific computing are necessary to address the most important challenges we face in the 21st century. From physical modeling …
Matrix factorizations are among the most important building blocks of scientific computing. However, state-of-the-art libraries are not communication-optimal, underutilizing current …
We expose a systematic approach for developing distributed-memory parallel matrix-matrix multiplication algorithms. The journey starts with a description of how matrices are …
The matricized-tensor times Khatri-Rao product (MTTKRP) computation is the typical bottleneck in algorithms for computing a CP decomposition of a tensor. In order to develop …
J Dongarra, L Grigori… - … Transactions of the …, 2020 - royalsocietypublishing.org
A number of features of today's high-performance computers make it challenging to exploit these machines fully for computational science. These include increasing core counts but …
Determining I/O lower bounds is a crucial step in obtaining communication-efficient parallel algorithms, both across the memory hierarchy and between processors. Current approaches …
Communication, i.e., moving data between levels of a memory hierarchy or between processors over a network, is much more expensive (in time or energy) than arithmetic …