[PDF] Reining in the outliers in Map-Reduce clusters using Mantri

G Ananthanarayanan, S Kandula… - … USENIX Symposium on …, 2010 - usenix.org
Experience from an operational Map-Reduce cluster reveals that outliers significantly
prolong job completion. The causes for outliers include run-time contention for processor …

Improving MapReduce performance using smart speculative execution strategy

Q Chen, C Liu, Z Xiao - IEEE Transactions on Computers, 2013 - ieeexplore.ieee.org
MapReduce is a widely used parallel computing framework for large scale data processing.
The two major performance metrics in MapReduce are job execution time and cluster …

Simulating MPI applications: the SMPI approach

A Degomme, A Legrand… - … on Parallel and …, 2017 - ieeexplore.ieee.org
This article summarizes our recent work and developments on SMPI, a flexible simulator of
MPI applications. In this tool, we took a particular care to ensure our simulator could be used …

MPI collective communications on the Blue Gene/P supercomputer: Algorithms and optimizations

A Faraj, S Kumar, B Smith, A Mamidala… - Proceedings of the 23rd …, 2009 - dl.acm.org
The IBM Blue Gene/P (BG/P) system is a massively parallel supercomputer succeeding
BG/L, and it is based on orders of magnitude in system size and significant power …

[BOOK] Fast Fourier transform algorithms for parallel computers

D Takahashi - 2019 - Springer
The fast Fourier transform (FFT) is an efficient implementation of the discrete Fourier
transform (DFT). The FFT is widely used in numerous applications in engineering, science …

Communication-sensitive static dataflow for parallel message passing applications

G Bronevetsky - 2009 International Symposium on Code …, 2009 - ieeexplore.ieee.org
Message passing is a very popular style of parallel programming, used in a wide variety of
applications and supported by many APIs, such as BSD sockets, MPI and PVM. Its …

Optimization principles for collective neighborhood communications

T Hoefler, T Schneider - SC'12: Proceedings of the …, 2012 - ieeexplore.ieee.org
Many scientific applications operate in a bulk-synchronous mode of iterative communication
and computation steps. Even though the communication steps happen at the same logical …

A study of process arrival patterns for MPI collective operations

A Faraj, P Patarasuk, X Yuan - Proceedings of the 21st annual …, 2007 - dl.acm.org
Process arrival pattern, which denotes the timing when different processes arrive at an MPI
collective operation, can have a significant impact on the performance of the operation. In …

Predicting MPI collective communication performance using machine learning

S Hunold, A Bhatele, G Bosilca… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
The Message Passing Interface (MPI) defines the semantics of data communication
operations, while the implementing libraries provide several parameterized algorithms for …

HAN: A hierarchical autotuned collective communication framework

X Luo, W Wu, G Bosilca, Y Pei, Q Cao… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
High-performance computing (HPC) systems keep growing in scale and heterogeneity to
satisfy the increasing computational need, and this brings new challenges to the design of …