MPI collective communication operations on large shared memory systems

M Bernaschi, G Richelli - Proceedings Ninth Euromicro …, 2001 - ieeexplore.ieee.org
Collective communication performance is critical in a number of MPI applications yet
relatively few results are available to assess the performance of MPI implementations …

Accelerating MPI collective communications through hierarchical algorithms without sacrificing inter-node communication flexibility

BS Parsons, VS Pai - 2014 IEEE 28th International Parallel and …, 2014 - ieeexplore.ieee.org
This paper presents and evaluates a universal algorithm to improve the performance of MPI
collective communication operations on hierarchical clusters with many-core nodes. This …

Evaluating MPI collective communication on the SP2, T3D, and Paragon multicomputers

K Hwang, C Wang, CL Wang - Proceedings Third International …, 1997 - ieeexplore.ieee.org
We evaluate the architectural support of collective communication operations on the IBM
SP2, Cray T3D, and Intel Paragon. The MPI performance data are obtained from the STAP …

A decomposition approach for optimizing the performance of MPI libraries

O Hartmann, M Kunemann, T Rauber… - … 20th IEEE International …, 2006 - ieeexplore.ieee.org
MPI provides a portable message passing interface for many parallel execution platforms
but may lead to inefficiencies for some platforms and applications. In this article, we show …

Cartesian collective communication

JL Träff, S Hunold - Proceedings of the 48th International Conference on …, 2019 - dl.acm.org
We introduce Cartesian Collective Communication as sparse, collective communication
defined on processes (processors) organized into d-dimensional tori or meshes. Processes …

MPI collectives on modern multicore clusters: Performance optimizations and communication characteristics

AR Mamidala, R Kumar, D De… - 2008 Eighth IEEE …, 2008 - ieeexplore.ieee.org
The advances in multicore technology and modern interconnects are rapidly accelerating the
number of cores deployed in today's commodity clusters. A majority of parallel applications …

Adaptive recursive doubling algorithm for collective communication

O Arap, M Swany, G Brown… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
Process arrival times at MPI collective operations differ significantly. Addressing this fact with
special handling for popular collective communication algorithms can yield performance …

Pipelining and overlapping for MPI collective operations

J Worringen - 28th Annual IEEE International Conference on …, 2003 - ieeexplore.ieee.org
Collective operations are an important aspect of the currently most important message-
passing programming model MPI (message passing interface). Many MPI applications make …

A framework for hierarchical single-copy MPI collectives on multicore nodes

G Katevenis, M Ploumidis… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Collective operations are widely used by MPI applications to realize their communication
patterns. Their efficiency is crucial for both performance and scalability of parallel …

Implementation and performance analysis of non-blocking collective operations for MPI

T Hoefler, A Lumsdaine, W Rehm - Proceedings of the 2007 ACM/IEEE …, 2007 - dl.acm.org
Collective operations and non-blocking point-to-point operations have always been part of
MPI. Although non-blocking collective operations are an obvious extension to MPI, there …