Efficient process arrival pattern aware collective communication for deep learning

P Alizadeh, A Sojoodi, Y Hassan Temucin… - Proceedings of the 29th …, 2022 - dl.acm.org
MPI collective communication operations are used extensively in parallel applications. As
such, researchers have been investigating how to improve their performance and scalability …

Rethinking memory and communication cost for efficient large language model training

C Wu, H Zhang, L Ju, J Huang, Y Xiao, Z Huan… - arXiv preprint arXiv …, 2023 - arxiv.org
As model sizes and training datasets continue to increase, large-scale model training
frameworks reduce memory consumption by various sharding techniques. However, the …

Network states-aware collective communication optimization

J Wang, T Zhao, Y Wang - Cluster Computing, 2024 - Springer
Message Passing Interface (MPI) is the de facto standard for parallel programming,
and collective operations in MPI are widely utilized by numerous scientific applications. The …

MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns

MS Beni, B Cosenza, S Hunold - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
The Message Passing Interface (MPI) is a programming model for developing high-
performance applications on large-scale machines. A key component of MPI is its collective …

Investigation into MPI all-reduce performance in a distributed cluster with consideration of imbalanced process arrival patterns

J Proficz, P Sumionka, J Skomiał, M Semeniuk… - … : Proceedings of the 34th …, 2020 - Springer
The paper presents an evaluation of all-reduce collective MPI algorithms for an environment
based on a geographically-distributed compute cluster. The testbed was split into two sites …

Process arrival pattern aware algorithms for acceleration of scatter and gather operations

J Proficz - Cluster Computing, 2020 - Springer
Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed
systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for …

All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns

J Proficz - ACM Transactions on Architecture and Code …, 2021 - dl.acm.org
Two novel algorithms for the all-gather operation resilient to imbalanced process arrival
patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is …

Relaxing scalability limits with speculative parallelism in sequential Monte Carlo

B Nemeth, T Haber, J Liesenborgs… - … Conference on Cluster …, 2018 - ieeexplore.ieee.org
Sequential Monte Carlo methods are a useful tool to tackle non-linear problems in a
Bayesian setting. A target posterior distribution is approximated by moving a set of weighted …

Efficient Process Arrival Pattern Aware Collective Communication for HPC and Deep Learning

P Mohammadalizadehbakhtevari - 2021 - search.proquest.com
High-Performance Computing (HPC) is the key to tackle computationally intensive
problems such as Deep Learning (DL) and scientific applications. Message Passing …

Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns

J Proficz, KM Ocetkiewicz - The Journal of Supercomputing, 2021 - Springer
The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to
imbalances in process arrival times” was analyzed, commented and improved. The …