CRAC: Checkpoint-restart architecture for CUDA with streams and UVM

T Jain, G Cooperman - SC20: International Conference for High …, 2020 - ieeexplore.ieee.org
The share of the top 500 supercomputers with NVIDIA GPUs is now over 25% and continues
to grow. While fault tolerance is a critical issue for supercomputing, there does not currently …
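
Not part of the indexed entry: a minimal host-side sketch, written in C against the CUDA runtime API, of the two features the title names, a user-created stream and unified (managed) memory (UVM). It shows only the application-level calls whose state a checkpoint-restart tool such as CRAC would need to capture; the device index, buffer size, and omission of error checking are illustrative assumptions.

    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void) {
        const size_t n = 1 << 20;
        float *buf = NULL;
        cudaStream_t stream;

        /* Managed (UVM) allocation: one pointer valid on both host and device. */
        cudaMallocManaged((void **)&buf, n * sizeof(float), cudaMemAttachGlobal);
        for (size_t i = 0; i < n; i++) buf[i] = 1.0f;   /* touched on the host */

        /* Work queued on a user-created stream executes asynchronously
         * with respect to the host thread. */
        cudaStreamCreate(&stream);
        cudaMemPrefetchAsync(buf, n * sizeof(float), 0 /* device 0 */, stream);
        cudaStreamSynchronize(stream);

        cudaStreamDestroy(stream);
        cudaFree(buf);
        printf("done\n");
        return 0;
    }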

Contention-aware kernel-assisted MPI collectives for multi-/many-core systems

S Chakraborty, H Subramoni… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Multi-/many-core CPU-based architectures are seeing widespread adoption due to their
unprecedented compute performance in a small power envelope. With the increasingly …

Designing efficient shared address space reduction collectives for multi-/many-cores

JM Hashmi, S Chakraborty, M Bayatpour… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
State-of-the-art designs for the hierarchical reduction collective operation in MPI that work on
the concept of distributed address spaces incur the cost of intermediate copies inside the …

Optimizing MPI Collectives on Shared Memory Multi-Cores

J Peng, J Fang, J Liu, M Xie, Y Dai, B Yang… - Proceedings of the …, 2023 - dl.acm.org
Message Passing Interface (MPI) programs often experience performance slowdowns due
to collective communication operations, like broadcasting and reductions. As modern CPUs …
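
Not part of the indexed entry: a minimal C sketch of the two collectives the abstract names, broadcast and reduction. A shared-memory-aware MPI library optimizes these calls internally; the application-side code below is just the standard MPI API, with an arbitrary root and payload.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, value = 0, sum = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) value = 42;                            /* root supplies the data */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);     /* broadcast              */
        MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0,
                   MPI_COMM_WORLD);                           /* reduction              */

        if (rank == 0) printf("sum = %d\n", sum);
        MPI_Finalize();
        return 0;
    }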

Gait analysis for human identification in frequency domain

S Yu, L Wang, W Hu, T Tan - … on Image and Graphics (ICIG'04), 2004 - ieeexplore.ieee.org
In this paper, we analyze the spatio-temporal human characteristic of moving silhouettes in
frequency domain, and find key Fourier descriptors that have better discriminatory capability …

Scalable MPI collectives using SHARP: Large-scale performance evaluation on the TACC Frontera system

B Ramesh, KK Suresh, N Sarkauskas… - 2020 Workshop on …, 2020 - ieeexplore.ieee.org
The Message-Passing Interface (MPI) is the de facto standard for designing and executing
applications on massively parallel hardware. MPI collectives provide a convenient …
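
Not part of the indexed entry: SHARP performs in-network aggregation, so a SHARP-enabled MPI library can offload reductions such as allreduce to the switch fabric without changes to user code. The sketch below (plain C, standard MPI) shows the single application-level call involved; nothing SHARP-specific appears in the program itself.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        double local = 1.0, global = 0.0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Every rank contributes a value; every rank receives the reduced result. */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0) printf("global sum = %f\n", global);
        MPI_Finalize();
        return 0;
    }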

Optimizing point-to-point communication between Adaptive MPI endpoints in shared memory

S White, LV Kale - Concurrency and Computation: Practice and …, 2020 - Wiley Online Library
Adaptive MPI is an implementation of the MPI standard that supports the virtualization of
ranks as user‐level threads, rather than OS processes. In this work, we optimize the …

CAB-MPI: Exploring interprocess work-stealing towards balanced MPI communication

K Ouyang, M Si, A Hori, Z Chen… - … Conference for High …, 2020 - ieeexplore.ieee.org
Load balance is essential for high-performance applications. Unbalanced communication
can cause severe performance degradation, even in computation-balanced BSP …

Cooperative rendezvous protocols for improved performance and overlap

S Chakraborty, M Bayatpour, J Hashmi… - … Conference for High …, 2018 - ieeexplore.ieee.org
With the emergence of larger multi-/many-core clusters and new areas of HPC applications,
performance of large message communication is becoming more important. MPI libraries …
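
Not part of the indexed entry: a minimal C sketch of the communication/computation overlap that rendezvous protocols for large messages aim to preserve, expressed with standard nonblocking point-to-point calls. The rank pairing and message size are illustrative assumptions (an even number of ranks is assumed).

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define N (1 << 20)

    int main(int argc, char **argv) {
        int rank;
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        double *sendbuf = malloc(N * sizeof(double));
        double *recvbuf = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) sendbuf[i] = rank;

        int peer = rank ^ 1;   /* pair ranks 0<->1, 2<->3, ...            */
        MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ... independent computation here can overlap with the transfer ... */

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        if (rank == 0) printf("received %f from rank %d\n", recvbuf[0], peer);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }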

Performance comparison of cross memory attach capable MPI vs. multithreaded optimistic parallel simulations

DM Rao - Proceedings of the 2018 ACM SIGSIM Conference on …, 2018 - dl.acm.org
The growth in many-core CPUs has motivated development of shared-memory,
multithreaded solutions to minimize communication and synchronization overheads in …
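
Not part of the indexed entry: a minimal C sketch of the Linux cross-memory-attach primitive, process_vm_readv(), which CMA-capable MPI libraries use to copy large messages between process address spaces in a single copy. Here a parent reads a buffer out of its forked child; the sleep-based synchronization and buffer contents are only for illustration.

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/uio.h>
    #include <sys/wait.h>

    static char message[64];

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {                 /* child: fill the buffer, keep it alive */
            strcpy(message, "hello from the child");
            sleep(2);
            return 0;
        }

        sleep(1);                       /* crude: give the child time to write */
        char out[64] = {0};
        struct iovec local  = { .iov_base = out,     .iov_len = sizeof(out) };
        struct iovec remote = { .iov_base = message, .iov_len = sizeof(out) };

        /* Same virtual address in the child because of fork(); one syscall copies
         * the bytes straight from the child's address space into ours. */
        ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
        printf("read %zd bytes: \"%s\"\n", n, out);

        waitpid(pid, NULL, 0);
        return 0;
    }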