Adaptive and hierarchical large message all-to-all communication algorithms for large-scale dense GPU systems

KS Khorassani, CH Chu, QG Anthony… - 2021 IEEE/ACM 21st …, 2021 - ieeexplore.ieee.org
In recent years, GPU-enhanced clusters have become more prevalent in High-Performance
Computing (HPC), leading to a demand for more efficient multi-GPU communication. This …

Compiler Optimizations for Irregular Memory Access Patterns in the PGAS Programming Model

TB Rolinger - 2023 - search.proquest.com
Applications that operate on large, sparse graphs and matrices exhibit fine-grain irregular
memory access patterns, leading to both performance and productivity challenges on …

High-Performance, Adaptive, and Scalable GPU-aware MPI Libraries for Next-Generation Heterogeneous Systems

KS Khorassani - 2023 - search.proquest.com
Due to the emergence of various accelerators and interconnects over the years and their
adoption in upcoming exascale systems, it is pertinent to have scientific applications and …

[PDF] Machine learning library to support applications with embedded systems and parallel computing

C Miranda Meza - tesis.ipn.mx
The currently available machine learning libraries have strongly addressed deep learning
and parallel computing, but have neglected traditional machine learning methods and …