AI-enabling workloads on large-scale GPU-accelerated system: Characterization, opportunities, and implications

B Li, R Arora, S Samsi, T Patel… - … Symposium on High …, 2022 - ieeexplore.ieee.org
Production high-performance computing (HPC) systems are adopting and integrating GPUs
into their design to accommodate artificial intelligence (AI), machine learning, and data …

GPU computing and the road to extreme-scale parallel systems

SW Keckler - 2011 IEEE International Symposium on Workload …, 2011 - ieeexplore.ieee.org
While Moore's Law has continued to provide smaller semiconductor devices, the effective
end of uniprocessor performance scaling has (finally) instigated mainstream computing to …

Implementing and Optimizing a GPU-aware MPI Library for Intel GPUs: Early Experiences

CC Chen, KS Khorassani… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
As the demand for computing power from High-Performance Computing (HPC) and Deep
Learning (DL) applications increase, there is a growing trend of equipping modern exascale …

Tools for GPU computing – debugging and performance analysis of heterogenous HPC applications

M Knobloch, B Mohr - Supercomputing Frontiers and Innovations, 2020 - superfri.org
General purpose GPUs are now ubiquitous in high-end supercomputing. All but one (the
Japanese Fugaku system, which is based on ARM processors) of the announced (pre-) …

VComputeBench: A Vulkan benchmark suite for GPGPU on mobile and embedded GPUs

N Mammeri, B Juurlink - 2018 IEEE International Symposium …, 2018 - ieeexplore.ieee.org
GPUs have become immensely important computational units on embedded and mobile
devices. However, GPGPU developers are often not able to exploit the compute power …

Warp-consolidation: A novel execution model for GPUs

A Li, W Liu, L Wang, K Barker, SL Song - Proceedings of the 2018 …, 2018 - dl.acm.org
With the unprecedented development of compute capability and extension of memory
bandwidth on modern GPUs, parallel communication and synchronization soon becomes a …

Need for speed: Experiences building a trustworthy system-level GPU simulator

O Villa, D Lustig, Z Yan, E Bolotin, Y Fu… - … Symposium on High …, 2021 - ieeexplore.ieee.org
The demands of high-performance computing (HPC) and machine learning (ML) workloads
have resulted in the rapid architectural evolution of GPUs over the last decade. The growing …

Partitioning GPUs for improved scalability

J Janzén, D Black-Schaffer… - 2016 28th International …, 2016 - ieeexplore.ieee.org
To port applications to GPUs, developers need to express computational tasks as highly
parallel executions with tens of thousands of threads to fill the GPU's compute resources …

Look before you leap: Using the right hardware resources to accelerate applications

J Shen, AL Varbanescu, H Sips - 2014 IEEE Intl Conf on High …, 2014 - ieeexplore.ieee.org
GPUs are widely used to accelerate data-parallel applications. However, while the GPU
processing capability is enhanced in each generation, the CPU computing power is also …

Topology-aware GPU selection on multi-GPU nodes

I Faraji, SH Mirsadeghi, A Afsahi - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
GPU accelerators have successfully established themselves in modern HPC clusters due to
their high performance and energy efficiency. To increase the GPU computational power in …