Analytical performance estimation during code generation on modern GPUs

D Ernst, M Holzer, G Hager, M Knorr… - Journal of Parallel and …, 2023 - Elsevier
Automatic code generation is frequently used to create implementations of algorithms
specifically tuned to particular hardware and application parameters. The code generation …

Using the semi-stencil algorithm to accelerate high-order stencils on GPUs

R Sai, J Mellor-Crummey, X Meng… - … and Simulation of …, 2021 - ieeexplore.ieee.org
Understanding how to develop efficient high-order stencils for Graphics Processing Units
(GPUs) is a topic of great interest for many application domains. High-performance stencils …

John Mellor-Crummey

J Mellor-Crummey - 2024 - repository.rice.edu
To accelerate compute-intensive, memory-hungry scientific applications, the focus of chip
design has shifted from increasing transistor density to creating parallel architectures due to …