CuPBoP: Making CUDA a portable language

R Han, J Chen, B Garg, X Zhou, J Lu, J Young… - ACM Transactions on …, 2024 - dl.acm.org
CUDA is designed specifically for NVIDIA GPUs and is not compatible with non-NVIDIA
devices. Enabling CUDA execution on alternative backends could greatly benefit the …

A comparison of two performance portability metrics

A Marowka - Concurrency and Computation: Practice and …, 2023 - Wiley Online Library
The rise in the demand for new performance portability frameworks for heterogeneous
computing systems has brought with it a number of proposals of workable metrics for …

Performance portable Vlasov code with C++ parallel algorithm

Y Asahi, T Padioleau, G Latu, J Bigot… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
This paper presents the performance portable implementation of a kinetic plasma simulation
code with C++ parallel algorithm to run across multiple CPUs and GPUs. Relying on the …

Evaluating the performance portability of SYCL across CPUs and GPUs on bandwidth-bound applications

IZ Reguly - Proceedings of the SC'23 Workshops of The …, 2023 - dl.acm.org
In this paper, we evaluate the portability of the SYCL programming model on some of the
latest CPUs and GPUs from a wide range of vendors, utilizing the two main compilers …

Enabling performance portability on the LiGen drug discovery pipeline

L Crisci, L Carpentieri, B Cosenza, G Accordi… - Future Generation …, 2024 - Elsevier
In recent years, there has been a growing interest in developing high-performance
implementations of drug discovery processing software. To target modern GPU …

Taking GPU Programming Models to Task for Performance Portability

JH Davis, P Sivaraman, I Minn… - arXiv preprint arXiv …, 2024 - pssg.cs.umd.edu
Ensuring high productivity in scientific software development necessitates developing and
maintaining a single codebase that can run efficiently on a range of accelerator-based …

Performance portability of sparse block diagonal matrix multiple vector multiplications on GPUs

KZ Ibrahim, C Yang, P Maris - 2022 IEEE/ACM International …, 2022 - ieeexplore.ieee.org
The emergence of accelerator-based computer architectures and programming models
makes it challenging to achieve performance portability for large-scale scientific simulation …

An Evaluative Comparison of Performance Portability across GPU Programming Models

JH Davis, P Sivaraman, I Minn, K Parasyris… - arXiv preprint arXiv …, 2024 - arxiv.org
Ensuring high productivity in scientific software development necessitates developing and
maintaining a single codebase that can run efficiently on a range of accelerator-based …

Exploring Scalability in C++ Parallel STL Implementations

R Laso, D Krupitza, S Hunold - … of the 53rd International Conference on …, 2024 - dl.acm.org
Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the
STL has become a viable framework for creating performance-portable applications. Given …

pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations

R Laso, D Krupitza, S Hunold - arXiv preprint arXiv:2402.06384, 2024 - arxiv.org
Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the
STL has become a viable framework for creating performance-portable applications. Given …