Parallel programming models for heterogeneous many-cores: a comprehensive survey

J Fang, C Huang, T Tang, Z Wang - CCF Transactions on High …, 2020 - Springer
Heterogeneous many-cores are now an integral part of modern computing systems ranging
from embedded systems to supercomputers. While heterogeneous many-core design offers …

Performance portability study of epistasis detection using SYCL on NVIDIA GPU

Z Jin, JS Vetter - Proceedings of the 13th ACM International Conference …, 2022 - dl.acm.org
We describe the experience of converting a CUDA implementation of a high-order epistasis
detection algorithm to SYCL. The goals are for our work to be useful to application and …

Understanding performance portability of bioinformatics applications in SYCL on an NVIDIA GPU

Z Jin, JS Vetter - 2022 IEEE International Conference on …, 2022 - ieeexplore.ieee.org
Our goal is to have a better understanding of performance portability of SYCL kernels on a
GPU. Toward this goal, we migrate representative kernels in bioinformatics applications from …

HeteroPP: A directive‐based heterogeneous cooperative parallel programming framework

L Wan, X Cui, Y Li, W Zheng… - … and Computation: Practice …, 2024 - Wiley Online Library
Heterogeneous platforms composed of multiple different types of computing devices (such
as CPUs, GPUs, and Intel MICs) have been widely used recently. However, most of parallel …

A dynamic multi-objective approach for dynamic load balancing in heterogeneous systems

A Cabrera, A Acosta, F Almeida… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Modern standards in High Performance Computing (HPC) have started to consider energy
consumption and power draw as a limiting factor. New and more complex architectures have …

Multi-dimensional homomorphisms and their implementation in OpenCL

A Rasch, S Gorlatch - International Journal of Parallel Programming, 2018 - Springer
Homomorphisms (traditionally defined on lists) are functions that can be parallelized by the
divide-and-conquer paradigm. In this paper, we introduce an extension of the traditional …

Experience Deploying Graph Applications on GPUs with SYCL

Z Jin, JS Vetter - Proceedings of the 52nd International Conference on …, 2023 - dl.acm.org
SYCL allows for deployment and use of accelerators across vendors' platforms. In this work,
we describe the experience of deploying graph analytics on vendors' GPUs using SYCL. We …

A heuristic technique to improve energy efficiency with dynamic load balancing

A Cabrera, A Acosta, F Almeida, V Blanco - The Journal of …, 2019 - Springer
Heterogeneous computers require a well-distributed workload to operate efficiently. When
possible, this load balancing procedure should redistribute the workload with minimal …

Experience of Migrating a Parallel Graph Coloring Program from CUDA to SYCL

Z Jin - 2022 - osti.gov
We describe the experience of converting a CUDA implementation of a parallel graph
coloring algorithm to SYCL. The goals are for our work to be useful to application and …

High productivity multi-device exploitation with the Heterogeneous Programming Library

M Viñas, BB Fraguela, D Andrade, R Doallo - Journal of Parallel and …, 2017 - Elsevier
Heterogeneous devices require much more work from programmers than traditional CPUs,
particularly when there are several of them, as each one has its own memory space. Multi …