A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems

P Czarnul - Concurrency and Computation: Practice and …, 2023 - Wiley Online Library
In the article, we have proposed a framework that allows programming a parallel application
for a multi‐node system, with one or more graphical processing units (GPUs) per node …

NGS: A network GPGPU system for orchestrating remote and virtual accelerators

J Prades, C Reaño, F Silla - Journal of Systems Architecture, 2024 - Elsevier
Abstract In General-Purpose computing on Graphics Processing Unit (GPGPU), the use of
CPUs is combined with that of GPUs. CPUs are used for sequential code, while GPUs are …

PoCL-R: A scalable low latency distributed OpenCL runtime

J Solanti, M Babej, J Ikkala… - … on Embedded Computer …, 2021 - Springer
Offloading the most demanding parts of applications to an edge GPU server cluster to save
power or improve the result quality is a solution that becomes increasingly realistic with new …

Hybrid-Smash: a heterogeneous CPU-GPU compression library

C Peñaranda, C Reaño, F Silla - IEEE Access, 2024 - ieeexplore.ieee.org
Compression algorithms are widely used to reduce data size and improve application
performance. Nevertheless, data compression has a computational cost which can limit its …

gVMP: A multi-objective joint VM and vGPU placement heuristic for API remoting-based GPU virtualization and disaggregation in cloud data centers

A Siavashi, M Momtazpour - Journal of Parallel and Distributed Computing, 2023 - Elsevier
The diverse needs of customers drive cloud providers to incorporate more GPU-enabled
services. It is known that users barely utilize GPUs. Hence, GPU virtualization techniques …

Programming Optimization of Move with Interleaving Transform

B Žalik, A Jeromel, I Kolingerová, N Lukač, B Repnik - 2024 - intechopen.com
Computer resources and programming solutions have always been utilised with the aim of
improving efficiency; consequently, many optimisations have been developed in the past …

Enabling the CUDA Unified Memory model in Edge, Cloud and HPC offloaded GPU kernels

R Montella, D Di Luccio, CG De Vita… - 2022 22nd IEEE …, 2022 - ieeexplore.ieee.org
The use of hardware accelerators, based on code and data offloading devoted to
overcoming the CPU limitations in cores, is one of the main distinctive trends in high-end …

On Move with Interleaving (MwI) Implementation

B Žalik, A Jeromel, I Kolingerová, N Lukač, B Repnik - 2024 - preprints.org
Various implementations of the Move with Interleaving transform are discussed in this paper,
with the transformation itself explained briefly at first. The transform has an expected time …

GPU Acceleration in Unikernels Using Cricket GPU Virtualization

N Eiling, M Kröning, J Klimt, P Fensch… - Proceedings of the SC' …, 2023 - dl.acm.org
Today, large compute clusters increasingly move towards heterogeneous architectures by
employing accelerators, such as GPUs, to realize ever-increasing performance. To achieve …