On the performance and energy efficiency of sparse linear algebra on GPUs

H Anzt, S Tomov, J Dongarra - The International Journal of …, 2017 - journals.sagepub.com
In this paper we unveil some performance and energy efficiency frontiers for sparse
computations on GPU-based supercomputers. We compare the resource efficiency of …

Systematic fusion of CUDA kernels for iterative sparse linear system solvers

JI Aliaga, J Pérez, ES Quintana-Ortí - … Vienna, Austria, August 24-28, 2015 …, 2015 - Springer
We introduce a systematic analysis in order to fuse CUDA kernels arising in efficient iterative
methods for the solution of sparse linear systems. Our procedure characterizes the input and …

Experiences in autotuning matrix multiplication for energy minimization on GPUs

H Anzt, B Haugen, J Kurzak… - Concurrency and …, 2015 - Wiley Online Library
In this paper, we report extensive results and analysis of autotuning the computationally
intensive graphics processing units kernel for dense matrix–matrix multiplication in double …

Energy efficiency and performance frontiers for sparse computations on GPU supercomputers

H Anzt, S Tomov, J Dongarra - … of the sixth international workshop on …, 2015 - dl.acm.org
In this paper we unveil some energy efficiency and performance frontiers for sparse
computations on GPU-based supercomputers. To do this, we consider state-of-the-art …

Energy-efficient algorithms for ultrascale systems

J Carretero, S Distefano, D Petcu, D Pop… - … and Innovations: an …, 2015 - dl.acm.org
The chances to reach Exascale or Ultrascale Computing are strongly connected with the
problem of the energy consumption for processing applications. For physical and …

GLTO: On the adequacy of lightweight thread approaches for OpenMP implementations

A Castelló, S Seo, R Mayo, P Balaji… - 2017 46th …, 2017 - ieeexplore.ieee.org
OpenMP is the de facto standard application programming interface (API) for on-node
parallelism. The most popular OpenMP runtimes rely on POSIX threads (pthreads) …

Efficient target-response interpolation for a graphic equalizer

JA Belloch, V Välimäki - 2016 IEEE International Conference …, 2016 - ieeexplore.ieee.org
A graphic equalizer is an adjustable filter in which the command gain of each frequency
band is practically independent of the gains of other bands. Designing a graphic equalizer …

Study of the processor and memory power and energy consumption of coupled sparse/dense solvers

E Agullo, M Felšöci, A Guermouche… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org
In the aeronautical industry, aeroacoustics is used to model the propagation of acoustic
waves in air flows enveloping an aircraft in flight. This for instance allows one to simulate the …

Modeling the effect of application-specific program transformations on energy and performance improvements of parallel ODE solvers

T Rauber, G Rünger - Journal of Computational Science, 2021 - Elsevier
Ordinary differential equations (ODEs) are important for modelling many problems from
science and engineering and efficient ODE solvers are required, for example when solving …

On the adequacy of lightweight thread approaches for high-level parallel programming models

A Castelló, R Mayo, K Sala, V Beltran, P Balaji… - Future Generation …, 2018 - Elsevier
High-level parallel programming models (PMs) are becoming crucial in order to extract the
computational power of current on-node multi-threaded parallelism. The most popular PMs …