Iterative methods for sparse linear systems on graphics processing unit

AKC Ahamed, F Magoules - 2012 IEEE 14th International …, 2012 - ieeexplore.ieee.org
Many engineering and science problems require a computational effort to solve large sparse
linear systems. Krylov subspace based iterative solvers have been widely used in that …

Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis

AKC Ahamed, F Magoules - 2012 IEEE 14th International …, 2012 - ieeexplore.ieee.org
Finite element analysis involves the solution of linear systems described by large size
sparse matrices. Iterative Krylov methods are well suited for such type of problems. These …

Alinea: An advanced linear algebra library for massively parallel computations on graphics processing units

F Magoules, AKC Ahamed - The International Journal of …, 2015 - journals.sagepub.com
Direct and iterative methods are often used to solve linear systems in engineering. The
matrices involved can be large, which leads to heavy computations on the central …

Exploiting task and data parallelism in ILUPACK's preconditioned CG solver on NUMA architectures and many-core accelerators

JI Aliaga, RM Badia, M Barreda, M Bollhöfer… - Parallel Computing, 2016 - Elsevier
We present specialized implementations of the preconditioned iterative linear system solver
in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co …

Accelerating the task/data-parallel version of ILUPACK's BiCG in multi-CPU/GPU configurations

JI Aliaga, E Dufrechou, P Ezzatti, ES Quintana-Ortí - Parallel Computing, 2019 - Elsevier
ILUPACK is a valuable tool for the solution of sparse linear systems via iterative Krylov
subspace-based methods. Its relevance for the solution of real problems has motivated …

Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems

JI Aliaga, M Barreda, MF Dolz, AF Martín, R Mayo… - Cluster Computing, 2014 - Springer
We investigate the benefits that an energy-aware implementation of the runtime in charge of
the concurrent execution of ILUPACK—a sophisticated preconditioned iterative solver for …

Auto-tuned Krylov methods on cluster of graphics processing unit

F Magoulès, AK Cheik Ahamed… - International Journal of …, 2015 - Taylor & Francis
Exascale computers are expected to have highly hierarchical architectures with nodes
composed by multiple core processors (CPU; central processing unit) and accelerators …

Optimized Schwarz method without overlap for the gravitational potential equation on cluster of graphics processing unit

F Magoulès, AK Cheik Ahamed… - International Journal of …, 2016 - Taylor & Francis
Many engineering and scientific problems need to solve boundary value problems for partial
differential equations or systems of them. For most cases, to obtain the solution with desired …

Distributed parallel bootstrap adaptive algebraic multigrid method

I Konshin, K Terekhov - Russian Supercomputing Days, 2022 - Springer
We propose a fully distributed parallel adaptive generalization of the Ruge–Stuben
algebraic multigrid method. In the adaptive multigrid framework, the coarse space and …

Exploiting thread-level parallelism in functional self-testing of CMT processors

A Apostolakis, M Psarakis, D Gizopoulos… - 2009 14th IEEE …, 2009 - ieeexplore.ieee.org
Major microprocessor vendors have integrated functional software-based self-testing in their
manufacturing test flows during the last decade. Functional self-testing is performed by test …