Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

A GPU-accelerated fast multipole method for GROMACS: performance and accuracy

B Kohnke, C Kutzner… - Journal of Chemical Theory …, 2020 - ACS Publications
An important and computationally demanding part of molecular dynamics simulations is the
calculation of long-range electrostatic interactions. Today, the prevalent method to compute …

PVFMM: A parallel kernel independent FMM for particle and volume potentials

D Malhotra, G Biros - Communications in Computational Physics, 2015 - cambridge.org
We describe our implementation of a parallel fast multipole method for evaluating potentials
for discrete and continuous source distributions. The first requires summation over the …

Computational physics on graphics processing units

A Harju, T Siro, FF Canova, S Hakala… - Applied Parallel and …, 2013 - Springer
The use of graphics processing units for scientific computations is an emerging strategy that
can significantly speed up various algorithms. In this review, we discuss advances made in …

An FMM based on dual tree traversal for many-core architectures

R Yokota - Journal of Algorithms & Computational …, 2013 - journals.sagepub.com
The present work attempts to integrate the independent efforts in the fast N-body community
to create the fastest N-body library for many-core and heterogeneous architectures. Focus is …

Task‐based FMM for heterogeneous architectures

E Agullo, B Bramas, O Coulaud, E Darve… - Concurrency and …, 2016 - Wiley Online Library
High performance fast multipole method is crucial for the numerical simulation of many
physical problems. In a previous study, we have shown that task‐based fast multipole …

Generation of large finite‐element matrices on multiple graphics processors

A Dziekonski, P Sypek, A Lamecki… - … Journal for Numerical …, 2013 - Wiley Online Library
This paper presents techniques for generating very large finite‐element matrices on a
multicore workstation equipped with several graphics processing units (GPUs). To overcome …

ANKH: A Generalized O(N) Interpolated Ewald Strategy for Molecular Dynamics Simulations

I Chollet, L Lagardère, JP Piquemal - Journal of Chemical Theory …, 2023 - ACS Publications
To evaluate electrostatics interactions, molecular dynamics (MD) simulations rely on Particle
Mesh Ewald (PME), an O(N log(N)) algorithm that uses Fast Fourier Transforms (FFTs) or …

Algorithm 967: A distributed-memory fast multipole method for volume potentials

D Malhotra, G Biros - ACM Transactions on Mathematical Software …, 2016 - dl.acm.org
The solution of a constant-coefficient elliptic Partial Differential Equation (PDE) can be
computed using an integral transform: A convolution with the fundamental solution of the …

Fast multipole method as a matrix-free hierarchical low-rank approximation

R Yokota, H Ibeid, D Keyes - Eigenvalue Problems: Algorithms, Software …, 2017 - Springer
There has been a large increase in the amount of work on hierarchical low-rank
approximation methods, where the interest is shared by multiple communities that previously …