Achieving high performance on supercomputers with a sequential task-based programming model

E Agullo, O Aumage, M Faverge… - … on Parallel and …, 2017 - ieeexplore.ieee.org
The emergence of accelerators as standard computing resources on supercomputers and
the subsequent architectural complexity increase revived the need for high-level parallel …

A fast time-domain boundary element method for three-dimensional electromagnetic scattering problems

T Takahashi - Journal of Computational Physics, 2023 - Elsevier
This paper proposes a fast time-domain boundary element method (TDBEM) to solve three-
dimensional transient electromagnetic scattering problems regarding perfectly electric …

Are static schedules so bad? A case study on Cholesky factorization

E Agullo, O Beaumont… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
Our goal is to provide an analysis and comparison of static and dynamic strategies for task
graph scheduling on platforms consisting of heterogeneous and unrelated resources, such …

ExaFMM: a high-performance fast multipole method library with C++ and Python interfaces

T Wang, R Yokota, LA Barba - Journal of Open Source Software, 2021 - joss.theoj.org
ExaFMM is an open-source library for fast multipole algorithms, providing high-performance
evaluation of N-body problems in three dimensions, with C++ and Python interfaces. This …

Task‐based FMM for heterogeneous architectures

E Agullo, B Bramas, O Coulaud, E Darve… - Concurrency and …, 2016 - Wiley Online Library
High performance fast multipole method is crucial for the numerical simulation of many
physical problems. In a previous study, we have shown that task‐based fast multipole …

A kernel-independent treecode based on barycentric Lagrange interpolation

L Wang, R Krasny, S Tlupova - arXiv preprint arXiv:1902.02250, 2019 - arxiv.org
A kernel-independent treecode (KITC) is presented for fast summation of particle
interactions. The method employs barycentric Lagrange interpolation at Chebyshev points to …

A GPU-accelerated fast multipole method based on barycentric Lagrange interpolation and dual tree traversal

L Wilson, N Vaughn, R Krasny - Computer Physics Communications, 2021 - Elsevier
We present a GPU-accelerated fast multipole method (FMM) called BLDTT, which uses
barycentric Lagrange interpolation for the near-field and far-field approximations, and dual …

Application of the inverse fast multipole method as a preconditioner in a 3D Helmholtz boundary element method

T Takahashi, P Coulier, E Darve - Journal of Computational Physics, 2017 - Elsevier
We investigate an efficient preconditioning of iterative methods (such as GMRES) for solving
dense linear systems Ax = b that follow from a boundary element method (BEM) for the 3D …

A new approach to latency insensitive design

MR Casu, L Macchiarulo - Proceedings of the 41st Annual Design …, 2004 - dl.acm.org
Latency Insensitive Protocols have been proposed as a viable means to speed up large
Systems-on-Chip where the limit in clock frequency is given by long global wires connecting …

On runtime systems for task-based programming on heterogeneous platforms

S Thibault - 2018 - inria.hal.science
Simulation has become pervasive in science. Real experimentation remains an essential
step in scientific research, but simulation replaced a wide range of costly and lengthy or …