Heterogeneous chips that combine CPUs and FPGAs can distribute processing so that the algorithm tasks are mapped onto the most suitable processing element. New software …
Many studies have focused on developing and improving auto-tuning algorithms for NVIDIA Graphics Processing Units (GPUs), but the effectiveness and efficiency of these approaches …
S Perez, JM Etancelin, P Poncet - arXiv preprint arXiv:2409.05449, 2024 - arxiv.org
This article introduces a new efficient particle method for the numerical simulation of crystallization and precipitation at the pore scale of real rock geometries extracted by X-ray …
Performance portability has rapidly become one of the key concerns for application developers targeting modern computer architectures. Although there are various …
Scientific applications need to be moved among supercomputers, such as Tianhe-2 and TSUBAME 2.5. OpenACC provides a directive-based approach for a single source code …
LF Manfroi, M Ferro, AM Yokoyama… - 2013 IEEE/ACM 6th …, 2013 - ieeexplore.ieee.org
Scientific computing often requires high performance and distributed computational resources to perform large scale experiments in order to achieve accurate results in due …
The eruption of multicore processors and several kinds of accelerators has generalized the interest in parallel programming. The OpenCL standard is very appealing because it …
Studying reactive flows in porous media is essential to manage the geochemical effects of CO2 capture and storage in natural underground reservoirs. Through homogenization of the …
In the light of the current race towards the Exascale, this article highlights the main features of the forthcoming computing elements that will be at the core of next generations of …