Optimizing CUDA code by kernel fusion: application on BLAS

J Filipovič, M Madzin, J Fousek, L Matyska - The Journal of …, 2015 - Springer
Contemporary GPUs have significantly higher arithmetic throughput than memory
throughput. Hence, many GPU kernels are memory bound and cannot exploit arithmetic …

Finite element numerical integration for first order approximations on multi-and many-core architectures

K Banaś, F Krużel, J Bielański - Computer Methods in Applied Mechanics …, 2016 - Elsevier
The paper presents investigations on the performance of the finite element numerical
integration algorithm for first order approximations and three processor architectures …

OpenCL kernel fusion for GPU, Xeon Phi and CPU

J Filipovic, S Benkner - 2015 27th International Symposium on …, 2015 - ieeexplore.ieee.org
Kernel fusion is an optimization method, in which the code from several kernels is composed
to create a new, fused kernel. It can push the performance of kernels beyond limits given for …

Fused GEMMs towards an efficient GPU implementation of the ADER‐DG method in SeisSol

R Dorozhinskii, GB Gadeschi… - … : Practice and Experience, 2024 - Wiley Online Library
This study shows how GPU performance of the ADER discontinuous Galerkin method in
SeisSol (an earthquake simulation software) can be further improved while preserving its …

Adapting and Optimizing High Order Seismic Simulations for GPU-based Supercomputers

R Dorozhinskii - 2024 - mediatum.ub.tum.de
The goal of this study is to adapt and optimize an open-source, highly tuned CPU-based
scientific application designed for simulating seismic wave propagation and earthquake …

Application of middleware technique in Web of flood forecasting system with multiple models

H Mincong, X Jiancang, C Yang, W Ni… - … Conference on Hybrid …, 2006 - ieeexplore.ieee.org
According to the J2EE criteria, a framework based on the middleware technique and model-
view controller (MVC) design pattern are introduced to establish a flood control system in a …

Software Performance Optimization in Scientific Computing

J Filipovic - is.muni.cz
Scientific computing is considered to be the third mode of science, complementing
experiments and theory. It uses simulation and modeling to understand complex problems …

Finite element numerical integration for first order approximations on multi-core architectures

K Banaś, F Krużel, J Bielański - arXiv preprint arXiv:1504.01023, 2015 - arxiv.org
The paper presents investigations on the implementation and performance of the finite
element numerical integration algorithm for first order approximations and three processor …

Lazy evaluation method in the component environments

M Kniotek - Advances in Computer Science Research, 2013 - bibliotekanauki.pl
This paper describes the manual use of the lazy evaluation code optimization method in
the component environments such as Java VM, MS .NET, Mono. Despite the implemented …