Selecting an appropriate workgroup size is critical for the performance of OpenCL kernels, and requires knowledge of the underlying hardware, the data being operated on, and the …
J Jiang, P Zhu - Journal of Applied Geophysics, 2018 - Elsevier
Full waveform inversion (FWI) is a challenging procedure due to the high computational cost related to the modeling, especially for the elastic case. The graphics processing unit (GPU) …
Stencil computations are a widely used type of algorithm, found in applications from physical simulations to machine learning. Stencils are embarrassingly parallel and therefore fit on …
High-performance distributed computing systems increasingly feature nodes that have multiple CPU sockets and multiple GPUs. The communication bandwidth between these …
F Dütsch, K Djelassi, M Haidl, S Gorlatch - Proceedings of the second …, 2014 - dl.acm.org
The development of programs for modern systems with GPUs and other accelerators is a complex and error-prone task. The popular GPU programming approaches like CUDA and …
Iterative stencil computations are widely used in numerical simulations. They present a high degree of parallelism, high locality and mostly-coalesced memory access patterns …
The software and hardware landscape of high performance computing is expanding faster than computational scientists can take advantage of new frameworks and platforms. In an …
The physical limitations of microprocessor design have forced the industry towards increasingly heterogeneous designs to extract performance. This trend has not been …
Modern programming languages provide programmers with rich abstractions for data collections as part of their standard libraries, e.g., Containers in the C++ STL, the Java …