Analysis-driven engineering of comparison-based sorting algorithms on GPUs

B Karsin, V Weichert, H Casanova, J Iacono… - Proceedings of the …, 2018 - dl.acm.org
We study the relationship between memory accesses, bank conflicts, thread multiplicity (also
known as over-subscription) and instruction-level parallelism in comparison-based sorting …

Accelerating the unacceleratable: Hybrid CPU/GPU algorithms for memory-bound database primitives

M Gowanlock, B Karsin, Z Fink, J Wright - Proceedings of the 15th …, 2019 - dl.acm.org
Many database operations have a low compute to memory access ratio. In heterogeneous
systems, where a graphics processing unit (GPU) is interconnected via PCIe, the data …

Engineering worst-case inputs for pairwise merge sort on GPUs

K Berney, N Sitchinava - 2020 IEEE International Parallel and …, 2020 - ieeexplore.ieee.org
Currently, the fastest comparison-based sorting implementation on GPUs is implemented
using a parallel pairwise merge sort algorithm (Thrust library). To achieve fast runtimes, the …

Parallel Cache-Efficient Algorithms on GPUs

KM Berney - 2023 - search.proquest.com
Graphics Processing Units (GPUs) have emerged as a highly attractive architecture
for general-purpose computing due to their numerous programmable cores, low-latency …

A performance model for GPU architectures: analysis and design of fundamental algorithms

B Karsin - 2018 - scholarspace.manoa.hawaii.edu
Over the past decade, "many-core" architectures have become a crucial resource for solving
computationally challenging problems. These systems rely on hundreds or thousands of …

An efficient multiway mergesort for GPU architectures

H Casanova, J Iacono, B Karsin, N Sitchinava… - arXiv preprint arXiv …, 2017 - arxiv.org
Sorting is a primitive operation that is a building block for countless algorithms. As such, it is
important to design sorting algorithms that approach peak performance on a range of …

A study of work distribution and contention in database primitives on heterogeneous CPU/GPU architectures

M Gowanlock, Z Fink, B Karsin, J Wright - Proceedings of the 36th …, 2021 - dl.acm.org
Graphics Processing Units (GPUs) provide very high on-card memory bandwidth which can
be exploited to address data-intensive workloads. To maximize algorithm throughput, it is …

[CITATION][C] Evaluation of Modern GPU Architecture Features to Design Efficient Algorithms

B Karsin - 2015