Data-parallel hashing techniques for GPU architectures

B Lessley, H Childs - IEEE Transactions on Parallel and …, 2019 - ieeexplore.ieee.org
Hash tables are a fundamental data structure for effectively storing and accessing sparse
data, with widespread usage in domains ranging from computer graphics to machine …

Gunrock: GPU graph analytics

Y Wang, Y Pan, A Davidson, Y Wu, C Yang… - ACM Transactions on …, 2017 - dl.acm.org
For large-scale graph analytics on the GPU, the irregularity of data access and control flow,
and the complexity of programming GPUs, have presented two significant challenges to …

A dynamic hash table for the GPU

S Ashkiani, M Farach-Colton… - 2018 IEEE international …, 2018 - ieeexplore.ieee.org
We design and implement a fully concurrent dynamic hash table for GPUs with comparable
performance to the state of the art static hash tables. We propose a warp-cooperative work …

Warpcore: A library for fast hash tables on gpus

D Jünger, R Kobus, A Müller, C Hundt… - 2020 IEEE 27th …, 2020 - ieeexplore.ieee.org
Hash tables are ubiquitous. Properties such as an amortized constant time complexity for
insertion and querying as well as a compact memory layout make them versatile associative …

Parallel transposition of sparse data structures

H Wang, W Liu, K Hou, W Feng - … of the 2016 international conference on …, 2016 - dl.acm.org
Many applications in computational sciences and social sciences exploit sparsity and
connectivity of acquired data. Even though many parallel sparse primitives such as sparse …

Onesweep: a faster least significant digit radix sort for gpus

A Adinets, D Merrill - arXiv preprint arXiv:2206.01784, 2022 - arxiv.org
We present Onesweep, a least-significant digit (LSD) radix sorting algorithm for large GPU
sorting problems residing in global memory. Our parallel algorithm employs a method of …

GPU LSM: A dynamic dictionary data structure for the GPU

S Ashkiani, S Li, M Farach-Colton… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and
deletions, based on the Log Structured Merge tree (LSM). Our implementation on an NVIDIA …

GPU Coroutines for Flexible Splitting and Scheduling of Rendering Tasks

S Zheng, X Chen, Z Shi, LQ Yan, K Xu - ACM Transactions on Graphics …, 2024 - dl.acm.org
We introduce coroutines into GPU kernel programming, providing an automated solution for
flexible splitting and scheduling of rendering tasks. This approach addresses a prevalent …

WarpDrive: Massively parallel hashing on multi-GPU nodes

D Jünger, C Hundt, B Schmidt - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Hash maps are among the most versatile data structures in computer science because of
their compact data layout and expected constant time complexity for insertion and querying …

A fast work-efficient sssp algorithm for gpus

K Wang, D Fussell, C Lin - Proceedings of the 26th ACM SIGPLAN …, 2021 - dl.acm.org
This paper presents a new Single Source Shortest Path (SSSP) algorithm for GPUs. Our key
advancement is an improved work scheduler, which is central to the performance of SSSP …