cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance

Y Huang, S Di, X Yu, G Li, F Cappello - Proceedings of the International …, 2023 - dl.acm.org
Modern scientific applications and supercomputing systems are generating large amounts of
data in various fields, leading to critical challenges in data storage footprints and …

BiPart: a parallel and deterministic hypergraph partitioner

S Maleki, U Agarwal, M Burtscher… - Proceedings of the 26th …, 2021 - dl.acm.org
Hypergraph partitioning is used in many problem domains including VLSI design, linear
algebra, Boolean satisfiability, and data mining. Most versions of this problem are NP …

Scaling out speculative execution of finite-state machines with parallel merge

Y Xia, P Jiang, G Agrawal - Proceedings of the 25th ACM SIGPLAN …, 2020 - dl.acm.org
A finite-state machine (FSM) is a key component for many important applications, such as
Huffman decoding, regular expression matching and HTML tokenization. Due to its inherent …

An efficient parallel strategy for high-cost prefix operation

HM Bahig, KA Fathy - The Journal of Supercomputing, 2021 - Springer
The prefix computation strategy is a fundamental technique used to solve many problems in
computer science such as sorting, clustering, and computer vision. A large number of …

Enabling prefix sum parallelism pattern for recurrences with principled function reconstruction

Y Xia, P Jiang, G Agrawal - … of the 28th International Conference on …, 2019 - dl.acm.org
Much research work has been done to parallelize loops with recurrences over the last
several decades. Recently, sampling-and-reconstruction method was proposed to …

GPU efficient 1D and 3D recursive filtering

A Maximo - Digital Signal Processing, 2021 - Elsevier
This work presents strategies to massively parallelize recursive filters on inputs of one
dimension (1D) or three dimensions (3D), complementing and improving on previous state …

Parallel Tiled Code for Computing General Linear Recurrence Equations

W Bielecki, P Błaszyński - Electronics, 2021 - mdpi.com
In this article, we present a technique that allows us to generate parallel tiled code to
calculate general linear recursion equations (GLRE). That code deals with multidimensional …

Towards scalable hypergraph processing algorithms and their applications

S Maleki - 2022 - repositories.lib.utexas.edu
Graphs are a natural model for representing binary relations. However, it is difficult to use
graphs to capture non-binary relations such as communities of nodes. For example, graphs …

[BOOK][B] Solving Scaling Issues on a Single GPU

Y Xia - 2022 - search.proquest.com
There has been a significant amount of research works on accelerating irregular
applications with GPUs. However, we observed two significant scalability issues with GPU …

Błaszyński, P. Parallel Tiled Code for Computing General Linear Recurrence Equations. Electronics 2021, 10, 2050

W Bielecki - 2021 - search.proquest.com
In this article, we present a technique that allows us to generate parallel tiled code to
calculate general linear recursion equations (GLRE). That code deals with multidimensional …