Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

A systematic survey of general sparse matrix-matrix multiplication

J Gao, W Ji, F Chang, S Han, B Wei, Z Liu… - ACM Computing …, 2023 - dl.acm.org
General Sparse Matrix-Matrix Multiplication (SpGEMM) has attracted much attention from
researchers in graph analyzing, scientific computing, and deep learning. Many optimization …

Accelerating sparse matrix–matrix multiplication with GPU Tensor Cores

O Zachariadis, N Satpute, J Gómez-Luna… - Computers & Electrical …, 2020 - Elsevier
Sparse general matrix–matrix multiplication (spGEMM) is an essential component in many
scientific and data analytics applications. However, the sparsity pattern of the input matrices …

TileSpGEMM: A tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs

Y Niu, Z Lu, H Ji, S Song, Z Jin, W Liu - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Sparse general matrix-matrix multiplication (SpGEMM) is one of the most fundamental
building blocks in sparse linear solvers, graph processing frameworks and machine learning …

Performance-aware model for sparse matrix-matrix multiplication on the Sunway TaihuLight supercomputer

Y Chen, K Li, W Yang, G Xiao, X Xie… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
General sparse matrix-sparse matrix multiplication (SpGEMM) is one of the fundamental
linear operations in a wide variety of scientific applications. To implement efficient SpGEMM …

Dissecting tensor cores via microbenchmarks: Latency, throughput and numeric behaviors

W Sun, A Li, T Geng, S Stuijk… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Tensor Cores have been an important unit to accelerate Fused Matrix Multiplication
Accumulation (MMA) in all NVIDIA GPUs since Volta Architecture. To program Tensor Cores …

High-performance and memory-saving sparse general matrix-matrix multiplication for NVIDIA Pascal GPU

Y Nagasaka, A Nukada… - 2017 46th International …, 2017 - ieeexplore.ieee.org
Sparse general matrix-matrix multiplication (SpGEMM) is one of the key kernels of
preconditioners such as algebraic multigrid method or graph algorithms. However, the …

Porting hypre to heterogeneous computer architectures: Strategies and experiences

RD Falgout, R Li, B Sjögreen, L Wang, UM Yang - Parallel Computing, 2021 - Elsevier
Linear systems are occurring in many applications, and solving them can take a large
amount of the total simulation time. The high performance library hypre provides a variety of …

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

Y Nagasaka, S Matsuoka, A Azad, A Buluç - Workshop Proceedings of …, 2018 - dl.acm.org
Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely
used in areas ranging from traditional numerical applications to recent big data analysis and …

Performance optimization, modeling and analysis of sparse matrix-matrix products on multi-core and many-core processors

Y Nagasaka, S Matsuoka, A Azad, A Buluç - Parallel Computing, 2019 - Elsevier
Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely
used in areas ranging from traditional numerical applications to recent big data analysis and …