A survey of accelerating parallel sparse linear algebra

G Xiao, C Yin, T Zhou, X Li, Y Chen, K Li - ACM Computing Surveys, 2023 - dl.acm.org
Sparse linear algebra includes the fundamental and important operations in various large-scale
scientific computing and real-world applications. There exists performance bottleneck …

ESA: An efficient sequence alignment algorithm for biological database search on Sunway TaihuLight

H Zhang, Z Huang, Y Chen, J Liang, X Gao - Parallel Computing, 2023 - Elsevier
In computational biology, biological database search has been playing a very important role.
Since the COVID-19 outbreak, it has provided significant help in identifying common …

Redesign and Accelerate the AIREBO Bond-Order Potential on the New Sunway Supercomputer

P Gao, X Duan, B Schmidt, W Wan… - … on Parallel and …, 2023 - ieeexplore.ieee.org
Molecular dynamics (MD) is one of the most crucial computer simulation methods for
understanding real-world processes at the atomic level. Reactive potentials based on the …

A heterogeneous parallel computing approach optimizing SpTTM on CPU-GPU via GCN

H Wang, W Yang, R Ouyang, R Hu, K Li… - ACM Transactions on …, 2023 - dl.acm.org
Sparse Tensor-Times-Matrix (SpTTM) is the core calculation in tensor analysis. The sparse
distributions of different tensors vary greatly, which poses a big challenge to designing …

A load-balanced acceleration method for small and irregular batch matrix multiplication on GPU

Y Zhang, L Lu, Z Yang, Z Liang, S Suo - Journal of Systems Architecture, 2025 - Elsevier
As an essential mathematical operation, GEneral Matrix Multiplication (GEMM) plays a vital
role in many applications, such as high-performance computing, machine learning, etc. In …

Accelerating Electromagnetic Field Simulations Based on Memory-Optimized CPML-FDTD with OpenACC

D Padilla-Perez, I Medina-Sanchez, J Hernández… - Applied Sciences, 2022 - mdpi.com
Although GPUs can offer higher computing power at low power consumption, their low-level
programming can be relatively complex and consume programming time. For this reason …

Machine Learning-Based Kernel Selector for SpMV Optimization in Graph Analysis

G Xiao, T Zhou, Y Chen, Y Hu, K Li - ACM Transactions on Parallel …, 2024 - dl.acm.org
Sparse Matrix and Vector multiplication (SpMV) is one of the core algorithms in various large-scale
scientific computing and real-world applications. With the rapid development of AI and …

APPQ-CNN: An Adaptive CNNs Inference Accelerator for Synergistically Exploiting Pruning and Quantization Based on FPGA

X Zhang, G Xiao, M Duan, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are widely utilized in intelligent edge computing
applications such as computational vision and image processing. However, as the number …

DTSpMV: An Adaptive SpMV Framework for Graph Analysis on GPUs

G Xiao, T Zhou, Y Chen, Y Hu… - 2022 IEEE 24th Int Conf on …, 2022 - ieeexplore.ieee.org
Sparse Matrix and Vector multiplication (SpMV) is one of the core algorithms in various large-scale
scientific computing and real-world applications. With the rapid development of AI and …