A survey of accelerating parallel sparse linear algebra

G Xiao, C Yin, T Zhou, X Li, Y Chen, K Li - ACM Computing Surveys, 2023 - dl.acm.org
Sparse linear algebra includes the fundamental and important operations in various
large-scale scientific computing and real-world applications. There exists performance bottleneck …

A high-performance sparse tensor algebra compiler in multi-level IR

R Tian, L Guo, J Li, B Ren, G Kestor - arXiv preprint arXiv:2102.05187, 2021 - arxiv.org
Tensor algebra is widely used in many applications, such as scientific computing, machine
learning, and data analytics. The tensors represented real-world data are usually large and …

fgSpMSpV: A fine-grained parallel SpMSpV framework on HPC platforms

Y Chen, G Xiao, K Li, F Piccialli… - ACM Transactions on …, 2022 - dl.acm.org
Sparse matrix-sparse vector (SpMSpV) multiplication is one of the fundamental and
important operations in many high-performance scientific and engineering applications. The …

A malware propagation prediction model based on representation learning and graph convolutional networks

T Li, Y Liu, Q Liu, W Xu, Y Xiao, H Liu - Digital Communications and …, 2023 - Elsevier
The traditional malware research is mainly based on its recognition and detection as a
breakthrough point, without focusing on its propagation trends or predicting the …

Redesign and Accelerate the AIREBO Bond-Order Potential on the New Sunway Supercomputer

P Gao, X Duan, B Schmidt, W Wan… - … on Parallel and …, 2023 - ieeexplore.ieee.org
Molecular dynamics (MD) is one of the most crucial computer simulation methods for
understanding real-world processes at the atomic level. Reactive potentials based on the …

Efficient parallel secure outsourcing of modular exponentiation to cloud for IoT applications

Q Hu, M Duan, Z Yang, S Yu… - IEEE Internet of Things …, 2020 - ieeexplore.ieee.org
Modular exponentiation, an operation widely utilized in cryptographic protocols to transfer
text and other forms of data, can also be applied to Internet-of-Things (IoT) devices with high …

Low-complex resource mapping heuristics for mobile and IoT workloads on NoC-HMPSoC architecture

B Gomatheeshwari, K Gopi, A Mathias - Microprocessors and …, 2023 - Elsevier
Network-on-chip-based heterogeneous multiprocessor system-on-a chip (NoC-HMPSoC) a
single board computer is extensively utilized in many real-time applications such as mobile …

A heterogeneous parallel computing approach optimizing SpTTM on CPU-GPU via GCN

H Wang, W Yang, R Ouyang, R Hu, K Li… - ACM Transactions on …, 2023 - dl.acm.org
Sparse Tensor-Times-Matrix (SpTTM) is the core calculation in tensor analysis. The sparse
distributions of different tensors vary greatly, which poses a big challenge to designing …

Exploiting hierarchical parallelism and reusability in tensor kernel processing on heterogeneous HPC systems

Y Chen, G Xiao, MT Özsu, Z Tang… - 2022 IEEE 38th …, 2022 - ieeexplore.ieee.org
Canonical Polyadic Decomposition (CPD) of sparse tensors is an effective tool in various
machine learning and data analytics applications, in which sparse Matricized Tensor Times …

Analysis and optimization of dual parallel partition sorting with OpenMP

S Ketchaya, A Rattanatranurak - Applied Computing and Informatics, 2022 - emerald.com
Purpose: Sorting is a very important algorithm to solve problems in computer science. The
most well-known divide and conquer sorting algorithm is quicksort. It starts with dividing the …