Pattern-based sparse matrix representation for memory-efficient SMVM kernels

MM Strout, M Hall, C Olschanowsky - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org

Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …

Cited by 80 - Related articles - All 3 versions

[PDF] ethz.ch

SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, JG Luna, N Koziris… - Proceedings of the …, 2022 - dl.acm.org

Several manufacturers have already started to commercialize near-bank Processing-In-Memory
(PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

Cited by 43 - Related articles - All 3 versions

[PDF] arxiv.org

Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, J Gómez-Luna… - ACM SIGMETRICS …, 2022 - dl.acm.org

Several manufacturers have already started to commercialize near-bank Processing-In-Memory
(PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

Cited by 43 - Related articles - All 10 versions

[PDF] arxiv.org

SMASH: Co-designing software compression and hardware-accelerated indexing for efficient sparse matrix operations

K Kanellopoulos, N Vijaykumar, C Giannoula… - Proceedings of the …, 2019 - dl.acm.org

Important workloads, such as machine learning and graph analytics applications, heavily
involve sparse linear algebra operations. These operations use sparse matrix compression …

Cited by 96 - Related articles - All 6 versions

[PDF] researchgate.net

Evaluation criteria for sparse matrix storage formats

D Langr, P Tvrdik - IEEE Transactions on parallel and …, 2015 - ieeexplore.ieee.org

When authors present new storage formats for sparse matrices, they usually focus mainly on
a single evaluation criterion, which is the performance of sparse matrix-vector multiplication …

Cited by 148 - Related articles - All 5 versions

Performance optimization using partitioned SpMV on GPUs and multicore CPUs

W Yang, K Li, Z Mo, K Li - IEEE Transactions on Computers, 2014 - ieeexplore.ieee.org

This paper presents a sparse matrix partitioning strategy to improve the performance of
SpMV on GPUs and multicore CPUs. This method has wide adaptability for different types of …

Cited by 140 - Related articles - All 7 versions

[PDF] arxiv.org

Accelerating framework of transformer by hardware design and model compression co-optimization

P Qi, EHM Sha, Q Zhuge, H Peng… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org

State-of-the-art Transformer-based models, with gigantic parameters, are difficult to be
accommodated on resource constrained embedded devices. Moreover, with the …

Cited by 37 - Related articles - All 4 versions

[PDF] escholarship.org

Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication

A Buluç, S Williams, L Oliker… - 2011 IEEE International …, 2011 - ieeexplore.ieee.org

On multicore architectures, the ratio of peak memory bandwidth to peak floating-point
performance (byte:flop ratio) is decreasing as core counts increase, further limiting the …

Cited by 167 - Related articles - All 13 versions

[PDF] acm.org

Loop and data transformations for sparse matrix code

A Venkat, M Hall, M Strout - ACM SIGPLAN Notices, 2015 - dl.acm.org

This paper introduces three new compiler transformations for representing and transforming
sparse matrix computations and their data representations. In cooperation with run-time …

Cited by 121 - Related articles - All 7 versions

[PDF] ntua.gr

CSX: an extended compression format for SpMV on shared memory systems

K Kourtis, V Karakasis, G Goumas, N Koziris - ACM SIGPLAN Notices, 2011 - dl.acm.org

The Sparse Matrix-Vector multiplication (SpMV) kernel scales poorly on shared memory
systems with multiple processing units due to the streaming nature of its data access pattern …

Cited by 124 - Related articles - All 47 versions
