Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect A Li, SL Song, J Chen, J Li, X Liu, N Tallent, K Barker https://ieeexplore.ieee.org/document/8763922, 2019 | 251 | 2019 |
SMAT: An input adaptive auto-tuner for sparse matrix-vector multiplication J Li, G Tan, M Chen, N Sun Proceedings of the 34th ACM SIGPLAN conference on Programming language …, 2013 | 172 | 2013 |
FROSTT: The formidable repository of open sparse tensors and tools S Smith, JW Choi, J Li, R Vuduc, J Park, X Liu, G Karypis | 161 | 2017 |
Bridging the gap between deep learning and sparse matrix format selection Y Zhao, J Li, C Liao, X Shen Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of …, 2018 | 126 | 2018 |
HiCOO: Hierarchical storage of sparse tensors J Li, J Sun, R Vuduc SC18: International Conference for High Performance Computing, Networking …, 2018 | 113 | 2018 |
An input-adaptive and in-place approach to dense tensor-times-matrix multiply J Li, C Battaglino, I Perros, J Sun, R Vuduc Proceedings of the International Conference for High Performance Computing …, 2015 | 85 | 2015 |
Understanding the GPU microarchitecture to achieve bare-metal performance tuning X Zhang, G Tan, S Xue, J Li, K Zhou, M Chen Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017 | 68 | 2017 |
Model-driven sparse CP decomposition for higher-order tensors J Li, J Choi, I Perros, J Sun, R Vuduc 2017 IEEE international parallel and distributed processing symposium (IPDPS …, 2017 | 67 | 2017 |
A pattern based algorithmic autotuner for graph processing on GPUs K Meng, J Li, G Tan, N Sun Proceedings of the 24th Symposium on Principles and Practice of Parallel …, 2019 | 55 | 2019 |
Load-Balanced Sparse MTTKRP on GPUs I Nisa, J Li, A Sukumaran-Rajam, R Vuduc, P Sadayappan https://ieeexplore.ieee.org/document/8821030, 2019 | 51 | 2019 |
Design and implementation of adaptive SpMV library for multicore and many-core architecture G Tan, J Liu, J Li ACM Transactions on Mathematical Software (TOMS) 44 (4), 1-25, 2018 | 46 | 2018 |
Optimizing sparse tensor times matrix on multi-core and many-core architectures J Li, Y Ma, C Yan, R Vuduc 2016 6th Workshop on Irregular Applications: Architecture and Algorithms …, 2016 | 45 | 2016 |
Efficient and effective sparse tensor reordering J Li, B Uçar, ÜV Çatalyürek, J Sun, K Barker, R Vuduc Proceedings of the ACM International Conference on Supercomputing, 227-237, 2019 | 41 | 2019 |
An efficient mixed-mode representation of sparse tensors I Nisa, J Li, A Sukumaran-Rajam, PS Rawat, S Krishnamoorthy, ... Proceedings of the International Conference for High Performance Computing …, 2019 | 39 | 2019 |
Optimizing sparse tensor times matrix on GPUs Y Ma, J Li, X Wu, C Yan, J Sun, R Vuduc Journal of Parallel and Distributed Computing 129, 99-109, 2019 | 37 | 2019 |
ParTI!: A Parallel Tensor Infrastructure for Multicore CPU and GPUs J Li, Y Ma, V Richard Available from https://github.com/hpcgarage/ParTI, 2018 | 28* | 2018 |
Sparta: High-performance, element-wise sparse tensor contraction on heterogeneous memory J Liu, J Ren, R Gioiosa, D Li, J Li Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021 | 27 | 2021 |
PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite J Li, Y Ma, X Wu, A Li, K Barker https://link.springer.com/article/10.1007/s42514-019-00012-w 1 (2), 111–130, 2019 | 25 | 2019 |
An initial characterization of the Emu Chick E Hein, T Conte, J Young, S Eswar, J Li, P Lavin, R Vuduc, J Riedy 2018 IEEE International Parallel and Distributed Processing Symposium …, 2018 | 24 | 2018 |
AlphaSparse: Generating high performance SpMV codes directly from sparse matrices Z Du, J Li, Y Wang, X Li, G Tan, N Sun SC22: International Conference for High Performance Computing, Networking …, 2022 | 23 | 2022 |