The landscape of parallel computing research: A view from Berkeley K Asanovic, R Bodik, BC Catanzaro, JJ Gebis, P Husbands, K Keutzer, ... eScholarship, University of California 1, 1, 2006 | 3133 | 2006 |
Roofline: an insightful visual performance model for multicore architectures S Williams, A Waterman, D Patterson Communications of the ACM 52 (4), 65-76, 2009 | 3052 | 2009 |
Optimization of sparse matrix-vector multiplication on emerging multicore platforms S Williams, L Oliker, R Vuduc, J Shalf, K Yelick, J Demmel Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 1-12, 2007 | 1065 | 2007 |
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures K Datta, M Murphy, V Volkov, S Williams, J Carter, L Oliker, D Patterson, ... SC'08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 1-12, 2008 | 836 | 2008 |
The potential of the Cell processor for scientific computing S Williams, J Shalf, L Oliker, S Kamil, P Husbands, K Yelick Proceedings of the 3rd Conference on Computing Frontiers, 9-20, 2006 | 492 | 2006 |
AMReX: a framework for block-structured adaptive mesh refinement W Zhang, A Almgren, V Beckner, J Bell, J Blaschke, C Chan, M Day, ... The Journal of Open Source Software 4 (37), 1370, 2019 | 370 | 2019 |
Optimization and performance modeling of stencil computations on modern microprocessors K Datta, S Kamil, S Williams, L Oliker, J Shalf, K Yelick SIAM Review 51 (1), 129-159, 2009 | 321 | 2009 |
An auto-tuning framework for parallel multicore stencil computations S Kamil, C Chan, L Oliker, J Shalf, S Williams 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 296 | 2010 |
Implicit and explicit optimizations for stencil computations S Kamil, K Datta, S Williams, L Oliker, J Shalf, K Yelick Proceedings of the 2006 Workshop on Memory System Performance and …, 2006 | 200 | 2006 |
An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling P Ghysels, XS Li, FH Rouet, S Williams, A Napov SIAM Journal on Scientific Computing 38 (5), S358-S384, 2016 | 179 | 2016 |
Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication A Buluç, S Williams, L Oliker, J Demmel 2011 IEEE International Parallel & Distributed Processing Symposium, 721-733, 2011 | 169 | 2011 |
Lattice Boltzmann simulation optimization on leading multicore platforms S Williams, J Carter, L Oliker, J Shalf, K Yelick 2008 IEEE International Symposium on Parallel and Distributed Processing, 1-14, 2008 | 149 | 2008 |
Auto-tuning performance on multicore computers SW Williams University of California, Berkeley, 2008 | 144 | 2008 |
Roofline model toolkit: A practical tool for architectural and program analysis YJ Lo, S Williams, B Van Straalen, TJ Ligocki, MJ Cordery, NJ Wright, ... High Performance Computing Systems. Performance Modeling, Benchmarking, and …, 2015 | 139 | 2015 |
Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication A Azad, G Ballard, A Buluç, J Demmel, L Grigori, O Schwartz, S Toledo, ... SIAM Journal on Scientific Computing 38 (6), C624-C651, 2016 | 137 | 2016 |
Scientific computing kernels on the Cell processor S Williams, J Shalf, L Oliker, S Kamil, P Husbands, K Yelick International Journal of Parallel Programming 35 (3), 263-298, 2007 | 137 | 2007 |
Optimizing sparse matrix-multiple vectors multiplication for nuclear configuration interaction calculations HM Aktulga, A Buluç, S Williams, C Yang 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 118 | 2014 |
Optimization of geometric multigrid for emerging multi-and manycore processors S Williams, DD Kalamkar, A Singh, AM Deshpande, B Van Straalen, ... SC'12: Proceedings of the International Conference on High Performance …, 2012 | 93 | 2012 |
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures A Chandramowlishwaran, S Williams, L Oliker, I Lashuk, G Biros, R Vuduc 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 90 | 2010 |
Applying the roofline performance model to the Intel Xeon Phi Knights Landing processor D Doerfler, J Deslippe, S Williams, L Oliker, B Cook, T Kurth, M Lobet, ... High Performance Computing: ISC High Performance 2016 International …, 2016 | 83 | 2016 |