Emerging edge computing platforms often contain machine learning (ML) accelerators that can accelerate inference for a wide range of neural network (NN) models. These models are …
IB Peng, MB Gokhale, EW Green - Proceedings of the International …, 2019 - dl.acm.org
Byte-addressable non-volatile memory (NVM) features high density, DRAM-comparable performance, and persistence. These characteristics position NVM as a promising new tier …
Energy consumption is one of the top challenges for achieving the next generation of supercomputing. Codesign of hardware and software is critical for improving energy …
For many decades, progress in computing hardware has been closely associated with CMOS logic density, performance, and cost. As such, slowdown in 2-D scaling, frequency …
G Ofenbeck, R Steinmann, V Caparros… - … Analysis of Systems …, 2014 - ieeexplore.ieee.org
The recently introduced roofline model plots the performance of executed code against its operational intensity (operations count divided by memory traffic). It also includes two …
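As a rough illustration of the operational-intensity definition in the snippet above, the following Python sketch computes operations per byte of memory traffic. It also assumes the commonly used roofline bound, in which attainable performance is the lesser of peak compute throughput and peak memory bandwidth times operational intensity. The function names and the sample numbers are hypothetical and are not taken from the cited paper.

```python
def operational_intensity(operations: float, bytes_moved: float) -> float:
    """Operations count divided by memory traffic (ops per byte)."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return operations / bytes_moved


def attainable_performance(intensity: float,
                           peak_ops_per_s: float,
                           peak_bytes_per_s: float) -> float:
    """Assumed roofline bound: min(compute ceiling, bandwidth * intensity)."""
    return min(peak_ops_per_s, peak_bytes_per_s * intensity)


if __name__ == "__main__":
    # Hypothetical kernel: 2e9 operations over 1e9 bytes of memory traffic.
    oi = operational_intensity(2e9, 1e9)
    # Hypothetical machine: 100 Gops/s peak compute, 10 GB/s peak bandwidth.
    bound = attainable_performance(oi, 100e9, 10e9)
    print(f"operational intensity = {oi} ops/byte")
    print(f"attainable performance = {bound / 1e9} Gops/s")
```

With these made-up numbers the kernel sits on the bandwidth-limited side of the model: raising its operational intensity would raise the bound until the compute ceiling is reached.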
YJ Lo, S Williams, B Van Straalen, TJ Ligocki… - … , and Simulation: 5th …, 2015 - Springer
We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization …
S Beamer, K Asanović… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Reducing communication is an important objective, as it can save energy or improve the performance of a communication-bound application. The graph algorithm PageRank …
M Burtscher, I Zecena, Z Zong - Proceedings of Workshop on General …, 2014 - dl.acm.org
GPU-accelerated programs are becoming increasingly common in HPC, personal computers, and even handheld devices, making it important to optimize their energy …