Compilers and performance engineers use hardware performance models to simplify program optimizations. Performance models provide a necessary abstraction over complex …
Basic-block throughput models such as uiCA, IACA, GRANITE, Ithemal, llvm-mca, OSACA, or CQA guide optimizing compilers and help performance engineers identify and eliminate …
S Tan, Q Jiang, Z Cao, X Hao, J Chen, H An - CCF Transactions on High …, 2024 - Springer
The performance of high-performance computing (HPC) and other real-world applications is becoming unpredictable as the micro-architecture of the modern central processing unit …
Sparse computations, such as sparse matrix-dense vector multiplication, are notoriously hard to optimize due to their irregularity and memory-boundedness. Solutions to improve the …
In this paper we evaluate the efficacy of the Arm Scalable Vector Extension (SVE) instruction set for HPC workloads using a set of established mini-apps. Exploiting the vector capabilities …
Compiler optimization passes employ cost models to determine if a code transformation will yield performance improvements. When this assessment is inaccurate, compilers apply …
First, we present goSLP, a framework that uses integer linear programming to find a globally pairwise-optimal statement packing strategy to achieve superior vectorization performance …
A Poenaru - 2022 - research-information.bris.ac.uk
Recent generations of general-purpose central processing units (CPUs) for the high-performance segment have had to adopt new approaches in order to deliver increasing …
Energy optimization is an increasingly important aspect of today's high-performance computing applications. In particular, dynamic voltage and frequency scaling (DVFS) has …