Explicit data layout management for autotuning exploration on complex memory topologies

S Perarnau, B Videau, N Denoyelle… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
2019 IEEE/ACM Workshop on Memory Centric High Performance …, 2019
The memory topology of high-performance computing platforms is becoming more complex. Future exascale platforms in particular are expected to feature multiple types of memory technologies and multiple accelerator devices per compute node. In this paper, we discuss the explicit management of data layout in memory, across memory nodes and devices, for performance exploration purposes. Indeed, many classic optimization techniques rely on reshaping or tiling input data in specific ways to achieve peak efficiency on a given architecture. With autotuning of a linear algebra code as the end goal, we present AML, a framework that treats three memory management abstractions as first-class citizens: data layout in memory, tiling of data for parallelism, and data movement across memory types. By providing access to these abstractions as part of the performance exploration design space, our framework eases the design and validation of complex, efficient algorithms for heterogeneous platforms. Using the Intel Knights Landing architecture in one of its most heavily NUMA configurations as a proxy platform, we showcase our framework by exploring tiling and prefetching schemes for a DGEMM algorithm.
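
As a rough illustration of what "tiling" means for a matrix multiplication such as DGEMM, the sketch below walks the matrices block by block, with the tile size exposed as a tunable parameter. It is a generic textbook-style example in plain Python and does not use or reproduce the AML API. The names tiled_matmul, naive_matmul, and tile are hypothetical and chosen only for this illustration.

```python
def naive_matmul(a, b):
    """Reference C = A * B using the plain triple loop."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def tiled_matmul(a, b, tile=2):
    """Compute C = A * B by iterating over tile x tile blocks.

    `tile` is the kind of parameter an autotuner could explore.
    (Hypothetical helper, not part of AML.)
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops select one block of C and one block along the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Inner loops accumulate that block's contribution; min()
                # handles matrices whose size is not a multiple of `tile`.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

The result does not depend on the tile size. Only the order in which elements are visited changes, which is why tile size can be treated as a pure performance parameter.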