[BOOK][B] The compiler design handbook: optimizations and machine code generation

YN Srikant, P Shankar - 2002 - taylorfrancis.com
The widespread use of object-oriented languages and Internet security concerns are just the
beginning. Add embedded systems, multiple memory banks, highly pipelined units …

Automatic data layout for High Performance Fortran

K Kennedy, U Kremer - Proceedings of the 1995 ACM/IEEE conference …, 1995 - dl.acm.org
High Performance Fortran (HPF) is rapidly gaining acceptance as a language for parallel
programming. The goal of HPF is to provide a simple yet efficient machine independent …

Automatic data layout for distributed-memory machines

K Kennedy, U Kremer - ACM Transactions on Programming Languages …, 1998 - dl.acm.org
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a
simple yet efficient machine-independent parallel programming model. After the algorithm …

A linear algebra framework for automatic determination of optimal data layouts

M Kandemir, A Choudhary, N Shenoy… - … on Parallel and …, 1999 - ieeexplore.ieee.org
This paper presents a data layout optimization technique for sequential and parallel
programs based on the theory of hyperplanes from linear algebra. Given a program, our …

[BOOK][B] Optimization within a unified transformation framework

WA Kelly - 1996 - search.proquest.com
Programmers typically want to write scientific programs in a high level language with
semantics based on a sequential execution model. To execute efficiently on a parallel …

[BOOK][B] Automatic data layout for distributed memory machines

UJ Kremer - 1996 - search.proquest.com
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a
simple yet efficient machine-independent parallel programming model. Besides the …

Optimizing data layouts for parallel computation on multicores

Y Zhang, W Ding, J Liu… - … Conference on Parallel …, 2011 - ieeexplore.ieee.org
The emergence of multicore platforms offers several opportunities for boosting application
performance. These opportunities, which include parallelism and data locality benefits …

Effective automatic computation placement and data allocation for parallelization of regular programs

C Reddy, U Bondhugula - Proceedings of the 28th ACM international …, 2014 - dl.acm.org
This paper proposes techniques for data allocation and computation mapping when
compiling affine loop nest sequences for distributed-memory clusters. Techniques for …

[BOOK][B] Automatic computation and data decomposition for multiprocessors

JAM Anderson - 1997 - search.proquest.com
Memory subsystem efficiency is critical to achieving high performance on parallel machines.
The memory subsystem organization of modern multiprocessor architectures makes their …

A data layout optimization framework for NUCA-based multicores

Y Zhang, W Ding, M Kandemir, J Liu… - Proceedings of the 44th …, 2011 - dl.acm.org
Future multicore architectures are likely to include a large number of cores connected using
an on-chip network with Non-uniform Cache Access (NUCA). In such architectures, whether …