A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction

A Leung, N Vasilache, B Meister, M Baskaran… - Proceedings of the 3rd …, 2010 - dl.acm.org
Programmers for GPGPU face rapidly changing substrate of programming abstractions,
execution models, and hardware implementations. It has been established, through …

Polyhedral process networks

S Verdoolaege - Handbook of Signal Processing Systems, 2013 - Springer
Reference implementations of signal processing applications are often written in a
sequential language that does not reveal the available parallelism in the application …

Integer set library: Manual

S Verdoolaege - Tech. Rep., 2011 - compsys-tools.ens-lyon.fr
isl is a thread-safe C library for manipulating sets and relations of integer
points bounded by affine constraints. The descriptions of the sets and relations may involve …

Efficient hierarchical online-autotuning: a case study on polyhedral accelerator mapping

P Pfaffe, T Grosser, M Tillmann - … of the ACM International Conference on …, 2019 - dl.acm.org
Identifying the (near) optimal program variants an optimizing and parallelizing compiler
should generate is known to be difficult. Autotuning is the best solution to navigate the often …

Positivity, posynomials and tile size selection

L Renganarayana… - SC'08: Proceedings of the …, 2008 - ieeexplore.ieee.org
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data
locality. Effective use of tiling requires selection and tuning of the tile sizes. This is usually …

Productivity via automatic code generation for PGAS platforms with the R-Stream compiler

B Meister, A Leung, N Vasilache… - … on Asynchrony in the …, 2009 - researchgate.net
Emerging computing architectures present concurrent, heterogeneous, and hierarchical
organizations. Explicit management of distributed memories, bulk communications, and the …

Trading off memory for parallelism quality

N Vasilache, B Meister, A Hartono, M Baskaran… - … Workshop on Polyhedral …, 2012 - Citeseer
We detail an algorithm implemented in the R-Stream compiler to perform controlled array
expansion and conversion to partial single-assignment form, which consists of (1) allowing …

Automatic loop tiling for direct memory access

H Lin, T Liu, L Renganarayana, H Li… - … Parallel & Distributed …, 2011 - ieeexplore.ieee.org
In heterogeneous multi-core systems, such as the Cell BE processor, each accelerator core
has its own fast local memory without hardware supported coherence and the software is …

Memory reuse optimizations in the R-Stream compiler

N Vasilache, M Baskaran, B Meister… - Proceedings of the 6th …, 2013 - dl.acm.org
We propose a new set of automated techniques to optimize memory reuse in programs with
explicitly managed memory. Our techniques are inspired by hand-tuned seismic kernels on …

Systems and methods for energy proportional scheduling

MM Baskaran, T Henretty, A Johnson… - US Patent …, 2022 - Google Patents
A compilation system using an energy model based on a set of generic and practical
hardware and software parameters is presented. The model can represent the major trends …