Compiler optimizations for eliminating barrier synchronization

CW Tseng - ACM SIGPLAN Notices, 1995 - dl.acm.org
This paper presents novel compiler optimizations for reducing synchronization overhead in
compiler-parallelized scientific codes. A hybrid programming model is employed to combine …

[PDF] A linear algebra framework for static HPF code distribution

C Ancourt, C Fran, IR Keryell - A; a, 1993 - researchgate.net
High Performance Fortran (HPF) was developed to support data parallel
programming for SIMD and MIMD machines with distributed memory. The programmer is …

PIPS is not (just) polyhedral software: adding GPU code generation in PIPS

M Amini, C Ancourt, F Coelho… - … (IMPACT 2011) in …, 2011 - minesparis-psl.hal.science
Parallel and heterogeneous computing are growing in audience thanks to the increased
performance brought by ubiquitous manycores and GPUs. However, available programming …

A linear algebra framework for static High Performance Fortran code distribution

C Ancourt, F Coelho, F Irigoin… - Scientific …, 1997 - Wiley Online Library
High Performance Fortran (HPF) was developed to support data parallel programming for
single‐instruction multiple‐data (SIMD) and multiple‐instruction multiple‐data (MIMD) …

On the parallel execution time of tiled loops

K Hogstedt, L Carter, J Ferrante - IEEE Transactions on Parallel …, 2003 - ieeexplore.ieee.org
Many computationally-intensive programs, such as those for differential equations, spatial
interpolation, and dynamic programming, spend a large portion of their execution time in …

A compiler and runtime infrastructure for automatic program distribution

RE Diaconescu, L Wang, Z Mouri… - 19th IEEE International …, 2005 - ieeexplore.ieee.org
This paper presents the design and the implementation of a compiler and runtime
infrastructure for automatic program distribution. We are building a research infrastructure …

Concerto: a program parallelization, orchestration and distribution infrastructure

S Kalyur, GS Nagaraja - 2017 2nd international conference on …, 2017 - ieeexplore.ieee.org
The important step in program parallelization is identifying the pieces of the given program
that can be run concurrently on separate processing elements. The parallel pieces once …

Polyèdres et compilation

F Irigoin, M Amini, C Ancourt… - Rencontres …, 2011 - minesparis-psl.hal.science
The first use of polyhedra to solve a compilation problem, the automatic
parallelization of loops in the presence of procedure calls, was described and …

[PDF] Automatic distribution of Java byte-code based on dependence analysis

RE Diaconescu, L Wang, M Franz - … Report Technical Report No. 03-18, 2003 - Citeseer
One way to relieve resources when executing a program on constrained devices is to
migrate parts of it to other machines in a distributed system. Ideally, a system can …

Compiler and System Techniques for SoC Distributed Reconfigurable Accelerators

J Cambonie, S Guérin, R Keryell, L Lagadec… - … Workshop on Embedded …, 2004 - Springer
To answer new challenges, systems on chip need to gain flexibility and FPGAs need to gain
structure. We propose a general framework for SoC architectures and software tools in …