OpenMP device offloading to FPGA accelerators

L Sommer, J Korinth, A Koch - 2017 IEEE 28th International …, 2017 - ieeexplore.ieee.org
Future high-performance computing systems will need to include multiple specialized
accelerators in a single heterogeneous system to overcome power-density limitations of …

From software threads to parallel hardware in high-level synthesis for FPGAs

J Choi, S Brown, J Anderson - 2013 International Conference …, 2013 - ieeexplore.ieee.org
We describe the support within high-level hardware synthesis (HLS) for two standard
software parallelization paradigms: Pthreads and OpenMP. Parallel code segments, as …

Interplay of loop unrolling and multidimensional memory partitioning in HLS

A Cilardo, L Gallo - 2015 Design, Automation & Test in Europe …, 2015 - ieeexplore.ieee.org
This paper deals with memory partitioning in the context of high-level synthesis for FPGA
technologies. In particular, the work focuses on the area overhead caused by partitioning …

OpenMP on FPGAs—a survey

F Mayer, M Knaust, M Philippsen - … New Zealand, September 11–13, 2019 …, 2019 - Springer
Due to the ubiquity of OpenMP and the rise of FPGA-based accelerators in the HPC world,
several research groups have attempted to bring the two together by building OpenMP-to …

Design space exploration for high-level synthesis of multi-threaded applications

A Cilardo, L Gallo, N Mazzocca - Journal of Systems Architecture, 2013 - Elsevier
We present an ESL methodology creating a direct path from high-level multi-threaded
OpenMP applications to automatically synthesized, heterogeneous hardware/software …

Automated synthesis of FPGA-based heterogeneous interconnect topologies

A Cilardo, E Fusella, L Gallo… - 2013 23rd International …, 2013 - ieeexplore.ieee.org
The choice of the communication topology in many systems is of vital importance because it
affects the entire inter-component data traffic and impacts significantly the overall system …

OpenMP device offloading to FPGAs using the Nymble infrastructure

J Huthmann, L Sommer, A Podobas, A Koch… - OpenMP: Portable Multi …, 2020 - Springer
Next to GPUs, FPGAs are an attractive target for OpenMP device offloading, as they
allow to implement highly efficient, application-specific accelerators. However, prior …

Joint communication scheduling and interconnect synthesis for FPGA-based many-core systems

A Cilardo, E Fusella, L Gallo… - 2014 Design, Automation …, 2014 - ieeexplore.ieee.org
This work proposes an automated methodology for optimizing FPGA-based many-core
interconnect architectures. Based on the application communication requirements, the …

Area implications of memory partitioning for high-level synthesis on FPGAs

L Gallo, A Cilardo, D Thomas, S Bayliss… - … Conference on Field …, 2014 - ieeexplore.ieee.org
FPGAs normally have numerous independent memory banks that can be accessed
simultaneously, potentially offering a very large memory bandwidth. Adopting a suitable …

Hardware synthesis of weakly consistent C concurrency

N Ramanathan, ST Fleming, J Wickerson… - Proceedings of the …, 2017 - dl.acm.org
Lock-free algorithms, in which threads synchronise not via coarse-grained mutual exclusion
but via fine-grained atomic operations ('atomics'), have been shown empirically to be the …