Pushing the level of abstraction of digital system design: A survey on how to program FPGAs

ED Sozzo, D Conficconi, A Zeni, M Salaris… - ACM Computing …, 2022 - dl.acm.org
Field Programmable Gate Arrays (FPGAs) are spatial architectures with a heterogeneous
reconfigurable fabric. They are state-of-the-art for prototyping, telecommunications …

FPGA acceleration for big data analytics: Challenges and opportunities

J Hoozemans, J Peltenburg… - IEEE Circuits and …, 2021 - ieeexplore.ieee.org
The big data revolution has ushered an era with ever increasing volumes and complexity of
data requiring ever faster computational analysis. During this very same era, CPU …

Stateful dataflow multigraphs: A data-centric model for performance portability on heterogeneous architectures

T Ben-Nun, J de Fine Licht, AN Ziogas… - Proceedings of the …, 2019 - dl.acm.org
The ubiquity of accelerators in high-performance computing has driven programming
complexity beyond the skill-set of the average domain scientist. To maintain performance …

Exocompilation for productive programming of hardware accelerators

Y Ikarashi, GL Bernstein, A Reinking, H Genc… - Proceedings of the 43rd …, 2022 - dl.acm.org
High-performance kernel libraries are critical to exploiting accelerators and specialized
instructions in many applications. Because compilers are difficult to extend to support …

autoAx: An automatic design space exploration and circuit building methodology utilizing libraries of approximate components

V Mrazek, MA Hanif, Z Vasicek, L Sekanina… - Proceedings of the 56th …, 2019 - dl.acm.org
Approximate computing is an emerging paradigm for developing highly energy-efficient
computing systems such as various accelerators. In the literature, many libraries of …

MnnFast: A fast and scalable system architecture for memory-augmented neural networks

H Jang, J Kim, JE Jo, J Lee, J Kim - Proceedings of the 46th International …, 2019 - dl.acm.org
Memory-augmented neural networks are getting more attention from many researchers as
they can make an inference with the previous history stored in memory. Especially, among …

Modular, compositional, and executable formal semantics for LLVM IR

Y Zakowski, C Beck, I Yoon, I Zaichuk, V Zaliva… - Proceedings of the …, 2021 - dl.acm.org
This paper presents a novel formal semantics, mechanized in Coq, for a large, sequential
subset of the LLVM IR. In contrast to previous approaches, which use relationally-specified …

Impress: Large integer multiplication expression rewriting for FPGA HLS

E Ustun, I San, J Yin, C Yu… - 2022 IEEE 30th Annual …, 2022 - ieeexplore.ieee.org
Large integer multiplication is becoming a major challenge for FPGA-based acceleration of
many cryptographic applications. Existing techniques for decomposing and optimizing large …

Workflows are the new applications: Challenges in performance, portability, and productivity

T Ben-Nun, T Gamblin, DS Hollman… - 2020 IEEE/ACM …, 2020 - ieeexplore.ieee.org
Great strides have been made to enable performance, portability, and productivity in HPC,
but the focus has so far been on standalone applications and on-node programming …

Incremental flattening for nested data parallelism

T Henriksen, F Thorøe, M Elsman… - Proceedings of the 24th …, 2019 - dl.acm.org
Compilation techniques for nested-parallel applications that can adapt to hardware and
dataset characteristics are vital for unlocking the power of modern hardware. This paper …