AnyDSL: A partial evaluation framework for programming high-performance libraries

R Leißa, K Boesche, S Hack, A Pérard-Gayot… - Proceedings of the …, 2018 - dl.acm.org
This paper advocates programming high-performance code using partial evaluation. We
present a clean-slate programming system with a simple, annotation-based, online partial …

Dynamic application reconfiguration on heterogeneous hardware

J Fumero, M Papadimitriou, FS Zakkak… - Proceedings of the 15th …, 2019 - dl.acm.org
By utilizing diverse heterogeneous hardware resources, developers can significantly
improve the performance of their applications. Currently, in order to determine which parts of …

Exploiting high-performance heterogeneous hardware for Java programs using Graal

J Clarkson, J Fumero, M Papadimitriou… - Proceedings of the 15th …, 2018 - dl.acm.org
The proliferation of heterogeneous hardware in recent years means that every system we
program is likely to include a mix of compute elements; each with different characteristics. By …

VectorVisor: A Binary Translation Scheme for Throughput-Oriented GPU Acceleration

S Ginzburg, M Shahrad, MJ Freedman, Z Wen… - 2023 USENIX Annual …, 2023 - usenix.org

Python programmers have GPUs too: automatic Python loop parallelization with staged dependence analysis

D Jacob, P Trinder, J Singer - Proceedings of the 15th ACM SIGPLAN …, 2019 - dl.acm.org
Python is a popular language for end-user software development in many application
domains. End-users want to harness parallel compute resources effectively, by exploiting …

Transparent compiler and runtime specializations for accelerating managed languages on FPGAs

M Papadimitriou, J Fumero, A Stratikopoulos… - arXiv preprint arXiv …, 2020 - arxiv.org
In recent years, heterogeneous computing has emerged as the vital way to increase
computers' performance and energy efficiency by combining diverse hardware devices …

ScootR: Scaling R dataframes on dataflow systems

A Kunft, L Stadler, D Bonetta, C Basca… - Proceedings of the …, 2018 - dl.acm.org
To cope with today's large scale of data, parallel dataflow engines such as Hadoop, and
more recently Spark and Flink, have been proposed. They offer scalability and performance …

r3d3: Optimized query compilation on GPUs

A Krolik, C Verbrugge, L Hendren - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
Query compilation is an effective approach to improve the performance of repeated
database queries. GPU-based approaches have significant promise, but face difficulties in …

Modular array-based GPU computing in a dynamically-typed language

M Springer, P Wauligmann, H Masuhara - Proceedings of the 4th ACM …, 2017 - dl.acm.org
Nowadays, GPU accelerators are widely used in areas with large data-parallel computations
such as scientific computations or neural networks. Programmers can either write code in …

Enabling pipeline parallelism in heterogeneous managed runtime environments via batch processing

F Blanaru, A Stratikopoulos, J Fumero… - Proceedings of the 18th …, 2022 - dl.acm.org
During the last decade, managed runtime systems have been constantly evolving to become
capable of exploiting underlying hardware accelerators, such as GPUs and FPGAs …