Effective extensible programming: unleashing Julia on GPUs

T Besard, C Foket, B De Sutter - IEEE Transactions on Parallel …, 2018 - ieeexplore.ieee.org
GPUs and other accelerators are popular devices for accelerating compute-intensive,
parallelizable applications. However, programming these devices is a difficult task. Writing …

Generating custom code for efficient query execution on heterogeneous processors

S Breß, B Köcher, H Funke, S Zeuch, T Rabl, V Markl - The VLDB Journal, 2018 - Springer
Processor manufacturers build increasingly specialized processors to mitigate the effects of
the power wall in order to deliver improved performance. Currently, database engines have …

Navigating the landscape for real-time localization and mapping for robotics and virtual and augmented reality

S Saeedi, B Bodin, H Wagstaff, A Nisbet… - Proceedings of the …, 2018 - ieeexplore.ieee.org
Visual understanding of 3-D environments in real time, at low power, is a huge
computational challenge. Often referred to as simultaneous localization and mapping …

A performance portability framework for Python

N Al Awar, S Zhu, G Biros, M Gligoric - Proceedings of the 35th ACM …, 2021 - dl.acm.org
Kokkos is a programming model for writing performance portable applications for all major
high performance computing platforms. It provides abstractions for data management and …

Exploiting high-performance heterogeneous hardware for Java programs using Graal

J Clarkson, J Fumero, M Papadimitriou… - Proceedings of the 15th …, 2018 - dl.acm.org
The proliferation of heterogeneous hardware in recent years means that every system we
program is likely to include a mix of compute elements; each with different characteristics. By …

Heterogeneous managed runtime systems: a computer vision case study

C Kotselidis, J Clarkson, A Rodchenko… - Proceedings of the 13th …, 2017 - dl.acm.org
Real-time 3D space understanding is becoming prevalent across a wide range of
applications and hardware platforms. To meet the desired Quality of Service (QoS) …

Python programmers have GPUs too: automatic Python loop parallelization with staged dependence analysis

D Jacob, P Trinder, J Singer - Proceedings of the 15th ACM SIGPLAN …, 2019 - dl.acm.org
Python is a popular language for end-user software development in many application
domains. End-users want to harness parallel compute resources effectively, by exploiting …

ALPyNA: acceleration of loops in Python for novel architectures

D Jacob, J Singer - Proceedings of the 6th ACM SIGPLAN International …, 2019 - dl.acm.org
We present ALPyNA, an automatic loop parallelization framework for Python, which
analyzes data dependences within nested loops and dynamically generates CUDA kernels …

Towards practical heterogeneous virtual machines

J Clarkson, J Fumero, M Papadimitriou… - … Proceedings of the 2nd …, 2018 - dl.acm.org
Heterogeneous computing has emerged as a means to achieve high performance and
energy efficiency. Naturally, this trend has been accompanied by changes in software …

High-level GPU programming in Julia

T Besard, P Verstraete, B De Sutter - arXiv preprint arXiv:1604.03410, 2016 - arxiv.org
GPUs are popular devices for accelerating scientific calculations. However, as GPU code is
usually written in low-level languages, it breaks the abstractions of high-level languages …