Parallel programming models for heterogeneous many-cores: a comprehensive survey

J Fang, C Huang, T Tang, Z Wang - CCF Transactions on High …, 2020 - Springer
Heterogeneous many-cores are now an integral part of modern computing systems ranging
from embedded systems to supercomputers. While heterogeneous many-core design offers …

AnyDSL: A partial evaluation framework for programming high-performance libraries

R Leißa, K Boesche, S Hack, A Pérard-Gayot… - Proceedings of the …, 2018 - dl.acm.org
This paper advocates programming high-performance code using partial evaluation. We
present a clean-slate programming system with a simple, annotation-based, online partial …

Partial control-flow linearization

S Moll, S Hack - ACM SIGPLAN Notices, 2018 - dl.acm.org
If-conversion is a fundamental technique for vectorization. It accounts for the fact that in a
SIMD program, several targets of a branch might be executed because of divergence …

Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows

V Kandiah, D Lustig, O Villa, D Nellans… - Proceedings of the 21st …, 2023 - dl.acm.org
Achieving peak throughput on modern CPUs requires maximizing the use of single-
instruction, multiple-data (SIMD) or vector compute units. Single-program, multiple-data …

Harnessing parallelism in multi/many-cores with streams and parallel patterns

M Torquati - 2019 - tesidottorato.depositolegale.it
Multi-core computing systems are becoming increasingly parallel and heterogeneous.
Parallelism exploitation is today the primary instrument for improving application …

WCCV: Improving the vectorization of IF-statements with warp-coherent conditions

H Sun, F Fey, J Zhao, S Gorlatch - Proceedings of the ACM International …, 2019 - dl.acm.org
When vectorizing programs for modern processors with SIMD extensions, IF-statements
pose a challenge: existing vectorization approaches often introduce redundant …

An auto-programming approach to Vulkan

V Frolov, V Sanzharov, V Galaktionov… - Proceedings of the …, 2021 - researchgate.net
We propose a novel high-level approach for software development on GPU using the Vulkan
API. Our goal is to speed up development and performance studies for complex algorithms …

Multi-dimensional Vectorization in LLVM

S Moll, S Sharma, M Kurtenacker, S Hack - Proceedings of the 5th …, 2019 - dl.acm.org
Loop vectorization is a classic technique to exploit SIMD instructions in a productive way. In
multi-dimensional vectorization, multiple loops of a loop nest are vectorized at once. This …

A data layout transformation for vectorizing compilers

A Pérard-Gayot, R Membarth, P Slusallek… - Proceedings of the …, 2018 - dl.acm.org
Modern processors are often equipped with vector instruction sets. Such instructions operate
on multiple elements of data at once, and greatly improve performance for specific …

Refactoring loops with nested IFs for SIMD extensions without masked instructions

H Sun, S Gorlatch, R Zhao - … Euro-Par 2018 International Workshops, Turin …, 2019 - Springer
Most CPUs in heterogeneous systems are now equipped with SIMD (Single Instruction
Multiple Data) extensions that operate on short vectors in parallel to enable high …