MIMD Programs Execution Support on SIMD Machines: A Holistic Survey

D Mustafa, R Alkhasawneh, F Obeidat… - IEEE Access, 2024 - ieeexplore.ieee.org
The Single Instruction Multiple Data (SIMD) architecture, supported by various high-
performance computing platforms, efficiently utilizes data-level parallelism. The SIMD model …

GKLEE: concolic verification and test generation for GPUs

G Li, P Li, G Sawaya, G Gopalakrishnan… - Proceedings of the 17th …, 2012 - dl.acm.org
Programs written for GPUs often contain correctness errors such as races, deadlocks, or
may compute the wrong result. Existing debugging tools often miss these errors because of …

Enabling and exploiting flexible task assignment on GPU through SM-centric program transformations

B Wu, G Chen, D Li, X Shen, J Vetter - Proceedings of the 29th ACM on …, 2015 - dl.acm.org
A GPU's computing power lies in its abundant memory bandwidth and massive parallelism.
However, its hardware thread schedulers, despite being able to quickly distribute …

GPU-based NFA implementation for memory efficient high speed regular expression matching

Y Zu, M Yang, Z Xu, L Wang, X Tian, K Peng… - Proceedings of the 17th …, 2012 - dl.acm.org
Regular expression pattern matching is the foundation and core engine of many network
functions, such as network intrusion detection, worm detection, traffic analysis, web …

Sound and partially-complete static analysis of data-races in GPU programs

D Liew, T Cogumbreiro, J Lange - Proceedings of the ACM on …, 2024 - dl.acm.org
GPUs are progressively being integrated into modern society, playing a pivotal role in
Artificial Intelligence and High-Performance Computing. Programmers need a deep …

Characterizing and enhancing global memory data coalescing on GPUs

N Fauzia, LN Pouchet… - 2015 IEEE/ACM …, 2015 - ieeexplore.ieee.org
Effective parallel programming for GPUs requires careful attention to several factors,
including ensuring coalesced access of data from global memory. There is a need for tools …

LLOV: A fast static data-race checker for OpenMP programs

U Bora, S Das, P Kukreja, S Joshi… - ACM Transactions on …, 2020 - dl.acm.org
In the era of Exascale computing, writing efficient parallel programs is indispensable, and, at
the same time, writing sound parallel programs is very difficult. Specifying parallelism with …

PROV-IO: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems

R Han, M Zheng, S Byna, H Tang… - … on Parallel and …, 2024 - ieeexplore.ieee.org
Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on
HPC systems, scientists often seek diverse provenance (e.g., origins of data products, usage …

Symbolic testing of OpenCL code

P Collingbourne, C Cadar, PHJ Kelly - … 2011, Haifa, Israel, December 6-8 …, 2012 - Springer
We present an effective technique for crosschecking a C or C++ program against an
accelerated OpenCL version, as well as a technique for detecting data races in OpenCL …

CURD: a dynamic CUDA race detector

Y Peng, V Grover, J Devietti - ACM SIGPLAN Notices, 2018 - dl.acm.org
As GPUs have become an integral part of nearly every processor, GPU programming has
become increasingly popular. GPU programming requires a combination of extreme levels …