The tao of parallelism in algorithms

K Pingali, D Nguyen, M Kulkarni, M Burtscher… - Proceedings of the …, 2011 - dl.acm.org
For more than thirty years, the parallel programming community has used the dependence
graph as the main abstraction for reasoning about and exploiting parallelism in "regular" …

A survey on thread-level speculation techniques

A Estebanez, DR Llanos… - ACM Computing Surveys …, 2016 - dl.acm.org
Thread-Level Speculation (TLS) is a promising technique that allows the parallel execution
of sequential code without relying on a prior, compile-time-dependence analysis. In this …

Morph algorithms on GPUs

R Nasre, M Burtscher, K Pingali - Proceedings of the 18th ACM SIGPLAN …, 2013 - dl.acm.org
There is growing interest in using GPUs to accelerate graph algorithms such as breadth-first
search, computing page-ranks, and finding shortest paths. However, these algorithms do not …

Ordered vs. unordered: a comparison of parallelism and work-efficiency in irregular algorithms

MA Hassaan, M Burtscher, K Pingali - ACM SIGPLAN Notices, 2011 - dl.acm.org
Outside of computational science, most problems are formulated in terms of irregular data
structures such as graphs, trees and sets. Unfortunately, we understand relatively little about …

A fast GPU algorithm for graph connectivity

J Soman, K Kishore… - 2010 IEEE International …, 2010 - ieeexplore.ieee.org
Graphics processing units provide a large computational power at a very low price which
position them as a ubiquitous accelerator. General purpose programming on the graphics …

Parallel inclusion-based points-to analysis

M Méndez-Lojo, A Mathew, K Pingali - Proceedings of the ACM …, 2010 - dl.acm.org
Inclusion-based points-to analysis provides a good trade-off between precision of results
and speed of analysis, and it has been incorporated into several production compilers …

SIMD parallelization of applications that traverse irregular data structures

B Ren, G Agrawal, JR Larus… - Proceedings of the …, 2013 - ieeexplore.ieee.org
Fine-grained data parallelism is increasingly common in mainstream processors in the form
of longer vectors and on-chip GPUs. This paper develops support for exploiting such data …

Accelerating irregular algorithms on GPGPUs using fine-grain hardware worklists

JY Kim, C Batten - 2014 47th Annual IEEE/ACM International …, 2014 - ieeexplore.ieee.org
Although GPGPUs are traditionally used to accelerate workloads with regular control and
memory-access structure, recent work has shown that GPGPUs can also achieve significant …

Parallel FPGA routing: Survey and challenges

M Stojilović - 2017 27th International Conference on Field …, 2017 - ieeexplore.ieee.org
As transistor scaling is slowing down [1], other opportunities for ensuring continuous
performance increase have to be explored. Field programmable gate arrays (FPGAs) are in …

Parallel FPGA routing based on the operator formulation

YOM Moctar, P Brisk - Proceedings of the 51st Annual Design …, 2014 - dl.acm.org
We have implemented an FPGA routing algorithm on a shared memory multi-processor
using the Galois API, which offers speculative parallelism in software. The router is a parallel …