A survey of algorithmic skeleton frameworks: high‐level structured parallel programming enablers

H González‐Vélez, M Leyton - Software: Practice and …, 2010 - Wiley Online Library
Structured parallel programs ought to be conceived as two separate and complementary
entities: computation, which expresses the calculations in a procedural manner, and …

Fast conical hull algorithms for near-separable non-negative matrix factorization

A Kumar, V Sindhwani… - … Conference on Machine …, 2013 - proceedings.mlr.press
The separability assumption (Arora et al., 2012; Donoho & Stodden, 2003) turns non-
negative matrix factorization (NMF) into a tractable problem. Recently, a new class of …

Lifeline-based global load balancing

VA Saraswat, P Kambadur, S Kodali, D Grove… - ACM SIGPLAN …, 2011 - dl.acm.org
On shared-memory systems, Cilk-style work-stealing has been used to effectively parallelize
irregular task-graph based applications such as Unbalanced Tree Search (UTS). There are …

NIMBLE: a toolkit for the implementation of parallel data mining and machine learning algorithms on MapReduce

A Ghoting, P Kambadur, E Pednault… - Proceedings of the 17th …, 2011 - dl.acm.org
In the last decade, advances in data collection and storage technologies have led to an
increased interest in designing and implementing large-scale parallel algorithms for …

A work-stealing scheduler for X10's task parallelism with suspension

O Tardieu, H Wang, H Lin - ACM SIGPLAN Notices, 2012 - dl.acm.org
The X10 programming language is intended to ease the programming of scalable
concurrent and distributed applications. X10 augments a familiar imperative object-oriented …

Work-stealing without the baggage

V Kumar, D Frampton, SM Blackburn, D Grove… - ACM SIGPLAN …, 2012 - dl.acm.org
Work-stealing is a promising approach for effectively exploiting software parallelism on
parallel hardware. A programmer who uses work-stealing explicitly identifies potential …

On the merits of distributed work-stealing on selective locality-aware tasks

J Paudel, O Tardieu, JN Amaral - 2013 42nd International …, 2013 - ieeexplore.ieee.org
Improving the performance of work-stealing load-balancing algorithms in distributed shared-
memory systems is challenging. These algorithms need to overcome high costs of …

A study of shared-memory parallelism in a multifrontal solver

JY L'Excellent, WM Sid-Lakhdar - Parallel Computing, 2014 - Elsevier
We introduce shared-memory parallelism in a parallel distributed-memory solver, targeting
multi-core architectures. Our concern in this paper is pure shared-memory parallelism …

A runtime implementation of OpenMP tasks

J LaGrone, A Aribuki, C Addison… - International Workshop on …, 2011 - Springer
Many task-based programming models have been developed and refined in recent years to
support application development for shared memory platforms. Asynchronous tasks are a …

Scaling the solution of large sparse linear systems using multifrontal methods on hybrid shared-distributed memory architectures

MWS Lakhdar - 2014 - inria.hal.science
The solution of sparse systems of linear equations is at the heart of numerous
application fields. While the amount of computational resources in modern architectures …