The data locality of work stealing

UA Acar, GE Blelloch, RD Blumofe - Proceedings of the twelfth annual …, 2000 - dl.acm.org
This paper studies the data locality of the work-stealing scheduling algorithm on hardware-
controlled shared-memory machines. We present lower and upper bounds on the number of …

Hierarchical work-stealing

JN Quintin, F Wagner - Euro-Par 2010-Parallel Processing: 16th …, 2010 - Springer
Abstract dynamic load-balancing on hierarchical platforms. In particular, we consider
applications involving heavy communications on a distributed platform. The work-stealing …

Effectively sharing a cache among threads

GE Blelloch, PB Gibbons - Proceedings of the sixteenth annual ACM …, 2004 - dl.acm.org
We compare the number of cache misses M_1 for running a computation on a single
processor with cache size C_1 to the total number of misses M_p for the same computation …

Thread scheduling for multiprogrammed multiprocessors

NS Arora, RD Blumofe, CG Plaxton - Proceedings of the tenth annual …, 1998 - dl.acm.org
We present a user-level thread scheduler for shared-memory multiprocessors, and we
analyze its performance under multiprogramming. We model multiprogramming with two …

Scheduling parallel programs by work stealing with private deques

UA Acar, A Charguéraud, M Rainey - Proceedings of the 18th ACM …, 2013 - dl.acm.org
Work stealing has proven to be an effective method for scheduling parallel programs on
multicore computers. To achieve high performance, work stealing distributes tasks between …

Idempotent work stealing

MM Michael, MT Vechev, VA Saraswat - Proceedings of the 14th ACM …, 2009 - dl.acm.org
Load balancing is a technique which allows efficient parallelization of irregular workloads,
and a key component of many applications and parallelizing runtimes. Work-stealing is a …

Scheduling irregular parallel computations on hierarchical caches

GE Blelloch, JT Fineman, PB Gibbons… - Proceedings of the …, 2011 - dl.acm.org
For nested-parallel computations with low depth (span, critical path length) analyzing the
work, depth, and sequential cache complexity suffices to attain reasonably strong bounds on …

Scheduling multithreaded computations by work stealing

RD Blumofe, CE Leiserson - Journal of the ACM (JACM), 1999 - dl.acm.org
This paper studies the problem of efficiently scheduling fully strict (i.e., well-structured)
multithreaded computations on parallel computers. A popular and practical method of …

Scalable work stealing

J Dinan, DB Larkins, P Sadayappan… - Proceedings of the …, 2009 - dl.acm.org
Irregular and dynamic parallel applications pose significant challenges to achieving
scalable performance on large-scale multicore clusters. These applications often require …

Lifeline-based global load balancing

VA Saraswat, P Kambadur, S Kodali, D Grove… - ACM SIGPLAN …, 2011 - dl.acm.org
On shared-memory systems, Cilk-style work-stealing has been used to effectively parallelize
irregular task-graph based applications such as Unbalanced Tree Search (UTS). There are …