Jockey: guaranteed job latency in data parallel clusters

AD Ferguson, P Bodik, S Kandula, E Boutin… - Proceedings of the 7th …, 2012 - dl.acm.org
Data processing frameworks such as MapReduce [8] and Dryad [11] are used today in
business environments where customers expect guaranteed performance. To date …

MapReduce optimization using regulated dynamic prioritization

T Sandholm, K Lai - Proceedings of the eleventh international joint …, 2009 - dl.acm.org
We present a system for allocating resources in shared data and compute clusters that
improves MapReduce job scheduling in three ways. First, the system uses regulated and …

DynamicMR: A dynamic slot allocation optimization framework for MapReduce clusters

S Tang, BS Lee, B He - IEEE Transactions on Cloud …, 2014 - ieeexplore.ieee.org
MapReduce is a popular computing paradigm for large-scale data processing in cloud
computing. However, the slot-based MapReduce system (e.g., Hadoop MRv1) can suffer from …

Job scheduling for multi-user MapReduce clusters

M Zaharia, D Borthakur, JS Sarma… - … , Tech. Rep. UCB …, 2009 - digitalassets.lib.berkeley.edu
Sharing a MapReduce cluster between users is attractive because it enables statistical
multiplexing (lowering costs) and allows users to share a common large data set. However …

ARIA: automatic resource inference and allocation for MapReduce environments

A Verma, L Cherkasova, RH Campbell - Proceedings of the 8th ACM …, 2011 - dl.acm.org
MapReduce and Hadoop represent an economically compelling alternative for efficient
large scale data processing and advanced analytics in the enterprise. A key challenge in …

Delay tails in MapReduce scheduling

J Tan, X Meng, L Zhang - Proceedings of the 12th ACM SIGMETRICS …, 2012 - dl.acm.org
MapReduce/Hadoop production clusters exhibit heavy-tailed characteristics for job
processing times. These phenomena are resultant of the workload features and the adopted …

Resource-aware adaptive scheduling for MapReduce clusters

J Polo, C Castillo, D Carrera, Y Becerra… - Middleware 2011: ACM …, 2011 - Springer
We present a resource-aware scheduling technique for MapReduce multi-job workloads that
aims at improving resource utilization across machines while observing completion time …

Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling

M Zaharia, D Borthakur, J Sen Sarma… - Proceedings of the 5th …, 2010 - dl.acm.org
As organizations start to use data-intensive cluster computing systems like Hadoop and
Dryad for more applications, there is a growing need to share clusters between users …

Balancing reducer skew in MapReduce workloads using progressive sampling

SR Ramakrishnan, G Swart, A Urmanov - Proceedings of the Third ACM …, 2012 - dl.acm.org
The elapsed time of a parallel job depends on the completion time of its longest running
constituent. We present a static load balancing algorithm that distributes work evenly across …

Improving MapReduce performance in heterogeneous environments

M Zaharia, A Konwinski, AD Joseph, RH Katz, I Stoica - OSDI, 2008 - usenix.org
MapReduce is emerging as an important programming model for large-scale data-parallel
applications such as web indexing, data mining, and scientific simulation. Hadoop is an …