The power of choice in Data-Aware cluster scheduling

S Venkataraman, A Panda… - … USENIX Symposium on …, 2014 - usenix.org
Providing timely results in the face of rapid growth in data volumes has become important for
analytical frameworks. For this reason, frameworks increasingly operate on only a subset of …

Improving resource utilization by timely fine-grained scheduling

T Jin, Z Cai, B Li, C Zheng, G Jiang… - Proceedings of the …, 2020 - dl.acm.org
Monotask is a unit of work that uses only a single type of resource (e.g., CPU, network, disk
I/O). While monotask was primarily introduced as a means to reason about job performance …

Efficient queue management for cluster scheduling

J Rasley, K Karanasos, S Kandula, R Fonseca… - Proceedings of the …, 2016 - dl.acm.org
Job scheduling in Big Data clusters is crucial both for cluster operators' return on investment
and for overall user experience. In this context, we observe several anomalies in how …

GRAPHENE: Packing and Dependency-Aware scheduling for Data-Parallel clusters

R Grandl, S Kandula, S Rao, A Akella… - 12th USENIX Symposium …, 2016 - usenix.org
We present a new cluster scheduler, GRAPHENE, aimed at jobs that have a complex
dependency structure and heterogeneous resource demands. Relaxing either of these …

The case for tiny tasks in compute clusters

K Ousterhout, A Panda, J Rosen… - 14th Workshop on Hot …, 2013 - usenix.org
We argue for breaking data-parallel jobs in compute clusters into tiny tasks that each
complete in hundreds of milliseconds. Tiny tasks avoid the need for complex skew mitigation …

Morpheus: Towards automated SLOs for enterprise clusters

SA Jyothi, C Curino, I Menache… - … USENIX Symposium on …, 2016 - usenix.org
Modern resource management frameworks for large-scale analytics leave unresolved the
problematic tension between high cluster utilization and job's performance predictability …

Network-aware scheduling for data-parallel jobs: Plan when you can

V Jalaparti, P Bodik, I Menache, S Rao… - ACM SIGCOMM …, 2015 - dl.acm.org
To reduce the impact of network congestion on big data jobs, cluster management
frameworks use various heuristics to schedule compute tasks and/or network flows. Most of …

Tarcil: Reconciling scheduling speed and quality in large shared clusters

C Delimitrou, D Sanchez, C Kozyrakis - … of the Sixth ACM Symposium on …, 2015 - dl.acm.org
Scheduling diverse applications in large, shared clusters is particularly challenging. Recent
research on cluster scheduling focuses either on scheduling speed, using sampling to …

Medea: scheduling of long running applications in shared production clusters

P Garefalakis, K Karanasos, P Pietzuch… - Proceedings of the …, 2018 - dl.acm.org
The rise in popularity of machine learning, streaming, and latency-sensitive online
applications in shared production clusters has raised new challenges for cluster schedulers …

Sparrow: distributed, low latency scheduling

K Ousterhout, P Wendell, M Zaharia… - Proceedings of the twenty …, 2013 - dl.acm.org
Large-scale data analytics frameworks are shifting towards shorter task durations and larger
degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete …