Morpheus: Towards automated SLOs for enterprise clusters

SA Jyothi, C Curino, I Menache… - … USENIX symposium on …, 2016 - usenix.org
Modern resource management frameworks for large-scale analytics leave unresolved the
problematic tension between high cluster utilization and job's performance predictability …

Selecta: Heterogeneous cloud storage configuration for data analytics

A Klimovic, H Litz, C Kozyrakis - 2018 USENIX Annual Technical …, 2018 - usenix.org
Data analytics are an important class of data-intensive workloads on public cloud services.
However, selecting the right compute and storage configuration for these applications is …

Using machine learning to optimize parallelism in big data applications

ÁB Hernández, MS Perez, S Gupta… - Future Generation …, 2018 - Elsevier
In-memory cluster computing platforms have gained momentum in the last years, due to their
ability to analyse big amounts of data in parallel. These platforms are complex and difficult-to …

Cost-effective resource provisioning for MapReduce in a cloud

B Palanisamy, A Singh, L Liu - IEEE Transactions on Parallel …, 2014 - ieeexplore.ieee.org
This paper presents a new MapReduce cloud service model, Cura, for provisioning cost-
effective MapReduce services in a cloud. In contrast to existing MapReduce cloud services …

Efficient deep learning pipelines for accurate cost estimations over large scale query workload

JK Zhi Kang, Gaurav, SY Tan, F Cheng… - Proceedings of the 2021 …, 2021 - dl.acm.org
The use of deep learning models for forecasting the resource consumption patterns of SQL
queries has recently been a popular area of study. While these models have demonstrated …

An online algorithm for scheduling big data analysis jobs in cloud environments

Y Kang, L Pan, S Liu - Knowledge-Based Systems, 2022 - Elsevier
Cloud computing has become a popular platform for processing big data analysis jobs with
its advantages of high-availability, elasticity and cost-efficiency. Many big data analysis …

Reinforcement learning based scheduling in a workflow management system

AM Kintsakis, FE Psomopoulos, PA Mitkas - Engineering Applications of …, 2019 - Elsevier
Any computational process from simple data analytics tasks to training a machine learning
model can be described by a workflow. Many workflow management systems (WMS) exist …

Predict: towards predicting the runtime of large scale iterative analytics

AD Popescu, A Balmin, V Ercegovac… - Proceedings of the …, 2013 - infoscience.epfl.ch
Machine learning algorithms are widely used today for analytical tasks such as data
cleaning, data categorization, or data filtering. At the same time, the rise of social media …

HFSP: bringing size-based scheduling to Hadoop

M Pastorelli, D Carra, M Dell'Amico… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
Size-based scheduling with aging has been recognized as an effective approach to
guarantee fairness and near-optimal system response times. We present HFSP, a scheduler …

Database workload characterization with query plan encoders

D Paul, J Cao, F Li, V Srikumar - arXiv preprint arXiv:2105.12287, 2021 - arxiv.org
Smart databases are adopting artificial intelligence (AI) technologies to achieve
instance optimality, and in the future, databases will come with prepackaged AI models …