A survey on automatic parameter tuning for big data processing systems

H Herodotou, Y Chen, J Lu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of
configuration parameters controlling parallelism, I/O behavior, memory settings, and …

Automatic performance tuning for distributed data stream processing systems

H Herodotou, L Odysseos, Y Chen… - 2022 IEEE 38th …, 2022 - ieeexplore.ieee.org
Distributed data stream processing systems (DSPSs) such as Storm, Flink, and Spark
Streaming are now routinely used to process continuous data streams in (near) real-time …

Real-time resource scaling platform for big data workloads on serverless environments

J Enes, RR Expósito, J Touriño - Future Generation Computer Systems, 2020 - Elsevier
The serverless execution paradigm is becoming an increasingly popular option when
workloads are to be deployed in an abstracted way, more specifically, without specifying any …

On combining system and machine learning performance tuning for distributed data stream applications

L Odysseos, H Herodotou - Distributed and Parallel Databases, 2023 - Springer
The growing need to identify patterns in data and automate decisions based on them in
near-real time has stimulated the development of new machine learning (ML) applications …

Srautinio apdorojimo sistemų balansavimas taikant skatinamąjį mokymąsi [Balancing stream processing systems using reinforcement learning]

V Žilinas - 2021 - epublications.vu.lt
This work consists of a literature analysis and research. The literature part
examines the workings of stream processing systems, ways to measure their speed and the …

Automatic rescaling and tuning of big data applications on container-based virtual environments

J Enes - 2020 - ruc.udc.es
Current Big Data applications have significantly evolved from their origins, moving from mostly
batch workloads to more complex ones that may involve many processing stages using …

Satisfying service level objectives in stream processing systems

F Kalim - 2020 - ideals.illinois.edu
An increasing number of real-world applications today consume massive amounts of data in
real-time to produce up-to-date results. These applications include social media sites that …