Big data systems: A software engineering perspective

A Davoudian, M Liu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Big Data Systems (BDSs) are an emerging class of scalable software technologies whereby
massive amounts of heterogeneous data are gathered from multiple sources, managed …

A comprehensive survey on parallelization and elasticity in stream processing

H Röger, R Mayer - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Stream Processing (SP) has evolved as the leading paradigm to process and gain value
from the high volume of streaming data produced, e.g., in the domain of the Internet of Things …

Faasm: Lightweight isolation for efficient stateful serverless computing

S Shillaker, P Pietzuch - … Annual Technical Conference (USENIX ATC 20 …, 2020 - usenix.org
Serverless computing is an excellent fit for big data processing because it can scale quickly
and cheaply to thousands of parallel functions. Existing serverless platforms isolate …

Samza: stateful scalable stream processing at LinkedIn

SA Noghabi, K Paramasivam, Y Pan… - Proceedings of the …, 2017 - dl.acm.org
Distributed stream processing systems need to support stateful processing, recover quickly
from failures to resume such processing, and reprocess an entire data stream quickly. We …

Analyzing efficient stream processing on modern hardware

S Zeuch, BD Monte, J Karimov, C Lutz, M Renz… - Proceedings of the …, 2019 - dl.acm.org
Modern Stream Processing Engines (SPEs) process large data volumes under tight latency
constraints. Many SPEs execute processing pipelines using message passing on shared …

Medea: scheduling of long running applications in shared production clusters

P Garefalakis, K Karanasos, P Pietzuch… - Proceedings of the …, 2018 - dl.acm.org
The rise in popularity of machine learning, streaming, and latency-sensitive online
applications in shared production clusters has raised new challenges for cluster schedulers …

A survey on the evolution of stream processing systems

M Fragkoulis, P Carbone, V Kalavri, A Katsifodimos - The VLDB Journal, 2024 - Springer
Stream processing has been an active research field for more than 20 years, but it is now
witnessing its prime time due to recent successful efforts by the research community and …

Lightweight asynchronous snapshots for distributed dataflows

P Carbone, G Fóra, S Ewen, S Haridi… - arXiv preprint arXiv …, 2015 - arxiv.org
Distributed stateful stream processing enables the deployment and execution of large scale
continuous computations in the cloud, targeting both low latency and high throughput. One …

Nomad: Mitigating arbitrary cloud side channels via provider-assisted migration

SJ Moon, V Sekar, MK Reiter - Proceedings of the 22nd ACM SIGSAC …, 2015 - dl.acm.org
Recent studies have shown a range of co-residency side channels that can be used to
extract private information from cloud clients. Unfortunately, addressing these side channels …

Rhino: Efficient management of very large distributed state for stream processing engines

B Del Monte, S Zeuch, T Rabl, V Markl - Proceedings of the 2020 ACM …, 2020 - dl.acm.org
Scale-out stream processing engines (SPEs) are powering large big data applications on
high velocity data streams. Industrial setups require SPEs to sustain outages, varying data …