A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench

N Ahmed, ALC Barczak, T Susnjak, MA Rashid - Journal of Big Data, 2020 - Springer
Big Data analytics for storing, processing, and analyzing large-scale datasets has become
an essential tool for the industry. The advent of distributed computing frameworks such as …

A general perspective of Big Data: applications, tools, challenges and trends

L Rodríguez-Mazahua, CA Rodríguez-Enríquez… - The Journal of …, 2016 - Springer
Big Data has become a very popular term. It refers to the enormous amount of structured,
semi-structured and unstructured data that are exponentially generated by high …

SparkBench: a comprehensive benchmarking suite for in memory data analytic platform Spark

M Li, J Tan, Y Wang, L Zhang, V Salapura - Proceedings of the 12th ACM …, 2015 - dl.acm.org
Spark has been increasingly adopted by industries in recent years for big data analysis by
providing a fault tolerant, scalable and easy-to-use in memory abstraction. Moreover, the …

μSuite: a benchmark suite for microservices

A Sriraman, TF Wenisch - 2018 IEEE International Symposium …, 2018 - ieeexplore.ieee.org
Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems to
instead comprise numerous, distributed microservices interacting via Remote Procedure …

The architectural implications of cloud microservices

Y Gan, C Delimitrou - IEEE Computer Architecture Letters, 2018 - ieeexplore.ieee.org
Cloud services have recently undergone a shift from monolithic applications to
microservices, with hundreds or thousands of loosely-coupled microservices comprising the …

GraphBIG: understanding graph computing in the context of industrial solutions

L Nai, Y Xia, IG Tanase, H Kim, CY Lin - Proceedings of the International …, 2015 - dl.acm.org
With the emergence of data science, graph computing is becoming a crucial tool for
processing big connected data. Although efficient implementations of specific graph …

Taming performance variability

A Maricq, D Duplyakin, I Jimenez, C Maltzahn… - … USENIX Symposium on …, 2018 - usenix.org
The performance of compute hardware varies: software run repeatedly on the same server
(or a different server with supposedly identical parts) can produce performance results that …

What your DRAM power models are not telling you: Lessons from a detailed experimental study

S Ghose, AG Yaglikçi, R Gupta, D Lee… - Proceedings of the …, 2018 - dl.acm.org
Main memory (DRAM) consumes as much as half of the total system power in a computer
today, due to the increasing demand for memory capacity and bandwidth. There is a …

FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs

J Zhang, M Kwon, D Gouk, S Koh, C Lee… - … USENIX Symposium on …, 2018 - usenix.org
A modern datacenter server aims to achieve high energy efficiency by co-running multiple
applications. Some of such applications (e.g., web search) are latency sensitive. Therefore …

Understanding, predicting and scheduling serverless workloads under partial interference

L Zhao, Y Yang, Y Li, X Zhou, K Li - Proceedings of the International …, 2021 - dl.acm.org
Interference among distributed cloud applications can be classified into three types: full,
partial and zero. While prior research merely focused on full interference, the partial …