A survey of big data, high performance computing, and machine learning benchmarks

N Ihde, P Marten, A Eleliemy, G Poerwawinata… - … and Benchmarking …, 2022 - Springer
… BigDataBench contains a BD Generator Suite (BDGS) [51] that can generate synthetic … The
scalable science benchmarks are expected to run at full scale of the CORAL systems [2], and …

Tutorial on Benchmarking Big Data Analytics Systems

T Ivanov, R Singhal - Companion of the ACM/SPEC International …, 2020 - dl.acm.org
… questions on relevant Big Data and Analytics benchmarks. … We don’t know enough to
make a big data benchmark suite-an … BDGS: A Scalable Big Data Generator Suite in Big Data

Scalability and performance analysis of BDPS in clouds

Y Li, D Ou, X Zhou, C Jiang, C Cérin - Computing, 2022 - Springer
… will generate a huge amount of data [49]. In order to exploit this big data, a variety of big data
… Moreover, relatively few benchmarks were proposed to evaluate big data processing systems…

TextBenDS: a generic textual data benchmark for distributed systems

CO Truică, ES Apostol, J Darmont, I Assent - Information Systems …, 2021 - Springer
… Besides TextBenDS’ features, i.e., scalability and genericity, we … In Section 2, we present a
survey on existing big data and, … BDGS covers three representative data types (structured, semi…

Benchmarking graph data management and processing systems: A survey

M Dayarathna, T Suzumura - arXiv preprint arXiv:2005.12873, 2020 - arxiv.org
… Furthermore, the data generator should be scalable in such a way that … Big Data Generator
Suite (BDGS) to generate synthetic data while protecting their inherent characteristics. BDGS

Big Data Benchmarking and Performance Optimization on Cloud Infrastructure

A Ricci, V Cortiana - International Journal of Social Analytics, 2022 - norislab.com
… Explicit considerations like scalability, elasticity, and fault-tolerance are vital for big data …
The benchmark's ability to generate synthetic datasets with configurable parameters, such …

Generalizing streaming pipeline design for big data

K Rengarajan, VK Menon - Machine Intelligence and Signal Processing …, 2020 - Springer
Big data analytics has become an inevitable infrastructure tool for everyone generating or
amassing large volumes of data at … benchmarked Apache-hosted software platforms for data

Automated translation of functional big data queries to SQL

G Zhang, B Mariano, X Shen, I Dillig - Proceedings of the ACM on …, 2023 - dl.acm.org
… , RDD2SQL, on a benchmark of real-world Spark RDD … Our approach alleviates the
scalability bottleneck of CEGIS … manipulates unordered sets or bags, our relational language …

A Comparative Analysis of Garbage Collectors and Their Suitability for Big Data Workloads

A Nair, A Sriram, A Simon, S Kalambur… - Advances in Computing …, 2021 - Springer
data generation tools like BDGS, which generates synthetic data by scaling the real seed
data, … In our results, we first see how the Dacapo benchmarks and big data benchmarks vary in …

The LDBC Graphalytics Benchmark

A Iosup, A Musaafir, A Uta, AP Pérez… - arXiv preprint arXiv …, 2020 - arxiv.org
… quantify multiple kinds of systems scalability, weak and strong… Compared to traditional
benchmarking, benchmarking graph … for generating diverse yet controlled datasets at large scale, …