A survey of big data, high performance computing, and machine learning benchmarks

N Ihde, P Marten, A Eleliemy, G Poerwawinata… - … and Benchmarking …, 2022 - Springer
… BigDataBench contains a BD Generator Suite (BDGS) [51] that can generate synthetic … The
scalable science benchmarks are expected to run at full scale of the CORAL systems [2], and …

Tutorial on Benchmarking Big Data Analytics Systems

T Ivanov, R Singhal - Companion of the ACM/SPEC International …, 2020 - dl.acm.org
… questions on relevant Big Data and Analytics benchmarks. … We don’t know enough to
make a big data benchmark suite-an … BDGS: A Scalable Big Data Generator Suite in Big Data

Scalability and performance analysis of BDPS in clouds

Y Li, D Ou, X Zhou, C Jiang, C Cérin - Computing, 2022 - Springer
… will generate a huge amount of data [49]. In order to exploit this big data, a variety of big data
… Moreover, relatively few benchmarks were proposed to evaluate big data processing systems…

An “on the fly” framework for efficiently generating synthetic big data sets

K Mason, S Vejdan, S Grijalva - … on Big Data (Big Data), 2019 - ieeexplore.ieee.org
… and storing data is a key issue faced when handling large data sets. This … , “Bdgs: A scalable
big data generator suite in big data benchmarking,” in Workshop on Big Data Benchmarks. …

Midbench: Multimodel industrial big data benchmark

Y Cheng, M Cheng, H Ge, Y Guo, Y Hao, X Sun… - Benchmarking …, 2019 - Springer
Benchmark Suite (TSBS) [8] was designed for benchmarking … ] is a comprehensive benchmark
which consists of BDGS [24] … generate three types of scalable datasets such as BoM data, …

TextBenDS: a generic textual data benchmark for distributed systems

CO Truică, ES Apostol, J Darmont, I Assent - Information Systems …, 2021 - Springer
… Besides TextBenDS’ features, i.e., scalability and genericity, we … In Section 2, we present a
survey on existing big data and, … BDGS covers three representative data types (structured, semi…

Benchmarking graph data management and processing systems: A survey

M Dayarathna, T Suzumura - arXiv preprint arXiv:2005.12873, 2020 - arxiv.org
… Furthermore, the data generator should be scalable in such a way that … Big Data Generator
Suite (BDGS) to generate synthetic data while protecting their inherent characteristics. BDGS

Big Data Benchmarking and Performance Optimization on Cloud Infrastructure

A Ricci, V Cortiana - International Journal of Social Analytics, 2022 - norislab.com
… Explicit considerations like scalability, elasticity, and fault-tolerance are vital for big data …
The benchmark's ability to generate synthetic datasets with configurable parameters, such …

Scalable unified data analytics

A Watson - 2019 - unbscholar.lib.unb.ca
… performance as other big data systems with similar capabilities. … , we introduce a data science
benchmark called Sanzu. Our … bags (to support unstructured data). Dask extends common …

Generalizing streaming pipeline design for big data

K Rengarajan, VK Menon - Machine Intelligence and Signal Processing …, 2020 - Springer
Big data analytics has become an inevitable infrastructure tool for every one generating or
amassing large volumes of data at … benchmarked Apache-hosted software platforms for data