Realistic and scalable benchmarking cloud file systems: Practices and lessons from AliCloud

Z Ren, W Shi, J Wan, F Cao, J Lin - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
The past decade has witnessed the rapid boom of cloud computing. Many public cloud
infrastructures have been implemented and serve millions of tenants. Cloud file systems …

BigDataBench: a dwarf-based big data and AI benchmark suite

W Gao, J Zhan, L Wang, C Luo, D Zheng… - arXiv preprint arXiv …, 2018 - benchcouncil.org
BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite. Institute of Computing
Technology. BigDataBench: A Dwarf-based Big Data and AI …

BigOP: Generating comprehensive big data workloads as a benchmarking framework

Y Zhu, J Zhan, C Weng, R Nambiar, J Zhang… - Database Systems for …, 2014 - Springer
Big Data is considered proprietary asset of companies, organizations, and even nations.
Turning big data into real treasure requires the support of big data systems. A variety of …

Architectural impact on performance of in-memory data analytics: Apache Spark case study

AJ Awan, M Brorsson, V Vlassov, E Ayguade - arXiv preprint arXiv …, 2016 - arxiv.org
While cluster computing frameworks are continuously evolving to provide real-time data
analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics …

An “on the fly” framework for efficiently generating synthetic big data sets

K Mason, S Vejdan, S Grijalva - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Collecting, analyzing and gaining insight from large volumes of data is now the norm in an
ever increasing number of industries. Data analytics techniques, such as machine learning …

Plug and play bench: Simplifying big data benchmarking using containers

S Ceesay, A Barker, B Varghese - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
The recent boom of big data, coupled with the challenges of its processing and storage gave
rise to the development of distributed data processing and storage paradigms like …

Identifying the potential of near data processing for Apache Spark

AJ Awan, M Ohara, E Ayguadé, K Ishizaki… - Proceedings of the …, 2017 - dl.acm.org
While cluster computing frameworks are continuously evolving to provide real-time data
analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics …

Node architecture implications for in-memory data analytics on scale-in clusters

AJ Awan, V Vlassov, M Brorsson… - Proceedings of the 3rd …, 2016 - dl.acm.org
While cluster computing frameworks are continuously evolving to provide real-time data
analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics …

Quantifying the performance impact of large pages on in-memory big-data workloads

J Park, M Han, W Baek - 2016 IEEE International Symposium …, 2016 - ieeexplore.ieee.org
In-memory big-data processing is rapidly emerging as a promising solution for large-scale
data analytics with high-performance and/or real-time requirements. In-memory big-data …

BigDataBench-MT: A benchmark tool for generating realistic mixed data center workloads

R Han, S Zhan, C Shao, J Wang, LK John, J Xu… - Big Data Benchmarks …, 2016 - Springer
Long-running service workloads (e.g., web search engine) and short-term data analysis
workloads (e.g., Hadoop MapReduce jobs) co-locate in today's data centers. Developing …