Thinking like a vertex: A survey of vertex-centric frameworks for large-scale distributed graph processing

RR McCune, T Weninger, G Madey - ACM Computing Surveys (CSUR), 2015 - dl.acm.org
The vertex-centric programming model is an established computational paradigm recently
incorporated into distributed processing frameworks to address challenges in large-scale …

Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks

A Fernández, S del Río, V López… - … : Data Mining and …, 2014 - Wiley Online Library
The term 'Big Data' has spread rapidly in the framework of Data Mining and Business
Intelligence. This new scenario can be defined by means of those problems that cannot be …

[BOOK][B] Principles of distributed database systems

MT Özsu, P Valduriez - 1999 - Springer
The first edition of this book appeared in 1991 when the technology was new and there were
not too many products. In the Preface to the first edition, we had quoted Michael Stonebraker …

A scalable two-phase top-down specialization approach for data anonymization using mapreduce on cloud

X Zhang, LT Yang, C Liu, J Chen - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
A large number of cloud services require users to share private data like electronic health
records for data analysis or mining, bringing privacy concerns. Anonymizing data sets via …

A survey of large-scale analytical query processing in MapReduce

C Doulkeridis, K Nørvåg - The VLDB journal, 2014 - Springer
Enterprises today acquire vast volumes of data from different sources and leverage this
information by means of data analysis to support effective decision-making and provide new …

Big data analytics with datalog queries on spark

A Shkapsky, M Yang, M Interlandi, H Chiu… - Proceedings of the …, 2016 - dl.acm.org
There is great interest in exploiting the opportunity provided by cloud computing platforms
for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for …

An experimental survey on big data frameworks

W Inoubli, S Aridhi, H Mezni, M Maddouri… - Future Generation …, 2018 - Elsevier
Recently, increasingly large amounts of data are generated from a variety of sources.
Existing data processing technologies are not suitable to cope with the huge amounts of …

A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

ShuffleWatcher: Shuffle-aware scheduling in multi-tenant MapReduce clusters

F Ahmad, ST Chakradhar, A Raghunathan… - 2014 USENIX Annual …, 2014 - usenix.org
MapReduce clusters are usually multi-tenant (i.e., shared among multiple users and jobs) for
improving cost and utilization. The performance of jobs in a multi-tenant MapReduce cluster …

Survey of distributed computing frameworks for supporting big data analysis

X Sun, Y He, D Wu, JZ Huang - Big Data Mining and Analytics, 2023 - ieeexplore.ieee.org
Distributed computing frameworks are the fundamental component of distributed computing
systems. They provide an essential way to support the efficient processing of big data on …