MapReduce optimization using regulated dynamic prioritization

T Sandholm, K Lai - Proceedings of the eleventh international joint …, 2009 - dl.acm.org
We present a system for allocating resources in shared data and compute clusters that
improves MapReduce job scheduling in three ways. First, the system uses regulated and …

MRSim: A discrete event based MapReduce simulator

S Hammoud, M Li, Y Liu, NK Alham… - … Conference on Fuzzy …, 2010 - ieeexplore.ieee.org
Recently MapReduce programming model is becoming popular for large scale data
intensive distributed applications due to its efficiency, simplicity and ease of use. The …

Play it again, SimMR!

A Verma, L Cherkasova… - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
A typical MapReduce cluster is shared among different users and multiple applications. A
challenging problem in such shared environments is the ability to efficiently control resource …

Disaggregated FPGAs: Network performance comparison against bare-metal servers, virtual machines and Linux containers

J Weerasinghe, F Abel, C Hagleitner… - … Conference on Cloud …, 2016 - ieeexplore.ieee.org
FPGAs (Field Programmable Gate Arrays) are making their way into data centers (DC). They
are used as accelerators to boost the compute power of individual server nodes and to …

MReC4.5: C4.5 ensemble classification with MapReduce

G Wu, H Li, X Hu, Y Bi, J Zhang… - 2009 fourth ChinaGrid …, 2009 - ieeexplore.ieee.org
Classification is a significant technique in data mining research and applications. C4.5 is a
widely used classification method, and ensemble learning adopts a parallel and distributed …

MRSG – a MapReduce simulator over SimGrid

W Kolberg, PB Marcos, JCS Anjos, AKS Miyazaki… - Parallel Computing, 2013 - Elsevier
MapReduce is a parallel programming model to process large datasets, and it was inspired
by the Map and Reduce primitives from functional languages. Its first implementation was …

Modeling performances of concurrent big data applications

A Castiglione, M Gribaudo, M Iacono… - Software: Practice and …, 2015 - Wiley Online Library
Big Data applications are characterized by a non‐negligible number of complex parallel
transactions on a huge amount of data that continuously varies, generally increasing over …

Evaluating MapReduce system performance: A simulation approach

G Wang - 2012 - search.proquest.com
Scale of data generated and processed is exploding in the Big Data era. The MapReduce
system popularized by open-source Hadoop is a powerful tool for the exploding data …

Large-scale SMS messages mining based on map-reduce

T Xia - … Symposium on Computational Intelligence and Design, 2008 - ieeexplore.ieee.org
Mining the popular SMS messages in a short period of time is very valuable. However,
traditional OLAP-based mining method is not suitable for this very large scale dataset. In this …

A 2-tier clustering algorithm with map-reduce

J Zhang, G Wu, H Li, X Hu, X Wu - 2010 Fifth Annual ChinaGrid …, 2010 - ieeexplore.ieee.org
In the field of data mining, clustering is one of the important methods. K-Means is a typical
distance-based clustering algorithm; 2-tier clustering should implement scalable clustering …