A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

A survey of data partitioning and sampling methods to support big data analysis

MS Mahmud, JZ Huang, S Salloum… - Big Data Mining and …, 2020 - ieeexplore.ieee.org
Computer clusters with the shared-nothing architecture are the major computing platforms
for big data processing and analysis. In cluster computing, data partitioning and sampling …

Silicon debug: scan chains alone are not enough

GJ Van Rootselaar, B Vermeulen - … Test Conference 1999 …, 1999 - ieeexplore.ieee.org
For today's multi-million transistor designs, existing design verification techniques cannot
guarantee that first silicon is designed error free. Therefore, techniques are necessary to …

A comparative review of job scheduling for MapReduce

D Yoo, KM Sim - 2011 IEEE International Conference on Cloud …, 2011 - ieeexplore.ieee.org
MapReduce is an emerging paradigm for data intensive processing with support of cloud
computing technology. MapReduce provides convenient programming interfaces to …

Map-Balance-Reduce: An improved parallel programming model for load balancing of MapReduce

J Li, Y Liu, J Pan, P Zhang, W Chen, L Wang - Future Generation Computer …, 2020 - Elsevier
With the advent of the era of big data, the demand of massive data processing applications
is also growing. Currently, MapReduce is the most commonly used data processing …

StreamMR: an optimized MapReduce framework for AMD GPUs

M Elteir, H Lin, W Feng… - 2011 IEEE 17th …, 2011 - ieeexplore.ieee.org
MapReduce is a programming model from Google that facilitates parallel processing on a
cluster of thousands of commodity computers. The success of MapReduce in cluster …

A two-stage data processing algorithm to generate random sample partitions for big data analysis

C Wei, S Salloum, TZ Emara, X Zhang… - … Conference on Cloud …, 2018 - Springer
To enable the individual data block files of a distributed big data set to be used as random
samples for big data analysis, a two-stage data processing (TSDP) algorithm is proposed in …

An asymptotic ensemble learning framework for big data analysis

S Salloum, JZ Huang, Y He, X Chen - IEEE Access, 2018 - ieeexplore.ieee.org
In order to enable big data analysis when data volume goes beyond the available
computing resources, we propose a new method for big data analysis. This method uses …

Stream as you go: The case for incremental data access and processing in the cloud

R Kienzler, R Bruggmann… - 2012 IEEE 28th …, 2012 - ieeexplore.ieee.org
Cloud infrastructures promise to provide high-performance and cost-effective solutions to
large-scale data processing problems. In this paper, we identify a common class of data …

Hierarchical MapReduce programming model and scheduling algorithms

Y Luo, B Plale - 2012 12th IEEE/ACM International Symposium …, 2012 - ieeexplore.ieee.org
We present a Hierarchical MapReduce framework that gathers computation resources from
different clusters and runs MapReduce jobs across them. The applications implemented in …