A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

Analysis of Hadoop MapReduce scheduling in heterogeneous environment

K Kalia, N Gupta - Ain Shams Engineering Journal, 2021 - Elsevier
Over the last decade, several advancements have happened in distributed and parallel
computing. A lot of data is generated daily from various sources, and this speedy data …

A survey of research on the MapReduce parallel programming model

李建江, 崔健, 王聃, 严林, 黄义双 - Acta Electronica Sinica (电子学报), 2011 - ejournal.org.cn
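Through well-defined interfaces and a runtime support library, the MapReduce parallel programming
model can automatically execute large-scale computing tasks in parallel, hiding low-level
implementation details and reducing the difficulty of parallel programming. On the current state of
MapReduce-related research in China and abroad, this paper …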

Tarazu: optimizing MapReduce on heterogeneous clusters

F Ahmad, ST Chakradhar, A Raghunathan… - ACM SIGARCH …, 2012 - dl.acm.org
Data center-scale clusters are evolving towards heterogeneous hardware for power, cost,
differentiated price-performance, and other reasons. MapReduce is a well-known …

Matchmaking: A new MapReduce scheduling technique

C He, Y Lu, D Swanson - 2011 IEEE Third International …, 2011 - ieeexplore.ieee.org
MapReduce is a powerful platform for large-scale data processing. To achieve good
performance, a MapReduce scheduler must avoid unnecessary data transmission by …

Erasure code replication revisited

WK Lin, DM Chiu, YB Lee - … on Peer-to-Peer Computing, 2004 …, 2004 - ieeexplore.ieee.org
Erasure coding is a technique for achieving high availability and reliability in storage and
communication systems. We revisit the analysis of erasure code replication and point out …

A review on big data real-time stream processing and its scheduling techniques

N Tantalaki, S Souravlas… - International Journal of …, 2020 - Taylor & Francis
Over the last decade, several interconnected disruptions have happened in the large scale
distributed and parallel computing landscape. The volume of data currently produced by …

Straggler root-cause and impact analysis for massive-scale virtualized cloud datacenters

P Garraghan, X Ouyang, R Yang… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Increased complexity and scale of virtualized distributed systems has resulted in the
manifestation of emergent phenomena substantially affecting overall system performance …

Hadoop based defense solution to handle distributed denial of service (DDoS) attacks

S Tripathi, B Gupta, A Almomani, A Mishra, S Veluru - 2013 - scirp.org
Distributed denial of service (DDoS) attacks continue to grow as a threat to organizations
worldwide. From the first known attack in 1999 to the highly publicized Operation Ababil, the …

MapReduce parallel programming model: a state-of-the-art survey

R Li, H Hu, H Li, Y Wu, J Yang - International Journal of Parallel …, 2016 - Springer
With the development of information technologies, we have entered the era of Big Data.
Google's MapReduce programming model and its open-source implementation in Apache …