Big data resource management & networks: Taxonomy, survey, and future directions

FM Awaysheh, M Alazab, S Garg… - … Surveys & Tutorials, 2021 - ieeexplore.ieee.org
Big Data (BD) platforms have a long tradition of leveraging trends and technologies from the
broader computer network and communication community. For several years, dedicated …

MapReduce: Review and open challenges

IAT Hashem, NB Anuar, A Gani, I Yaqoob, F Xia… - Scientometrics, 2016 - Springer
The continuous increase in computational capacity over the past years has produced an
overwhelming flow of data or big data, which exceeds the capabilities of conventional …

On the optimal recovery threshold of coded matrix multiplication

S Dutta, M Fahim, F Haddadpour… - IEEE Transactions …, 2019 - ieeexplore.ieee.org
We provide novel coded computation strategies for distributed matrix-matrix products that
outperform the recent “Polynomial code” constructions in recovery threshold, i.e., the required …

Improving MapReduce performance in heterogeneous environments.

M Zaharia, A Konwinski, AD Joseph, RH Katz, I Stoica - OSDI, 2008 - usenix.org
MapReduce is emerging as an important programming model for large-scale data-parallel
applications such as web indexing, data mining, and scientific simulation. Hadoop is an …

Collaborative learning based straggler prevention in large‐scale distributed computing framework

S Deshmukh, K Thirupathi Rao… - Security and …, 2021 - Wiley Online Library
Modern big data applications tend to prefer a cluster computing approach as they are linked
to the distributed computing framework that serves users jobs as per demand. It performs …

Adaptive resource provisioning for the cloud using online bin packing

W Song, Z Xiao, Q Chen, H Luo - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
Data center applications present significant opportunities for multiplexing server resources.
Virtualization technology makes it easy to move running application across physical …

A speculative approach to spatial‐temporal efficiency with multi‐objective optimization in a heterogeneous cloud environment

Q Liu, W Cai, J Shen, Z Fu, X Liu… - Security and …, 2016 - Wiley Online Library
A heterogeneous cloud system, for example, a Hadoop 2.6.0 platform, provides distributed
but cohesive services with rich features on large‐scale management, reliability, and error …

Libra: Lightweight data skew mitigation in MapReduce

Q Chen, J Yao, Z Xiao - IEEE Transactions on parallel and …, 2014 - ieeexplore.ieee.org
MapReduce is an effective tool for parallel data processing. One significant issue in practical
MapReduce applications is data skew: the imbalance in the amount of data assigned to …

Straggler root-cause and impact analysis for massive-scale virtualized cloud datacenters

P Garraghan, X Ouyang, R Yang… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Increased complexity and scale of virtualized distributed systems has resulted in the
manifestation of emergent phenomena substantially affecting overall system performance …

Lotaru: Locally predicting workflow task runtimes for resource management on heterogeneous infrastructures

J Bader, F Lehmann, L Thamsen, U Leser… - Future Generation …, 2024 - Elsevier
Many resource management techniques for task scheduling, energy and carbon efficiency,
and cost optimization in workflows rely on a-priori task runtime knowledge. Building runtime …