Orchestrating big data analysis workflows in the cloud: research challenges, survey, and future directions

M Barika, S Garg, AY Zomaya, L Wang… - ACM Computing …, 2019 - dl.acm.org
Interest in processing big data has increased rapidly to gain insights that can transform
businesses, government policies, and research outcomes. This has led to advancement in …

A survey of data partitioning and sampling methods to support big data analysis

MS Mahmud, JZ Huang, S Salloum… - Big Data Mining and …, 2020 - ieeexplore.ieee.org
Computer clusters with the shared-nothing architecture are the major computing platforms
for big data processing and analysis. In cluster computing, data partitioning and sampling …

A classification framework for straggler mitigation and management in a heterogeneous Hadoop cluster: A state-of-art survey

KL Bawankule, RK Dewang, AK Singh - Journal of King Saud University …, 2022 - Elsevier
Hadoop is the most economical and cheap software framework that allows distributed
storage and parallel processing of more extensive data sets. Hadoop distributed file system …

Data locality-aware and QoS-aware dynamic cloud workflow scheduling in Hadoop for heterogeneous environment

F Ding, M Ma - International Journal of Web and Grid …, 2023 - inderscienceonline.com
Hadoop has become a popular data-parallel computing framework for data-intensive
scientific applications in recent years. Most scientific applications employ workflows to …

Intelligent data compression policy for Hadoop performance optimization

A Ashu, MW Hussain, D Sinha Roy… - Proceedings of the 11th …, 2021 - Springer
Hadoop can deal with Zeta-level data, but the huge request for Disk I/O and Network
utilization often appears as the limitations in Hadoop. During different job execution phases …

A counter based approach for reducer placement with augmented Hadoop rackawareness

MW Hussain, KH REDDY… - Turkish Journal of …, 2021 - journals.tubitak.gov.tr
As the data-driven paradigm for intelligent systems design is gaining prominence,
performance requirements have become very stringent, leading to numerous fine-tuned …

Modeling and assessing reliability of service-oriented internet of things

RK Behera, KHK Reddy… - International Journal of …, 2019 - Taylor & Francis
In recent years, technological innovations in the fields of sensing, computing, and
communication have seen unprecedented advancements. Particularly, the explosive …

Intelligent data placement in heterogeneous hadoop cluster

SS Paik, RS Goswami, DS Roy, KH Reddy - Smart and Innovative Trends …, 2018 - Springer
The MapReduce programming model and Hadoop have become the de facto standard for
data-intensive applications. Hadoop tasks are mapped to certain nodes within the Hadoop …

A counter-based profiling scheme for improving locality through data and reducer placement

MW Hussain, DS Roy - Advances in Machine Learning for Big Data …, 2022 - Springer
Hadoop has been regarded as the de-facto standard for handling data-intensive distributed
applications with its popular storage and processing engine called as the Hadoop …

Enabling indirect link discovery between SDN switches

MW Hussain, D Sinha Roy - … of the International Conference on Computing …, 2021 - Springer
The removal of the control plane from a Software Defined Network (SDN) helps avoid
flexibility issues that exist in the traditional networks thus enabling SDN to leverage more …