Job scheduling for big data analytical applications in clouds: A taxonomy study

Y Kang, L Pan, S Liu - Future Generation Computer Systems, 2022 - Elsevier
Cloud environments have provided great help for the development of big data. Today, cloud
environments have become the most active platform for big data analysis applications …

Task scheduling in big data platforms: a systematic literature review

M Soualhia, F Khomh, S Tahar - Journal of Systems and Software, 2017 - Elsevier
Context: Hadoop, Spark, Storm, and Mesos are very well known frameworks in both
research and industrial communities that allow expressing and processing distributed …

Network-aware locality scheduling for distributed data operators in data centers

L Cheng, Y Wang, Q Liu, DHJ Epema… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Large data centers are currently the mainstream infrastructures for big data processing. As
one of the most fundamental tasks in these environments, the efficient execution of …

Latency-optimal pyramid-based joint communication and computation scheduling for distributed edge computing

Q Chen, K Wang, S Guo, T Shi, J Li… - … -IEEE Conference on …, 2023 - ieeexplore.ieee.org
By combining edge computing and parallel computing, distributed edge computing has
emerged as a new paradigm to accelerate computation at the edge. Considering the …

Dynamic memory-aware scheduling in spark computing environment

Z Tang, A Zeng, X Zhang, L Yang, K Li - Journal of Parallel and Distributed …, 2020 - Elsevier
Scheduling plays an important role in improving the performance of big data-parallel
processing. Spark is an in-memory parallel computing framework that uses a multi-threaded …

Delay-optimal distributed edge computing in wireless edge networks

X Gong - IEEE INFOCOM 2020-IEEE conference on computer …, 2020 - ieeexplore.ieee.org
By integrating edge computing with parallel computing, distributed edge computing (DEC)
makes use of distributed devices in edge networks to perform computing in parallel, which …

VirtCo: joint coflow scheduling and virtual machine placement in cloud data centers

D Shen, J Luo, F Dong, J Zhang - Tsinghua Science and …, 2019 - ieeexplore.ieee.org
Cloud data centers, such as Amazon EC2, host myriad big data applications using Virtual
Machines (VMs). As these applications are communication-intensive, optimizing network …

A hybrid task scheduling scheme for heterogeneous vehicular edge systems

X Chen, N Thomas, T Zhan, J Ding - IEEE Access, 2019 - ieeexplore.ieee.org
Enhanced wireless communication improves the connectivity of vehicular networks in which
vehicles are utilized as infrastructures for communication and computation. Thus, a new …

Cost-based Data Prefetching and Scheduling in Big Data Platforms over Tiered Storage Systems

H Herodotou, E Kakoulli - ACM Transactions on Database Systems, 2023 - dl.acm.org
The use of storage tiering is becoming popular in data-intensive compute clusters due to the
recent advancements in storage technologies. The Hadoop Distributed File System, for …

Energy-efficient scheduling algorithms based on task clustering in heterogeneous spark clusters

W Shi, H Li, J Guan, H Zeng - Parallel Computing, 2022 - Elsevier
Spark is widely used for its fast in-memory processing. It is important to improve energy
efficiency under deadline constraints. In this paper, a Task Performance Clustering of Best …