GARLSched: Generative adversarial deep reinforcement learning task scheduling optimization for large-scale high performance computing systems

J Li, X Zhang, J Wei, Z Ji, Z Wei - Future Generation Computer Systems, 2022 - Elsevier
Efficient task scheduling has become increasingly complex as the number and types of tasks
proliferate and the size of computing resources grows in large-scale distributed high …

A deep reinforcement learning approach to resource management in hybrid clouds harnessing renewable energy and task scheduling

J Zhao, MA Rodríguez, R Buyya - 2021 IEEE 14th International …, 2021 - ieeexplore.ieee.org
The use of cloud computing for delivering application services over the Internet has gained
rapid traction. Since the beginning of the COVID-19 global pandemic, the work from home …

Deep learning-based job placement in distributed machine learning clusters with heterogeneous workloads

Y Bao, Y Peng, C Wu - IEEE/ACM Transactions on Networking, 2022 - ieeexplore.ieee.org
Nowadays, most leading IT companies host a variety of distributed machine learning (ML)
workloads in ML clusters to support AI-driven services, such as speech recognition, machine …

Drag-jdec: A deep reinforcement learning and graph neural network-based job dispatching model in edge computing

Z Yu, W Liu, X Liu, G Wang - 2021 IEEE/ACM 29th International …, 2021 - ieeexplore.ieee.org
The emergence of edge computing eases latency pressure on the remote cloud and computing
pressure on terminal devices, providing new solutions for real-time applications. Jobs of end …

AutoSched: An Adaptive Self-configured Framework for Scheduling Deep Learning Training Workloads

W Gao, X Zhang, S Huang, S Guo, P Sun… - Proceedings of the 38th …, 2024 - dl.acm.org
Modern Deep Learning Training (DLT) schedulers in GPU datacenters are designed to be
very sophisticated with many configurations. These configurations need to be adjusted …

SCHED²: Scheduling Deep Learning Training via Deep Reinforcement Learning

Y Luan, X Chen, H Zhao, Z Yang… - 2019 IEEE Global …, 2019 - ieeexplore.ieee.org
Today's companies and organizations build GPU clusters for efficient deep learning training
(DLT). However, the inherent heterogeneity of DLT workloads makes it challenging to …

DCloud: deadline-aware resource allocation for cloud computing jobs

D Li, C Chen, J Guan, Y Zhang… - IEEE transactions on …, 2015 - ieeexplore.ieee.org
With the tremendous growth of cloud computing, it is increasingly critical to provide
quantifiable performance to tenants and to improve resource utilization for the cloud …

Optimizing task offloading and resource allocation in edge-cloud networks: a DRL approach

I Ullah, HK Lim, YJ Seok, YH Han - Journal of Cloud Computing, 2023 - Springer
Edge-cloud computing is an emerging approach in which tasks are offloaded from mobile
devices to edge or cloud servers. However, task offloading may result in increased energy …

A deep reinforcement learning-based task scheduling algorithm for energy efficiency in data centers

P Song, C Chi, K Ji, Z Liu, F Zhang… - 2021 International …, 2021 - ieeexplore.ieee.org
Cloud data centers provide end-users with a wide range of application scenarios, including
scientific computing, smart grids, etc. The number and size of data centers have rapidly …

Enhancing generalization of computation offloading policies in novel mobile edge computing environments by exploiting experience utility

T Ren, J Niu, Y Qiu - Journal of Systems Architecture, 2022 - Elsevier
Recent years have witnessed the booming development of mobile devices (MDs), along
with the surging popularity of mobile applications. Despite the unceasing progress of MDs …