A survey on scheduling techniques in computing and network convergence

S Tang, Y Yu, H Wang, G Wang, W Chen… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The computing demand for massive applications has led to the ubiquitous deployment of
computing power. This trend results in the urgent need for higher-level computing resource …

Serving heterogeneous machine learning models on Multi-GPU servers with Spatio-Temporal sharing

S Choi, S Lee, Y Kim, J Park, Y Kwon… - 2022 USENIX Annual …, 2022 - usenix.org
As machine learning (ML) techniques are applied to a widening range of applications, high
throughput ML inference serving has become critical for online services. Such ML inference …

Multi-dimensional resource allocation in distributed data centers using deep reinforcement learning

W Wei, H Gu, K Wang, J Li, X Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
With the development of edge-cloud computing technologies, distributed data centers (DCs)
have been extensively deployed across the global Internet. Since different …

Deep reinforcement learning enhanced greedy optimization for online scheduling of batched tasks in cloud HPC systems

Y Yang, H Shen - IEEE Transactions on Parallel and …, 2021 - ieeexplore.ieee.org
In a large cloud data center HPC system, a critical problem is how to allocate the submitted
tasks to heterogeneous servers that will achieve the goal of maximizing the system's gain …

Conditional generative model based predicate-aware query approximation

N Sheoran, S Mitra, V Porwal, S Ghetia… - Proceedings of the …, 2022 - ojs.aaai.org
The goal of Approximate Query Processing (AQP) is to provide very fast but
"accurate enough" results for costly aggregate queries thereby improving user experience in …

Accelerating serverless computing by harvesting idle resources

H Yu, H Wang, J Li, X Yuan, SJ Park - … of the ACM Web Conference 2022, 2022 - dl.acm.org
Serverless computing automates fine-grained resource scaling and simplifies the
development and deployment of online services with stateless functions. However, it is still …

A parallel deep reinforcement learning framework for controlling industrial assembly lines

A Tortorelli, M Imran, F Delli Priscoli, F Liberati - Electronics, 2022 - mdpi.com
Decision-making in a complex, dynamic, interconnected, and data-intensive industrial
environment can be improved with the assistance of machine-learning techniques. In this …

Freyr: Harvesting Idle Resources in Serverless Computing Via Deep Reinforcement Learning

H Yu, H Wang - IEEE Transactions on Parallel and Distributed …, 2024 - ieeexplore.ieee.org
Serverless computing has revolutionized online service development and deployment with
easy-to-use operations, auto-scaling, fine-grained resource allocation, and pay-as-you-go …

Joint Resource Overbooking and Container Scheduling in Edge Computing

Z Tang, F Mou, J Lou, W Jia, Y Wu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Containers have gained popularity in Edge Computing (EC) networks due to their
lightweight and flexible deployment advantage. In resource-constrained EC environments …

Batch jobs load balancing scheduling in cloud computing using distributional reinforcement learning

T Li, S Ying, Y Zhao, J Shang - IEEE Transactions on Parallel …, 2023 - ieeexplore.ieee.org
In cloud computing, how to reasonably allocate computing resources for batch jobs to
ensure the load balance of dynamic clusters and meet user requests is an important and …