Deep learning workload scheduling in GPU datacenters: A survey

Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo… - ACM Computing …, 2024 - dl.acm.org
Deep learning (DL) has demonstrated remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …

Deep learning workload scheduling in GPU datacenters: Taxonomy, challenges and vision

W Gao, Q Hu, Z Ye, P Sun, X Wang, Y Luo… - arXiv preprint arXiv …, 2022 - arxiv.org
Deep learning (DL) has flourished in a wide variety of fields. The development of a DL
model is a time-consuming and resource-intensive procedure. Hence, dedicated GPU …

TBDB: Token bucket-based dynamic batching for resource scheduling supporting neural network inference in intelligent consumer electronics

H Gao, B Qiu, Y Wang, S Yu, Y Xu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Consumer electronics such as mobile phones, wearable devices, and vehicle electronics
run many intelligent applications, including voice commands, machine translation, and face …
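As a rough illustration of the batching idea named in the title, the sketch below shows a generic token-bucket-controlled dynamic batcher in Python. It is not the paper's TBDB algorithm; the class name TokenBucketBatcher and the rate, capacity, and max_batch parameters are assumptions made for this example.

```python
import time
from collections import deque


class TokenBucketBatcher:
    """Generic token-bucket dynamic batcher (illustrative sketch only).

    Requests accumulate in a queue; a batch is released only when a
    token is available, so the token refill rate throttles how often
    new batches reach the accelerator.
    """

    def __init__(self, rate, capacity, max_batch):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum tokens held by the bucket
        self.max_batch = max_batch  # upper bound on batch size
        self.tokens = capacity
        self.last_refill = time.monotonic()
        self.queue = deque()

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def submit(self, request):
        """Enqueue an inference request (any payload object)."""
        self.queue.append(request)

    def maybe_dispatch(self):
        """Return a batch of queued requests if a token is available, else None."""
        self._refill()
        if not self.queue or self.tokens < 1:
            return None
        self.tokens -= 1
        size = min(self.max_batch, len(self.queue))
        return [self.queue.popleft() for _ in range(size)]
```

A serving loop would call submit() as requests arrive and poll maybe_dispatch() to hand completed batches to the inference engine; the token rate bounds how often the accelerator is interrupted with a new batch, trading per-request latency against utilization.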

Workflow performance prediction based on graph structure aware deep attention neural network

J Yu, M Gao, Y Li, Z Zhang, WH Ip, KL Yung - Journal of Industrial …, 2022 - Elsevier
With the rapid growth of cloud computing, efficient operational optimization and resource
scheduling of complex cloud business processes rely on real-time and accurate …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential to accelerating the latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

Lucid: A non-intrusive, scalable and interpretable scheduler for deep learning training jobs

Q Hu, M Zhang, P Sun, Y Wen, T Zhang - Proceedings of the 28th ACM …, 2023 - dl.acm.org
While recent deep learning workload schedulers exhibit excellent performance, they are difficult
to deploy in practice due to substantial drawbacks, including inflexible intrusive …

Cloud-native computing: A survey from the perspective of services

S Deng, H Zhao, B Huang, C Zhang… - Proceedings of the …, 2024 - ieeexplore.ieee.org
The development of cloud computing delivery models inspires the emergence of cloud-
native computing. Cloud-native computing, as the most influential development principle for …

Prediction-based scheduling techniques for cloud data center's workload: a systematic review

S Kashyap, A Singh - Cluster Computing, 2023 - Springer
A cloud data center provides various facilities such as storage, data accessibility, and
running many specific applications on cloud resources. The unpredictable demand for …

QoS-aware co-scheduling for distributed long-running applications on shared clusters

J Zhu, R Yang, X Sun, T Wo, C Hu… - … on Parallel and …, 2022 - ieeexplore.ieee.org
To achieve a high degree of resource utilization, production clusters need to co-schedule
diverse workloads, including both batch analytic jobs with short-lived tasks and long-running …

Performance prediction of deep learning applications training in GPU as a service systems

M Lattuada, E Gianniti, D Ardagna, L Zhang - Cluster Computing, 2022 - Springer
Data analysts predict that the GPU as a service (GPUaaS) market will grow to support 3D
models, animated video processing, gaming, and deep learning model training. The main …