Fifer: Tackling resource underutilization in the serverless era

JR Gunasekaran, P Thinakaran… - Proceedings of the 21st …, 2020 - dl.acm.org
Datacenters are witnessing a rapid surge in the adoption of serverless functions for
microservices-based applications. A vast majority of these microservices typically span less …

Nanily: A QoS-aware scheduling for DNN inference workload in clouds

X Tang, P Wang, Q Liu, W Wang… - 2019 IEEE 21st …, 2019 - ieeexplore.ieee.org
DNN inference is widely emerging as a service and must run at sub-second latency,
which requires GPU hardware for parallel acceleration. Prior works to improve the …

AutoSched: An Adaptive Self-configured Framework for Scheduling Deep Learning Training Workloads

W Gao, X Zhang, S Huang, S Guo, P Sun… - Proceedings of the 38th …, 2024 - dl.acm.org
Modern Deep Learning Training (DLT) schedulers in GPU datacenters are designed to be
very sophisticated with many configurations. These configurations need to be adjusted …

It's a Scheduling Affair: GROMACS in the Cloud with the KubeFlux Scheduler

C Misale, M Drocco, DJ Milroy… - … on Containers and …, 2021 - ieeexplore.ieee.org
In this work, we address the problem of running HPC workloads efficiently on Kubernetes
clusters. To do so, we compare Kubernetes' default scheduler with KubeFlux, a …

Pipe-torch: Pipeline-based distributed deep learning in a GPU cluster with heterogeneous networking

J Zhan, J Zhang - … Conference on Advanced Cloud and Big …, 2019 - ieeexplore.ieee.org
Because training a deep neural network (DNN) takes arduous amounts of time and
computation, often researchers expedite the training process via distributed parallel training …

Preemptive and low latency datacenter scheduling via lightweight containers

W Chen, X Zhou, J Rao - IEEE Transactions on Parallel and …, 2019 - ieeexplore.ieee.org
Datacenters are evolving to host heterogeneous workloads on shared clusters to reduce the
operational cost and achieve higher resource utilization. However, it is challenging to …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential to accelerating the latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

[PDF] Multi-tenant GPU clusters for deep learning workloads: Analysis and implications

M Jeon, S Venkataraman, J Qian… - Technical report …, 2018 - microsoft.com
With widespread advances in machine learning, a number of large enterprises are
beginning to incorporate machine learning models across a number of products. These …

Looking beyond GPUs for DNN scheduling on Multi-Tenant clusters

J Mohan, A Phanishayee, J Kulkarni… - … USENIX Symposium on …, 2022 - usenix.org
Training Deep Neural Networks (DNNs) is a popular workload in both enterprises and cloud
data centers. Existing schedulers for DNN training consider GPU as the dominant resource …

Astraea: A fair deep learning scheduler for multi-tenant GPU clusters

Z Ye, P Sun, W Gao, T Zhang, X Wang… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Modern GPU clusters are designed to support distributed Deep Learning jobs from multiple
tenants concurrently. Each tenant may have varied and dynamic resource demands …