Kubernetes scheduling: Taxonomy, ongoing issues and challenges

C Carrión - ACM Computing Surveys, 2022 - dl.acm.org
Continuous integration enables the development of microservices-based applications using
container virtualization technology. Container orchestration systems such as Kubernetes …

Machine learning-based orchestration of containers: A taxonomy and future directions

Z Zhong, M Xu, MA Rodriguez, C Xu… - ACM Computing Surveys …, 2022 - dl.acm.org
Containerization is a lightweight application virtualization technology, providing high
environmental consistency, operating system distribution portability, and resource isolation …

Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks

S Ghodrati, BH Ahn, JK Kim, S Kinzer… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …

Kraken: Adaptive container provisioning for deploying dynamic DAGs in serverless platforms

VM Bhasi, JR Gunasekaran, P Thinakaran… - Proceedings of the …, 2021 - dl.acm.org
The growing popularity of microservices has led to the proliferation of online cloud
service-based applications, which are typically modelled as Directed Acyclic Graphs (DAGs) …

Cocktail: A multidimensional optimization for model serving in cloud

JR Gunasekaran, CS Mishra, P Thinakaran… - … USENIX Symposium on …, 2022 - usenix.org
With a growing demand for adopting ML models for a variety of application services, it is vital
that the frameworks serving these models are capable of delivering highly accurate …

Fifer: Tackling resource underutilization in the serverless era

JR Gunasekaran, P Thinakaran… - Proceedings of the 21st …, 2020 - dl.acm.org
Datacenters are witnessing a rapid surge in the adoption of serverless functions for
microservices-based applications. A vast majority of these microservices typically span less …

Horus: Interference-aware and prediction-based scheduling in deep learning systems

G Yeung, D Borowiec, R Yang, A Friday… - … on Parallel and …, 2021 - ieeexplore.ieee.org
To accelerate the training of Deep Learning (DL) models, clusters of machines equipped
with hardware accelerators such as GPUs are leveraged to reduce execution time. State-of …

Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction

W Cui, H Zhao, Q Chen, N Zheng, J Leng… - Proceedings of the …, 2021 - dl.acm.org
While user-facing services experience diurnal load patterns, co-locating services improve
hardware utilization. Prior work on co-locating services on GPUs run queries sequentially …

A survey of Kubernetes scheduling algorithms

K Senjab, S Abbas, N Ahmed, AR Khan - Journal of Cloud Computing, 2023 - Springer
As cloud services expand, the need to improve the performance of data center infrastructure
becomes more important. High-performance computing, advanced networking solutions …

Deep learning workload scheduling in GPU datacenters: A survey

Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo… - ACM Computing …, 2024 - dl.acm.org
Deep learning (DL) has demonstrated its remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …