SpotServe: Serving generative large language models on preemptible instances

X Miao, C Shi, J Duan, X Xi, D Lin, B Cui… - Proceedings of the 29th …, 2024 - dl.acm.org
The high computational and memory requirements of generative large language models
(LLMs) make it challenging to serve them cheaply. This paper aims to reduce the monetary …

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

J Duan, Z Song, X Miao, X Xi, D Lin, H Xu… - … USENIX Symposium on …, 2024 - usenix.org
Deep neural networks (DNNs) are becoming progressively large and costly to train. This
paper aims to reduce DNN training costs by leveraging preemptible instances on modern …

How different are the cloud workloads? Characterizing large-scale private and public cloud workloads

X Qin, M Ma, Y Zhao, J Zhang, C Du… - 2023 53rd Annual …, 2023 - ieeexplore.ieee.org
With the rapid development of cloud systems, an increasing number of service workloads
are deployed in the private cloud and/or public cloud. Although large cloud providers such …

Going Green for Less Green: Optimizing the Cost of Reducing Cloud Carbon Emissions

WA Hanafy, Q Liang, N Bashir, A Souza… - Proceedings of the 29th …, 2024 - dl.acm.org
The continued exponential growth of cloud datacenter capacity has increased awareness of
the carbon emissions when executing large compute-intensive workloads. To reduce carbon …

Making Cloud Spot Instance Interruption Events Visible

KH Kim, K Lee - Proceedings of the ACM on Web Conference 2024, 2024 - dl.acm.org
Public cloud computing providers offer a surplus of computing resources at a lower price
with a service of a spot instance. Despite the possible great cost savings from using spot …

Workload-Aware Live Migratable Cloud Instance Detector

J Lim, KH Kim, K Lee - 2024 IEEE 24th International …, 2024 - ieeexplore.ieee.org
Cloud computing provides a variety of distinct computing resources on demand. Supporting
live migration in the cloud can be beneficial to dynamically build a reliable and cost-optimal …

Microless: Cost-Efficient Hybrid Deployment of Microservices on IaaS VMs and Serverless

J Cheng, Y Zhao, Z Li, Q Chen, W Cui… - 2023 IEEE 29th …, 2023 - ieeexplore.ieee.org
Microservices have gained popularity as an architectural approach for developing scalable
and modular applications. Traditionally, microservice deployment relies on virtual machines …

Sky Computing with Intercloud Brokers

Z Wu - 2024 - search.proquest.com
In an era where digital infrastructure increasingly relies on cloud computing, the need for
flexible workload migration across clouds has become crucial. This need is particularly …

Serving with Spot GPUs in the Sky

Z Mao, T Griggs - people.eecs.berkeley.edu
Recent years have witnessed an explosive growth of large AI models, characterized by the
high cost of hosting AI services, and their demanding service requirement. We explore two …