Serverless on machine learning: A systematic mapping study

A Barrak, F Petrillo, F Jaafar - IEEE Access, 2022 - ieeexplore.ieee.org
Machine Learning Operations (MLOps) is an approach to managing the entire lifecycle of a
machine learning model. It has evolved over recent years and has started attracting many …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential for accelerating latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

MXFaaS: Resource sharing in serverless environments for parallelism and efficiency

J Stojkovic, T Xu, H Franke, J Torrellas - Proceedings of the 50th Annual …, 2023 - dl.acm.org
Although serverless computing is a popular paradigm, current serverless environments have
high overheads. Recently, it has been shown that serverless workloads frequently exhibit …

AsyFunc: A high-performance and resource-efficient serverless inference system via asymmetric functions

Q Pei, Y Yuan, H Hu, Q Chen, F Liu - … of the 2023 ACM Symposium on …, 2023 - dl.acm.org
Recent advances in deep learning (DL) have spawned various intelligent cloud services
with well-trained DL models. Nevertheless, it is nontrivial to maintain the desired end-to-end …

SIMPPO: A scalable and incremental online learning framework for serverless resource management

H Qiu, W Mao, A Patke, C Wang, H Franke… - Proceedings of the 13th …, 2022 - dl.acm.org
Serverless Function-as-a-Service (FaaS) offers improved programmability for customers, yet
it is not server-"less" and comes at the cost of more complex infrastructure management (e.g., …

Fisc: a large-scale cloud-native-oriented file system

Q Li, L Chen, X Wang, S Huang, Q Xiang… - … USENIX Conference on …, 2023 - usenix.org
The wide adoption of Cloud Native shifts the boundary between cloud users and CSPs
(Cloud Service Providers) from VM-based infrastructure to container-based applications …

Sustainable serverless computing with cold-start optimization and automatic workflow resource scheduling

S Pan, H Zhao, Z Cai, D Li, R Ma… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, serverless computing has garnered significant attention owing to its high
scalability, pay-as-you-go billing model, and efficient resource management provided by …

Optimus: Warming Serverless ML Inference via Inter-Function Model Transformation

Z Hong, J Lin, S Guo, S Luo, W Chen… - Proceedings of the …, 2024 - dl.acm.org
Serverless ML inference is an emerging cloud computing paradigm for low-cost, easy-to-
manage inference services. In serverless ML inference, each call is executed in a container; …

AutoInfer: Self-Driving Management for Resource-Efficient, SLO-Aware Machine Learning Inference in GPU Clusters

B Cai, Q Guo, X Dong - IEEE Internet of Things Journal, 2022 - ieeexplore.ieee.org
As the Internet of Things (IoT) keeps growing, IoT-side intelligence services, such as intelligent
personal assistants, healthcare surveillance, and smart home services, offload more and more …

Hybrid Computing for Interactive Datacenter Applications

P Patel, K Lim, K Jhunjhunwalla, A Martinez… - arXiv preprint arXiv …, 2023 - arxiv.org
Field-Programmable Gate Arrays (FPGAs) are more energy efficient and cost effective than
CPUs for a wide variety of datacenter applications. Yet, for latency-sensitive and bursty …