Serverless on machine learning: A systematic mapping study

A Barrak, F Petrillo, F Jaafar - IEEE Access, 2022 - ieeexplore.ieee.org
Machine Learning Operations (MLOps) is an approach to managing the entire lifecycle of a
machine learning model. It has evolved over recent years and has started attracting many …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential for accelerating latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

MXFaaS: Resource sharing in serverless environments for parallelism and efficiency

J Stojkovic, T Xu, H Franke, J Torrellas - Proceedings of the 50th Annual …, 2023 - dl.acm.org
Although serverless computing is a popular paradigm, current serverless environments have
high overheads. Recently, it has been shown that serverless workloads frequently exhibit …

AsyFunc: A high-performance and resource-efficient serverless inference system via asymmetric functions

Q Pei, Y Yuan, H Hu, Q Chen, F Liu - … of the 2023 ACM Symposium on …, 2023 - dl.acm.org
Recent advances in deep learning (DL) have spawned various intelligent cloud services
with well-trained DL models. Nevertheless, it is nontrivial to maintain the desired end-to-end …

SIMPPO: A scalable and incremental online learning framework for serverless resource management

H Qiu, W Mao, A Patke, C Wang, H Franke… - Proceedings of the 13th …, 2022 - dl.acm.org
Serverless Function-as-a-Service (FaaS) offers improved programmability for customers, yet
it is not server-"less" and comes at the cost of more complex infrastructure management (e.g., …

Fisc: a large-scale cloud-native-oriented file system

Q Li, L Chen, X Wang, S Huang, Q Xiang… - … USENIX Conference on …, 2023 - usenix.org
The wide adoption of Cloud Native shifts the boundary between cloud users and CSPs
(Cloud Service Providers) from VM-based infrastructure to container-based applications …

Sustainable serverless computing with cold-start optimization and automatic workflow resource scheduling

S Pan, H Zhao, Z Cai, D Li, R Ma… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, serverless computing has garnered significant attention owing to its high
scalability, pay-as-you-go billing model, and efficient resource management provided by …

Optimus: Warming Serverless ML Inference via Inter-Function Model Transformation

Z Hong, J Lin, S Guo, S Luo, W Chen… - Proceedings of the …, 2024 - dl.acm.org
Serverless ML inference is an emerging cloud computing paradigm for low-cost, easy-to-
manage inference services. In serverless ML inference, each call is executed in a container; …

AutoInfer: Self-Driving Management for Resource-Efficient, SLO-Aware Machine Learning Inference in GPU Clusters

B Cai, Q Guo, X Dong - IEEE Internet of Things Journal, 2022 - ieeexplore.ieee.org
As the Internet of Things (IoT) keeps growing, IoT-side intelligence services, such as intelligent
personal assistants, healthcare surveillance, and smart home services, offload more and more …

Hybrid Computing for Interactive Datacenter Applications

P Patel, K Lim, K Jhunjhunwalla, A Martinez… - arXiv preprint arXiv …, 2023 - arxiv.org
Field-Programmable Gate Arrays (FPGAs) are more energy efficient and cost effective than
CPUs for a wide variety of datacenter applications. Yet, for latency-sensitive and bursty …