Serving machine learning workloads in resource constrained environments: A serverless deployment example

A Christidis, R Davies… - 2019 IEEE 12th …, 2019 - ieeexplore.ieee.org
Deployed AI platforms typically ship with bulky system architectures which present
bottlenecks and a high risk of failure. A serverless deployment can mitigate these factors and …

Enabling serverless deployment of large-scale AI workloads

A Christidis, S Moschoyiannis, CH Hsu… - IEEE Access, 2020 - ieeexplore.ieee.org
We propose a set of optimization techniques for transforming a generic AI codebase so that
it can be successfully deployed to a restricted serverless environment, without compromising …

A deep reinforcement learning based algorithm for time and cost optimized scaling of serverless applications

A Mampage, S Karunasekera, R Buyya - arXiv preprint arXiv:2308.11209, 2023 - arxiv.org
Serverless computing has gained strong traction in the cloud computing community in
recent years. Among the many benefits of this novel computing model, the rapid auto-scaling …

Toward sustainable serverless computing

P Patros, J Spillner, AV Papadopoulos… - IEEE Internet …, 2021 - ieeexplore.ieee.org
Although serverless computing generally involves executing short-lived “functions,” the
increasing migration to this computing paradigm requires careful consideration of energy …

MLProxy: SLA-aware reverse proxy for machine learning inference serving on serverless computing platforms

N Mahmoudi, H Khazaei - arXiv preprint arXiv:2202.11243, 2022 - arxiv.org
Serving machine learning inference workloads on the cloud is still a challenging task on the
production level. Optimal configuration of the inference workload to meet SLA requirements …

Serverless data science - are we there yet? A case study of model serving

Y Wu, TTA Dinh, G Hu, M Zhang, YM Chee… - Proceedings of the 2022 …, 2022 - dl.acm.org
Machine learning (ML) is an important part of modern data science applications. Data
scientists today have to manage the end-to-end ML life cycle that includes both model …

AMPS-Inf: Automatic model partitioning for serverless inference with cost efficiency

J Jarachanthan, L Chen, F Xu, B Li - Proceedings of the 50th …, 2021 - dl.acm.org
The salient pay-per-use nature of serverless computing has driven its continuous
penetration as an alternative computing paradigm for various workloads. Yet, challenges …

Deep reinforcement learning for application scheduling in resource-constrained, multi-tenant serverless computing environments

A Mampage, S Karunasekera, R Buyya - Future Generation Computer …, 2023 - Elsevier
Serverless computing has sparked massive interest among both cloud service providers
and their clientele in recent years. This model entails the shift of the entire matter of resource …

Performance evaluation of data-centric workloads in serverless environments

AM Nestorov, J Polo, C Misale… - 2021 IEEE 14th …, 2021 - ieeexplore.ieee.org
Serverless computing is a cloud-based execution paradigm that allows provisioning
resources on-demand, freeing developers from infrastructure management and operational …

Improving application migration to serverless computing platforms: Latency mitigation with keep-alive workloads

W Lloyd, M Vu, B Zhang, O David… - 2018 IEEE/ACM …, 2018 - ieeexplore.ieee.org
Serverless computing platforms provide Function(s)-as-a-Service (FaaS) to end users while
promising reduced hosting costs, high availability, fault tolerance, and dynamic elasticity for …