Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools

R Mayer, HA Jacobsen - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) has had an immense success in the recent past, leading to
state-of-the-art results in various domains, such as image recognition and natural language processing …

Offloading machine learning to programmable data planes: A systematic survey

R Parizotto, BL Coelho, DC Nunes, I Haque… - ACM Computing …, 2023 - dl.acm.org
The demand for machine learning (ML) has increased significantly in recent decades,
enabling several applications, such as speech recognition, computer vision, and …

Chasing carbon: The elusive environmental footprint of computing

U Gupta, YG Kim, S Lee, J Tse, HHS Lee… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Given recent algorithm, software, and hardware innovation, computing has enabled a
plethora of new applications. As computing becomes increasingly ubiquitous, however, so …

INFaaS: Automated model-less inference serving

F Romero, Q Li, NJ Yadwadkar… - 2021 USENIX Annual …, 2021 - usenix.org
Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …

MArk: Exploiting cloud services for cost-effective, SLO-aware machine learning inference serving

C Zhang, M Yu, W Wang, F Yan - 2019 USENIX Annual Technical …, 2019 - usenix.org
The advances of Machine Learning (ML) have sparked a growing demand of
ML-as-a-Service: developers train ML models and publish them in the cloud as online services to …

BATCH: Machine learning inference serving on serverless platforms with adaptive batching

A Ali, R Pinciroli, F Yan, E Smirni - … International Conference for …, 2020 - ieeexplore.ieee.org
Serverless computing is a new pay-per-use cloud service paradigm that automates resource
scaling for stateless functions and can potentially facilitate bursty machine learning serving …

Atoll: A scalable low-latency serverless platform

A Singhvi, A Balasubramanian, K Houck… - Proceedings of the …, 2021 - dl.acm.org
With user-facing apps adopting serverless computing, good latency performance of
serverless platforms has become a strong fundamental requirement. However, it is difficult to …

INFless: a native serverless system for low-latency, high-throughput inference

Y Yang, L Zhao, Y Li, H Zhang, J Li, M Zhao… - Proceedings of the 27th …, 2022 - dl.acm.org
Modern websites increasingly rely on machine learning (ML) to improve their business
efficiency. Developing and maintaining ML services incurs high costs for developers …

Kraken: Adaptive container provisioning for deploying dynamic DAGs in serverless platforms

VM Bhasi, JR Gunasekaran, P Thinakaran… - Proceedings of the …, 2021 - dl.acm.org
The growing popularity of microservices has led to the proliferation of online cloud
service-based applications, which are typically modelled as Directed Acyclic Graphs (DAGs) …

Cocktail: A multidimensional optimization for model serving in cloud

JR Gunasekaran, CS Mishra, P Thinakaran… - … USENIX Symposium on …, 2022 - usenix.org
With a growing demand for adopting ML models for a variety of application services, it is vital
that the frameworks serving these models are capable of delivering highly accurate …