Optimizing inference serving on serverless platforms

A Ali, R Pinciroli, F Yan, E Smirni - Proceedings of the VLDB Endowment, 2022 - par.nsf.gov
Serverless computing is gaining popularity for machine learning (ML) serving workloads due
to its autonomous resource scaling, ease of use, and pay-per-use cost model. Existing …

Fedlesscan: Mitigating stragglers in serverless federated learning

M Elzohairy, M Chadha, A Jindal… - … Conference on Big …, 2022 - ieeexplore.ieee.org
Federated Learning (FL) is a machine learning paradigm that enables the training of a
shared global model across distributed clients while keeping the training data local. While …

Measuring the impact of gradient accumulation on cloud-based distributed training

Z Huang, B Jiang, T Guo, Y Liu - 2023 IEEE/ACM 23rd …, 2023 - ieeexplore.ieee.org
Gradient accumulation (GA) is a commonly adopted technique for addressing the GPU
memory shortage problem in model training. It reduces memory consumption at the cost of …

Exploring the impact of serverless computing on peer to peer training machine learning

A Barrak, R Trabelsi, F Jaafar… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
The increasing demand for computational power in big data and machine learning has
driven the development of distributed training methodologies. Among these, peer-to-peer …

Advancing Serverless ML Training Architectures via Comparative Approach

A Barrak, R Trabelsi, F Petrillo… - … ON PARALLEL AND …, 2024 - aminebarrak.github.io
The field of distributed machine learning (ML) faces increasing demands for scalable and
cost-effective training solutions, particularly in the context of large, complex models …

SPIRT: A fault-tolerant and reliable peer-to-peer serverless ML training architecture

A Barrak, M Jaziri, R Trabelsi, F Jaafar… - 2023 IEEE 23rd …, 2023 - ieeexplore.ieee.org
The advent of serverless computing has ushered in notable advancements in distributed
machine learning, particularly within parameter server-based architectures. Yet, the …

STRATA: Random Forests going Serverless

D Tomaras, S Buschjäger, V Kalogeraki… - Proceedings of the 25th …, 2024 - dl.acm.org
Serverless computing has received growing interest in recent years for supporting large-
scale machine learning tasks. However, training a machine learning model in a serverless …

Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance

A Barrak, F Petrillo, F Jaafar - arXiv preprint arXiv:2302.13995, 2023 - arxiv.org
Distributed Machine Learning refers to the practice of training a model on multiple
computers or devices that can be called nodes. Additionally, serverless computing is a new …

SERVERLESS CLOUD COMPUTING DEPLOYMENT FOR PRE-TRAINED MACHINE LEARNING MODEL

GM SURANEGARA, V FUJIYANTI… - Journal of …, 2024 - jestec.taylors.edu.my
This study examines the application of pre-trained machine learning models on serverless
cloud computing platforms, specifically comparing three serverless services offered by …

Always-On Recording Framework for Serverless Computations: Opportunities and Challenges

S Kharbanda, P Fonseca - Proceedings of the 1st Workshop on …, 2023 - dl.acm.org
Serverless computing simplifies cloud programming by managing infrastructure and
providing basic primitives for building distributed components. However, these efforts tend to …