Authors
Yanan Yang, Laiping Zhao, Yiming Li, Huanyu Zhang, Jie Li, Mingyang Zhao, Xingzhen Chen, Keqiu Li
Publication date
2022/2/28
Book
Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
Pages
768-781
Description
Modern websites increasingly rely on machine learning (ML) to improve their business efficiency. Developing and maintaining ML services incurs high costs for developers. Although serverless systems are a promising solution to reduce costs, we find that current general-purpose serverless systems cannot meet the low-latency, high-throughput demands of ML services.
While simply "patching" general serverless systems does not resolve the problem completely, we propose that such a system should natively combine the features of inference with a serverless paradigm. We present INFless, the first ML domain-specific serverless platform. It provides a unified, heterogeneous resource abstraction between CPU and accelerators, and achieves high throughput using built-in batching and non-uniform scaling mechanisms. It also supports low latency through coordinated management of batch queuing time …
Scholar articles
Y Yang, L Zhao, Y Li, H Zhang, J Li, M Zhao, X Chen… - Proceedings of the 27th ACM International Conference …, 2022