challenge for serverless computing frameworks, which must densely deploy and maintain
inference models at high throughput. We observe excessive memory consumption in
serverless inference systems, caused by large model sizes and high data redundancy.