Authors
Ali Jahanshahi, Hadi Zamani Sabzi, Chester Lau, Daniel Wong
Publication date
2020/9/14
Journal
IEEE Computer Architecture Letters
Volume
19
Issue
2
Pages
139-142
Publisher
IEEE
Description
Cloud inference systems have recently emerged as a solution to the ever-increasing integration of AI-powered applications into the smart devices around us. The wide adoption of GPUs in cloud inference systems has made power consumption a first-order constraint in multi-GPU systems. Achieving energy efficiency therefore requires better insight into the power and performance behavior of multi-GPU inference systems. To this end, we propose GPU-NEST, an energy efficiency characterization methodology for multi-GPU inference systems. As case studies, we examine the challenges presented by, and implications of, multi-GPU scaling, inference scheduling, and non-GPU bottlenecks on multi-GPU inference systems' energy efficiency. We find that inference scheduling in particular offers great benefits, improving the energy efficiency of multi-GPU inference by as much as 40 percent.
Total citations
Scholar articles
A Jahanshahi, HZ Sabzi, C Lau, D Wong - IEEE Computer Architecture Letters, 2020