Authors
Jing Wu, Lin Wang, Qiangyu Pei, Xingqi Cui, Fangming Liu, Tingting Yang
Publication date
2022/8/1
Journal
IEEE Transactions on Parallel and Distributed Systems
Volume
33
Issue
12
Pages
4499-4514
Publisher
IEEE
Description
Deep neural networks (DNNs) have become a critical component for inference in modern mobile applications, but the efficient provisioning of DNNs is non-trivial. Existing mobile- and server-based approaches compromise either the inference accuracy or latency. Instead, a hybrid approach can reap the benefits of the two by splitting the DNN at an appropriate layer and running the two parts separately on the mobile and the server respectively. Nevertheless, the DNN throughput in the hybrid approach has not been carefully examined, which is particularly important for edge servers where limited compute resources are shared among multiple DNNs. This article presents HiTDL, a runtime framework for managing multiple DNNs provisioned following the hybrid approach at the edge. HiTDL's mission is to improve edge resource efficiency by optimizing the combined throughput of all co-located DNNs, while still …
Scholar articles
J Wu, L Wang, Q Pei, X Cui, F Liu, T Yang - IEEE Transactions on Parallel and Distributed Systems, 2022