Distributed DNN Inference with Fine-grained Model Partitioning in Mobile Edge Computing Networks

H Li, X Li, Q Fan, Q He, X Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Model partitioning is a promising technique for improving the efficiency of distributed
inference by executing partial deep neural network (DNN) models on edge servers (ESs) or …

DNN real-time collaborative inference acceleration with mobile edge computing

R Yang, Y Li, H He, W Zhang - 2022 International Joint …, 2022 - ieeexplore.ieee.org
The collaborative inference approach splits the Deep Neural Network (DNN) model into
two parts that run collaboratively on the end device and cloud server to minimize inference …

Towards real-time cooperative deep inference over the cloud and edge end devices

S Zhang, Y Li, X Liu, S Guo, W Wang, J Wang… - Proceedings of the …, 2020 - dl.acm.org
Deep neural networks (DNNs) have been widely used in many intelligent applications, such
as object recognition and autonomous driving, due to their superior performance in conducting …

Multi-exit DNN inference acceleration based on multi-dimensional optimization for edge intelligence

F Dong, H Wang, D Shen, Z Huang… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
Edge intelligence, as a prospective paradigm for accelerating DNN inference, is mostly
implemented by model partitioning, which inevitably incurs large transmission overhead …

EdgeLD: Locally distributed deep learning inference on edge device clusters

F Xue, W Fang, W Xu, Q Wang, X Ma… - 2020 IEEE 22nd …, 2020 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely used in a large number of application
scenarios. However, DNN models are generally both computation-intensive and memory …

Multi-Compression Scale DNN Inference Acceleration based on Cloud-Edge-End Collaboration

H Qi, F Ren, L Wang, P Jiang, S Wan… - ACM Transactions on …, 2024 - dl.acm.org
Edge intelligence has emerged as a promising paradigm to accelerate DNN inference by
model partitioning, which is particularly useful for intelligent scenarios that demand high …

EdgeCI: Distributed Workload Assignment and Model Partitioning for CNN Inference on Edge Clusters

Y Chen, T Luo, W Fang, NN Xiong - ACM Transactions on Internet …, 2024 - dl.acm.org
Deep learning technology has grown significantly in new application scenarios such as
smart cities and driverless vehicles, but its deployment consumes substantial resources …

Cooperative distributed deep neural network deployment with edge computing

CY Yang, JJ Kuo, JP Sheu… - ICC 2021-IEEE …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) are widely used to analyze the abundance of data collected
by massive Internet-of-Things (IoT) devices. Traditional approaches usually send the data …

Computation offloading scheduling for deep neural network inference in mobile computing

Y Duan, J Wu - 2021 IEEE/ACM 29th International Symposium …, 2021 - ieeexplore.ieee.org
The quality of service (QoS) of intelligent applications on mobile devices heavily depends on
the inference speed of Deep Neural Network (DNN) models. Cooperative DNN inference …

Ultra-low-latency distributed deep neural network over hierarchical mobile networks

JI Chang, JJ Kuo, CH Lin, WT Chen… - 2019 IEEE Global …, 2019 - ieeexplore.ieee.org
Recently, the notions of partitioning the Deep Neural Network (DNN) model over multi-
level computing units and making fast inferences with the early-inference technique have …