Multi-agent collaborative inference via DNN decoupling: Intermediate feature compression and edge learning

Z Hao, G Xu, Y Luo, H Hu, J An… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Recently, deploying deep neural network (DNN) models via collaborative inference, which
splits a pre-trained model into two parts and executes them on user equipment (UE) and …
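
A minimal sketch of the split-inference pattern this abstract describes, assuming a PyTorch/torchvision setup; the model choice and split index below are arbitrary illustrations, not this paper's configuration:

```python
# Toy split inference: run the head of a model on the user equipment (UE),
# transmit the intermediate feature, and finish on the edge server.
import torch
import torchvision.models as models

# Random weights for a self-contained demo; a deployed UE would load pre-trained ones.
model = models.mobilenet_v2(weights=None).eval()

SPLIT = 5  # hypothetical cut point inside the feature extractor
head = torch.nn.Sequential(*list(model.features.children())[:SPLIT])  # UE side
tail = torch.nn.Sequential(                                           # edge side
    *list(model.features.children())[SPLIT:],
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    model.classifier,
)

x = torch.randn(1, 3, 224, 224)   # input captured on the device
with torch.no_grad():
    feat = head(x)                # this tensor is what gets compressed and sent
    logits = tail(feat)           # the edge completes the inference
print(feat.shape, logits.shape)
```

Past the early layers the intermediate feature is typically much smaller than the raw input, which is what makes compressing and transmitting it attractive.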

Improving device-edge cooperative inference of deep learning via 2-step pruning

W Shi, Y Hou, S Zhou, Z Niu, Y Zhang… - IEEE INFOCOM 2019 …, 2019 - ieeexplore.ieee.org
Deep neural networks (DNNs) are state-of-the-art solutions for many machine learning
applications, and have been widely used on mobile devices. Running DNNs on …

A Fine-Grained End-to-End Latency Optimization Framework for Wireless Collaborative Inference

L Mu, Z Li, W Xiao, R Zhang, P Wang… - IEEE Internet of …, 2023 - ieeexplore.ieee.org
Mobile devices are becoming increasingly capable of delivering intelligent services by
leveraging deep learning architectures such as deep neural networks (DNNs). However …

Energy-efficient model compression and splitting for collaborative inference over time-varying channels

M Krouka, A Elgabli, CB Issaid… - 2021 IEEE 32nd Annual …, 2021 - ieeexplore.ieee.org
Today's intelligent applications can achieve high accuracy using machine learning (ML)
techniques, such as deep neural networks (DNNs). Traditionally, in a remote …

Collaborative inference via ensembles on the edge

N Shlezinger, E Farhan… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The success of deep neural networks (DNNs) as an enabler of artificial intelligence (AI) is
heavily dependent on high computational resources. The increasing demands for …

Distributed DNN inference with fine-grained model partitioning in mobile edge computing networks

H Li, X Li, Q Fan, Q He, X Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Model partitioning is a promising technique for improving the efficiency of distributed
inference by executing partial deep neural network (DNN) models on edge servers (ESs) or …
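
To make the partition-point decision concrete, here is a self-contained sketch (not this paper's algorithm) that picks the split minimizing estimated device compute plus uplink transfer plus edge compute; every number below is a hypothetical placeholder:

```python
# Illustrative only: choose the partition point k that minimizes
#   (device compute for layers < k) + (upload of layer k-1's output)
#   + (edge compute for layers >= k).
device_ms = [2.0, 4.0, 8.0, 16.0, 32.0]   # per-layer latency on the mobile device
edge_ms   = [0.5, 1.0, 2.0, 4.0, 8.0]     # per-layer latency on the edge server
out_kb    = [800, 400, 150, 60, 4]        # per-layer output size (kB)
input_kb  = 1200                          # raw input size (kB)
bw_kBps   = 5000                          # assumed uplink bandwidth (kB/s)

def total_latency(k: int) -> float:
    """End-to-end latency when layers [0, k) run on-device and [k, n) on the edge.
    k == n models a full on-device run where only the small result is uploaded."""
    upload_kb = out_kb[k - 1] if k > 0 else input_kb
    tx_ms = upload_kb / bw_kBps * 1000.0
    return sum(device_ms[:k]) + tx_ms + sum(edge_ms[k:])

best = min(range(len(device_ms) + 1), key=total_latency)
print(f"best split: before layer {best}, est. {total_latency(best):.1f} ms")
```

Real systems profile these terms online, since uplink bandwidth and edge-server load vary over time.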

Pico: Pipeline inference framework for versatile CNNs on diverse mobile devices

X Yang, Z Xu, Q Qi, J Wang, H Sun… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Distributing the inference of a convolutional neural network (CNN) to multiple mobile
devices has been studied in recent years to achieve real-time inference without losing accuracy …
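
A toy illustration of the pipelining this entry refers to, using Python threads as stand-ins for devices; this is not PICO's scheduler, just the basic overlap that pipeline parallelism buys:

```python
# Two model partitions run on two workers so consecutive frames overlap in time.
import threading, queue, time

def stage(inbox, outbox, work_s):
    # Each stage consumes items until the None sentinel, simulating a partition.
    while (item := inbox.get()) is not None:
        time.sleep(work_s)            # stand-in for running a model partition
        outbox.put(item)
    outbox.put(None)                  # propagate shutdown to the next stage

q01, q12, done = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=stage, args=(q01, q12, 0.05)).start()   # "device A"
threading.Thread(target=stage, args=(q12, done, 0.05)).start()  # "device B"

start = time.time()
for frame in range(8):
    q01.put(frame)
q01.put(None)
while done.get() is not None:
    pass
print(f"8 frames in {time.time() - start:.2f}s (vs ~0.80s run sequentially)")
```

With two 50 ms stages, eight frames finish in roughly 0.45 s instead of 0.8 s, since frame n+1 starts on device A while frame n is still on device B.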

Multi-compression scale DNN inference acceleration based on cloud-edge-end collaboration

H Qi, F Ren, L Wang, P Jiang, S Wan… - ACM Transactions on …, 2024 - dl.acm.org
Edge intelligence has emerged as a promising paradigm to accelerate DNN inference by
model partitioning, which is particularly useful for intelligent scenarios that demand high …
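
As an assumed example of one compression primitive such systems build on (not this paper's multi-scale scheme): uniform 8-bit quantization of the intermediate feature cuts its size 4x at bounded error.

```python
# Shrink a float32 intermediate feature to uint8 before transmission.
import numpy as np

def quantize(feat: np.ndarray):
    """Map float32 values to uint8 plus the (offset, scale) needed to restore them."""
    lo, hi = float(feat.min()), float(feat.max())
    scale = (hi - lo) / 255.0 or 1.0          # guard against a constant tensor
    q = np.round((feat - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

feat = np.random.randn(32, 28, 28).astype(np.float32)  # toy intermediate feature
q, lo, scale = quantize(feat)
restored = dequantize(q, lo, scale)
print(f"{feat.nbytes} B -> {q.nbytes} B, "
      f"max abs error {np.abs(feat - restored).max():.4f}")
```

A "multi-compression scale" design can be read as choosing among several such operating points (bit widths, spatial scales) per link and per partition, trading feature fidelity against transfer time.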

Optimizing job offloading schedule for collaborative DNN inference

Y Duan, J Wu - IEEE Transactions on Mobile Computing, 2023 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely deployed in mobile applications. DNN
inference latency is a critical metric to measure the service quality of those applications …

Decentralized low-latency collaborative inference via ensembles on the edge

M Malka, E Farhan, H Morgenstern… - arXiv preprint arXiv …, 2022 - arxiv.org
The success of deep neural networks (DNNs) is heavily dependent on computational
resources. While DNNs are often employed on cloud servers, there is a growing need to …