Distributed artificial intelligence empowered by end-edge-cloud computing: A survey

S Duan, D Wang, J Ren, F Lyu, Y Zhang… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org
As the computing paradigm shifts from cloud computing to end-edge-cloud computing, it
also supports artificial intelligence evolving from a centralized manner to a distributed one …

Flexible high-resolution object detection on edge devices with tunable latency

S Jiang, Z Lin, Y Li, Y Shu, Y Liu - Proceedings of the 27th Annual …, 2021 - dl.acm.org
Object detection is a fundamental building block of video analytics applications. While
Neural Networks (NNs)-based object detection models have shown excellent accuracy on …

CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices.

F Jia, D Zhang, T Cao, S Jiang, Y Liu, J Ren, Y Zhang - MobiSys, 2022 - chrisplus.me
Concurrent inference execution on heterogeneous processors is critical to improve the
performance of increasingly heavy deep learning (DL) models. However, available …

Romou: Rapidly generate high-performance tensor kernels for mobile GPUs

R Liang, T Cao, J Wen, M Wang, Y Wang… - Proceedings of the 28th …, 2022 - dl.acm.org
Mobile GPU, as a ubiquitous and powerful accelerator, plays an important role in
accelerating on-device DNN (Deep Neural Network) inference. The frequent-upgrade and …

NN-Stretch: Automatic neural network branching for parallel inference on heterogeneous multi-processors

J Wei, T Cao, S Cao, S Jiang, S Fu, M Yang… - Proceedings of the 21st …, 2023 - dl.acm.org
Mobile devices are increasingly equipped with heterogeneous multiprocessors, e.g.,
CPU+GPU+DSP. Yet existing Neural Network (NN) inference fails to fully utilize the computing …

ParallelFusion: towards maximum utilization of mobile GPU for DNN inference

J Lee, Y Liu, Y Lee - Proceedings of the 5th International Workshop on …, 2021 - dl.acm.org
Mobile GPUs are extremely under-utilized for DNN computations across different mobile
deep learning frameworks and multiple DNNs with various complexities. We explore the …

GCD2: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs

W Niu, J Guan, X Shen, Y Wang… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
More specialized chips are exploiting available high transistor density to expose parallelism
at a large scale with more intricate instruction sets. This paper reports on a compilation …

SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget

K Wang, J Cao, Z Zhou, Z Li - IEEE Transactions on Mobile …, 2024 - ieeexplore.ieee.org
Executing deep neural networks (DNNs) on edge artificial intelligence (AI) devices enables
various autonomous mobile computing applications. However, the memory budget of edge …

Remote sensing image filtering acceleration based on an embedded CPU+GPU heterogeneous platform

P Tan, C Xue, L Zhou - Chinese Journal of Space Science, 2024 - cjss.ac.cn
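For on-orbit real-time processing of remote sensing images, this work proposes a filtering acceleration
design method based on an embedded CPU+GPU heterogeneous platform. Taking the acceleration of Laplacian filtering as an example, it exploits the parallel computing characteristics of the GPU and, through data partitioning and data mapping, …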

CoCV: Heterogeneous Processors Collaboration Mechanism for End-to-End Execution of Intelligent Computer Vision Tasks on Mobile Devices

Y Wan, M Liu, G Li, F Dong - 2023 IEEE 29th International …, 2023 - ieeexplore.ieee.org
Object detection, image classification, and various other computer vision tasks have become
prevalent on mobile devices. These computer vision tasks are typically executed with three …