Band: Coordinated multi-DNN inference on heterogeneous mobile processors

JS Jeong, J Lee, D Kim, C Jeon, C Jeong… - Proceedings of the 20th …, 2022 - dl.acm.org
The rapid development of deep learning algorithms, as well as innovative hardware
advancements, encourages multi-DNN workloads such as augmented reality applications …

AutoFL: Enabling heterogeneity-aware energy-efficient federated learning

YG Kim, CJ Wu - MICRO-54: 54th Annual IEEE/ACM International …, 2021 - dl.acm.org
Federated learning enables a cluster of decentralized mobile devices at the edge to
collaboratively train a shared machine learning model, while keeping all the raw training …

AutoScale: Energy efficiency optimization for stochastic edge inference using reinforcement learning

YG Kim, CJ Wu - 2020 53rd Annual IEEE/ACM International …, 2020 - ieeexplore.ieee.org
Deep learning inference is increasingly run at the edge. As the programming and system
stack support becomes mature, it enables acceleration opportunities in a mobile system …

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …

Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU

D Xu, H Zhang, L Yang, R Liu, G Huang, M Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
On-device large language models (LLMs) are catalyzing novel mobile applications such as
UI task automation and personalized email auto-reply, without giving away users' private …

FusionAI: Decentralized training and deploying LLMs with massive consumer-level GPUs

Z Tang, Y Wang, X He, L Zhang, X Pan, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid growth of memory and computation requirements of large language models
(LLMs) has outpaced the development of hardware, hindering people who lack large-scale …

BlastNet: Exploiting duo-blocks for cross-processor real-time DNN inference

N Ling, X Huang, Z Zhao, N Guan, Z Yan… - Proceedings of the 20th …, 2022 - dl.acm.org
In recent years, Deep Neural Network (DNN) has been increasingly adopted by a wide
range of time-critical applications running on edge platforms with heterogeneous …

AdaptiveNet: Post-deployment neural architecture adaptation for diverse edge environments

H Wen, Y Li, Z Zhang, S Jiang, X Ye, Y Ouyang… - Proceedings of the 29th …, 2023 - dl.acm.org
Deep learning models are increasingly deployed to edge devices for real-time applications.
To ensure stable service quality across diverse edge environments, it is highly desirable to …

Efficient knowledge management for heterogeneous federated continual learning on resource-constrained edge devices

Z Yang, S Zhang, C Li, M Wang, H Wang… - Future Generation …, 2024 - Elsevier
Federated learning (FL) is a promising and privacy-preserving distributed learning method
that is widely deployed on edge devices. However, in practical applications, the data …

SLO-aware inference scheduler for heterogeneous processors in edge platforms

W Seo, S Cha, Y Kim, J Huh, J Park - ACM Transactions on Architecture …, 2021 - dl.acm.org
With the proliferation of applications with machine learning (ML), the importance of edge
platforms has been growing to process streaming sensor data locally without resorting to …