Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A Model-Based Reinforcement Learning Approach

Y Chen, R Li, X Yu, Z Zhao, H Zhang - arXiv preprint arXiv:2406.02616, 2024 - arxiv.org
Optimizing the deployment of large language models (LLMs) in edge computing
environments is critical for enhancing privacy and computational efficiency. Toward efficient …

Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems

C Singhal, Y Wu, F Malandrino, M Levorato… - arXiv preprint arXiv …, 2024 - arxiv.org
The increasing pervasiveness of intelligent mobile applications requires exploiting the full
range of resources offered by the mobile-edge-cloud network for the execution of inference …

Native Support of AI Applications in 6G Mobile Networks Via an Intelligent User Plane

S Schwarzmann, TE Civelek, A Iera… - 2024 IEEE Wireless …, 2024 - ieeexplore.ieee.org
While the concept of AI4Net has been widely discussed in the past decade and adopted in
5G, its counterpart, Net4AI, has not gained that much attention so far. This is mostly due to …

Efficient Communication-Computation Tradeoff for Split Computing: A Multi-Tier Deep Reinforcement Learning Approach

Y Cao, SY Lien, CH Yeh, YC Liang… - … 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
By splitting the computation load of a neural network (NN) training task across multiple stations,
split computing has been the most promising technology to sustain high-accuracy models for …

[CITATION][C] Performance Evaluation of Object Detection Based on Edge-Server Cooperative Inference

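최승준, 윤석현, 최현호 - Proceedings of the Korean Institute of Communications and Information Sciences (KICS) Artificial Intelligence Conference, 2023 - dbpia.co.kr
Abstract: Considering the processing capacity of edge devices, this paper deploys a small neural network model
on the edge and a large neural network model on the server, so that the edge and the server cooperatively perform object detection through cooperative inference …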

[CITATION][C] Performance Optimization of an Object Detection Technique Based on Cooperative Computing

최승준, 윤석현, 최현호 - Proceedings of the Korea Institute of Information and Communication Engineering General Conference, 2023 - dbpia.co.kr
Considering the computing resources of networks, this paper proposes a cooperative
inference system in which edge devices are equipped with small-scale neural network …