Dependable DNN accelerator for safety-critical systems: A review on the aging perspective

I Moghaddasi, S Gorgin, JA Lee - IEEE Access, 2023 - ieeexplore.ieee.org
In the modern era, artificial intelligence (AI) and deep learning (DL) seamlessly integrate into
various spheres of our daily lives. These cutting-edge disciplines have given rise to …

TBDB: Token bucket-based dynamic batching for resource scheduling supporting neural network inference in intelligent consumer electronics

H Gao, B Qiu, Y Wang, S Yu, Y Xu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Consumer electronics such as mobile phones, wearable devices, and vehicle electronics
use many intelligent applications such as voice commands, machine translation, and face …

FusionAI: Decentralized training and deploying LLMs with massive consumer-level GPUs

Z Tang, Y Wang, X He, L Zhang, X Pan, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid growth of memory and computation requirements of large language models
(LLMs) has outpaced the development of hardware, hindering people who lack large-scale …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential to accelerating the latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

Coordinated planning of multiple energy hubs considering the spatiotemporal load regulation of data centers

S Zhang, J Lyu, W Jin, H Cheng, C Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the digitalization of energy systems, data centers are gradually becoming an integral
part of energy hubs (EHs). Considering the spatiotemporal load regulation of data centers …

DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference

Z Zhang, Y Zhao, H Li, C Lin… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Due to the limited resources of edge devices and the differing characteristics of deep neural network (DNN)
models, it is challenging to optimize DNN inference performance in terms of energy …

Sigmoid Activation Implementation for Neural Networks Hardware Accelerators Based on Reconfigurable Computing Environments for Low-Power Intelligent Systems

V Shatravin, D Shashev, S Shidlovskiy - Applied Sciences, 2022 - mdpi.com
The remarkable results of applying machine learning algorithms to complex tasks are well
known. They open wide opportunities in natural language processing, image recognition …

POLCA: Power oversubscription in LLM cloud providers

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent innovations in large language models (LLMs) and their myriad use cases have
rapidly driven up the compute capacity demand for datacenter GPUs. Several cloud …

Characterizing Power Management Opportunities for LLMs in the Cloud

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - Proceedings of the 29th …, 2024 - dl.acm.org
Recent innovations in large language models (LLMs) and their myriad use cases have
rapidly driven up the compute demand for datacenter GPUs. Several cloud providers and …

An automated and portable method for selecting an optimal GPU frequency

G Ali, M Side, S Bhalachandra, NJ Wright… - Future Generation …, 2023 - Elsevier
Power consumption poses a significant challenge in current and emerging graphics
processing unit (GPU) enabled high-performance computing systems. In modern GPUs …