Dependable DNN accelerator for safety-critical systems: A review on the aging perspective

I Moghaddasi, S Gorgin, JA Lee - IEEE Access, 2023 - ieeexplore.ieee.org
In the modern era, artificial intelligence (AI) and deep learning (DL) seamlessly integrate into
various spheres of our daily lives. These cutting-edge disciplines have given rise to …

TBDB: Token bucket-based dynamic batching for resource scheduling supporting neural network inference in intelligent consumer electronics

H Gao, B Qiu, Y Wang, S Yu, Y Xu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Consumer electronics such as mobile phones, wearable devices, and vehicle electronics
use many intelligent applications such as voice commands, machine translation, and face …

FusionAI: Decentralized training and deploying LLMs with massive consumer-level GPUs

Z Tang, Y Wang, X He, L Zhang, X Pan, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid growth of memory and computation requirements of large language models
(LLMs) has outpaced the development of hardware, hindering people who lack large-scale …

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud

F Xu, J Xu, J Chen, L Chen, R Shang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
GPUs are essential to accelerating the latency-sensitive deep neural network (DNN)
inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of …

Coordinated planning of multiple energy hubs considering the spatiotemporal load regulation of data centers

S Zhang, J Lyu, W Jin, H Cheng, C Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the digitalization of energy systems, data centers are gradually becoming an integral
part of energy hubs (EHs). Considering the spatiotemporal load regulation of data centers …

DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference

Z Zhang, Y Zhao, H Li, C Lin… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Due to the limited resources of edge devices and the differing characteristics of deep neural network (DNN)
models, it is challenging to optimize DNN inference performance in terms of energy …

Sigmoid Activation Implementation for Neural Networks Hardware Accelerators Based on Reconfigurable Computing Environments for Low-Power Intelligent Systems

V Shatravin, D Shashev, S Shidlovskiy - Applied Sciences, 2022 - mdpi.com
The remarkable results of applying machine learning algorithms to complex tasks are well
known. They open wide opportunities in natural language processing, image recognition …

POLCA: Power oversubscription in LLM cloud providers

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent innovations in large language models (LLMs) and their myriad use cases have
rapidly driven up the compute capacity demand for datacenter GPUs. Several cloud …

Characterizing Power Management Opportunities for LLMs in the Cloud

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - Proceedings of the 29th …, 2024 - dl.acm.org
Recent innovations in large language models (LLMs) and their myriad use cases have
rapidly driven up the compute demand for datacenter GPUs. Several cloud providers and …

An automated and portable method for selecting an optimal GPU frequency

G Ali, M Side, S Bhalachandra, NJ Wright… - Future Generation …, 2023 - Elsevier
Power consumption poses a significant challenge in current and emerging graphics
processing unit (GPU) enabled high-performance computing systems. In modern GPUs …