Performance optimization of machine learning inference under latency and server power constraints

G Chen, X Wang - 2022 IEEE 42nd International Conference on …, 2022 - ieeexplore.ieee.org
Power capping is an important technique for high-density servers to safely oversubscribe the
power infrastructure in a data center. However, power capping is commonly accomplished …

OptimML: Joint Control of Inference Latency and Server Power Consumption for ML Performance Optimization

G Chen, X Wang - ACM Transactions on Autonomous and Adaptive …, 2024 - dl.acm.org
Power capping is an important technique for high-density servers to safely oversubscribe the
power infrastructure in a data center. However, power capping is commonly accomplished …

A reinforcement learning approach for performance-aware reduction in power consumption of data center compute nodes

A Raj, S Perarnau, A Gokhale - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
As Exascale computing becomes a reality, the energy needs of compute nodes in cloud data
centers will continue to grow. A common approach to reducing this energy demand is to limit …

ALPACA: Application performance aware server power capping

J Krzywda, A Ali-Eldin, E Wadbro… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
Server power capping limits the power consumption of a server to not exceed a specific
power budget. This allows data center operators to reduce the peak power consumption at …

Pack & cap: adaptive DVFS and thread packing under power caps

R Cochran, C Hankendi, AK Coskun… - Proceedings of the 44th …, 2011 - dl.acm.org
The ability to cap peak power consumption is a desirable feature in modern data centers for
energy budgeting, cost management, and efficient power delivery. Dynamic voltage and …

A scalable priority-aware approach to managing data center server power

Y Li, CR Lefurgy, K Rajamani… - … Symposium on High …, 2019 - ieeexplore.ieee.org
Power management is a key component of modern data center design. Power managers
must (1) ensure the cost- and energy-efficient utilization of the data center infrastructure, (2) …

Network packet processing mode-aware power management for data center servers

KD Kang, G Park, NS Kim, D Kim - IEEE Computer Architecture …, 2019 - ieeexplore.ieee.org
In data center servers, power management (PM) exploiting Dynamic Voltage and Frequency
Scaling (DVFS) for processors can play a crucial role to improve energy efficiency. However …

Nmap: Power management based on network packet processing mode transition for latency-critical workloads

KD Kang, G Park, H Kim, M Alian, NS Kim… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Processor power management exploiting Dynamic Voltage and Frequency Scaling (DVFS)
plays a crucial role in improving the data-center's energy efficiency. However, we observe …

DeepPower: Deep Reinforcement Learning based Power Management for Latency Critical Applications in Multi-core Systems

J Zhang, G Yu, Z He, L Ai, P Chen - Proceedings of the 52nd …, 2023 - dl.acm.org
Latency-critical (LC) applications are widely deployed in modern datacenters. Effective
power management for LC applications can yield significant cost savings. However, it poses …

Adaptive power management through thermal aware workload balancing in internet data centers

J Yao, H Guan, J Luo, L Rao… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org
The past decade witnessed the tremendous growth of online services and applications.
Together with the increase of cloud computing, more and more computation is hosted by …