POLCA: Power Oversubscription in LLM Cloud Providers

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent innovation in large language models (LLMs), and their myriad use-cases have
rapidly driven up the compute capacity demand for datacenter GPUs. Several cloud …

Characterizing Power Management Opportunities for LLMs in the Cloud

P Patel, E Choukse, C Zhang, Í Goiri, B Warrier… - Proceedings of the 29th …, 2024 - dl.acm.org
Recent innovation in large language models (LLMs), and their myriad use cases have
rapidly driven up the compute demand for datacenter GPUs. Several cloud providers and …

Improving GPU Energy Efficiency through an Application-transparent Frequency Scaling Policy with Performance Assurance

Y Zhang, Q Wang, Z Lin, P Xu, B Wang - Proceedings of the Nineteenth …, 2024 - dl.acm.org
Power consumption is one of the top limiting factors in high-performance computing systems
and data centers, and dynamic voltage and frequency scaling (DVFS) is an important …

Part-time Power Measurements: nvidia-smi's Lack of Attention

Z Yang, K Adamek, W Armour - arXiv preprint arXiv:2312.02741, 2023 - arxiv.org
The GPU has emerged as the go-to accelerator for high throughput and parallel workloads,
spanning scientific simulations to AI, thanks to its performance and power efficiency. Given …