Performance Optimization of Machine Learning Inference under Latency and Server Power Constraints

G Chen, X Wang - 2022 IEEE 42nd International Conference on …, 2022 - ieeexplore.ieee.org
Power capping is an important technique for high-density servers to safely oversubscribe the
power infrastructure in a data center. However, power capping is commonly accomplished …

Performance Optimization of Machine Learning Inference under Latency and Server Power Constraints

G Chen, X Wang - 2022 IEEE 42nd International Conference on …, 2022 - computer.org
Power capping is an important technique for high-density servers to safely oversubscribe the
power infrastructure in a data center. However, power capping is commonly accomplished …